It appears that ip ranges can overlap. In that case lookup_rec() returns
whatever result it got last, even if it found nothing in the last
searched page.
This breaks an obscure livepatch late module patching usecase:
- load livepatch
- load the patched module
- unload livepatch
- try to load livepatch again
To fix this, return from lookup_rec() as soon as it finds the record
containing the searched-for ip. This is how the code behaved prior to
the introduction of lookup_rec().
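A minimal userspace sketch of the early return described above. The types
and the name lookup_rec_sketch() are hypothetical stand-ins, not the
kernel's ftrace structures. The point is that the search stops at the
first hit, so a later page whose range overlaps but holds no matching
record cannot replace the result:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for ftrace records and pages. */
struct rec {
	unsigned long ip;
};

struct page {
	struct rec *records;
	size_t count;
	struct page *next;
};

/*
 * Return the first record whose ip lies in [start, end], or NULL.
 * Records within a page are assumed to be sorted by ip; page ranges
 * may overlap each other.
 */
static struct rec *lookup_rec_sketch(struct page *pages,
				     unsigned long start, unsigned long end)
{
	struct page *pg;
	size_t i;

	for (pg = pages; pg; pg = pg->next) {
		if (pg->count == 0)
			continue;
		/* Skip pages whose ip range cannot contain the target. */
		if (end < pg->records[0].ip ||
		    start > pg->records[pg->count - 1].ip)
			continue;
		for (i = 0; i < pg->count; i++) {
			if (pg->records[i].ip >= start &&
			    pg->records[i].ip <= end)
				return &pg->records[i]; /* stop at first hit */
		}
		/* No hit in this page: keep searching the next one. */
	}
	return NULL;
}
```

With two pages covering overlapping ranges, a lookup that matches in the
first page returns immediately and never consults the second.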
Link: http://lkml.kernel.org/r/20200306174317.21699-1-asavkov@redhat.com
Cc: stable@vger.kernel.org
Fixes: 7e16f581a817 ("ftrace: Separate out functionality from ftrace_location_range()")
Change-Id: Ibfd941aa40df53bce30b7973d58c3665a4a4a8d8
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Create a new function called lookup_rec() from the functionality of
ftrace_location_range(). The difference is that lookup_rec() returns the
record that it finds, whereas ftrace_location_range() returns only
whether it found a match.
lookup_rec() is static, and can be used for new functionality where
ftrace needs to find a record of a specific address.
Change-Id: I7e5a80df3f1486b889d5fa533728794f79afa24a
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
version 4.19.325-cip124
* tag 'v4.19.325-cip124' of https://git.kernel.org/pub/scm/linux/kernel/git/cip/linux-cip:
CIP: Bump version suffix to -cip124 after merge from cip/linux-4.19.y-st tree
Update localversion-st, tree is up-to-date with 5.4.298.
f2fs: fix to do sanity check on ino and xnid
squashfs: fix memory leak in squashfs_fill_super
pNFS: Handle RPC size limit for layoutcommits
wifi: iwlwifi: fw: Fix possible memory leak in iwl_fw_dbg_collect
usb: core: usb_submit_urb: downgrade type check
udf: Verify partition map count
f2fs: fix to avoid panic in f2fs_evict_inode
usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
sctp: initialize more fields in sctp_v6_from_sk()
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
net/mlx5e: Set local Xoff after FW update
net: dlink: fix multicast stats being counted incorrectly
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
net/atm: remove the atmdev_ops {get, set}sockopt methods
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
powerpc/kvm: Fix ifdef to remove build warning
net: ipv4: fix regression in local-broadcast routes
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
scsi: core: sysfs: Correct sysfs attributes access rights
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
alloc_fdtable(): change calling conventions.
ALSA: usb-audio: Use correct sub-type for UAC3 feature unit validation
net/sched: Make cake_enqueue return NET_XMIT_CN when past buffer_limit
ipv6: sr: validate HMAC algorithm ID in seg6_hmac_info_add
ALSA: usb-audio: Fix size validation in convert_chmap_v3()
scsi: qla4xxx: Prevent a potential error pointer dereference
usb: xhci: Fix slot_id resource race conflict
nfs: fix UAF in direct writes
NFS: Fix up commit deadlocks
Bluetooth: fix use-after-free in device_for_each_child()
selftests: forwarding: tc_actions.sh: add matchall mirror test
codel: remove sch->q.qlen check before qdisc_tree_reduce_backlog()
sch_qfq: make qfq_qlen_notify() idempotent
sch_hfsc: make hfsc_qlen_notify() idempotent
sch_drr: make drr_qlen_notify() idempotent
btrfs: populate otime when logging an inode item
media: venus: hfi: explicitly release IRQ during teardown
f2fs: fix to avoid out-of-boundary access in dnode page
media: venus: protect against spurious interrupts during probe
media: venus: vdec: Clamp param smaller than 1fps and bigger than 240.
drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS
media: rainshadow-cec: fix TOCTOU race condition in rain_interrupt()
media: v4l2-ctrls: Don't reset handler's error in v4l2_ctrl_handler_free()
ata: Fix SATA_MOBILE_LPM_POLICY description in Kconfig
usb: musb: omap2430: fix device leak at unbind
NFS: Fix the setting of capabilities when automounting a new filesystem
NFS: Fix up handling of outstanding layoutcommit in nfs_update_inode()
NFSv4: Fix nfs4_bitmap_copy_adjust()
usb: typec: fusb302: cache PD RX state
cdc-acm: fix race between initial clearing halt and open
USB: cdc-acm: do not log successful probe on later errors
nfsd: handle get_client_locked() failure in nfsd4_setclientid_confirm()
tracing: Add down_write(trace_event_sem) when adding trace event
usb: hub: Don't try to recover devices lost during warm reset.
usb: hub: avoid warm port reset during USB3 disconnect
x86/mce/amd: Add default names for MCA banks and blocks
iio: hid-sensor-prox: Fix incorrect OFFSET calculation
mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n
mm/zsmalloc.c: convert to use kmem_cache_zalloc in cache_alloc_zspage()
net: usbnet: Fix the wrong netif_carrier_on() call
net: usbnet: Avoid potential RCU stall on LINK_CHANGE event
PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
ACPI: processor: idle: Check acpi_fetch_acpi_dev() return value
kbuild: Add KBUILD_CPPFLAGS to as-option invocation
kbuild: add $(CLANG_FLAGS) to KBUILD_CPPFLAGS
kbuild: Add CLANG_FLAGS to as-instr
mips: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation
kbuild: Update assembler calls to use proper flags and language target
ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGS
usb: dwc3: Ignore late xferNotReady event to prevent halt timeout
USB: storage: Ignore driver CD mode for Realtek multi-mode Wi-Fi dongles
usb: storage: realtek_cr: Use correct byte order for bcs->Residue
USB: storage: Add unusual-devs entry for Novatek NTK96550-based camera
usb: quirks: Add DELAY_INIT quick for another SanDisk 3.2Gen1 Flash Drive
iio: proximity: isl29501: fix buffered read on big-endian systems
ftrace: Also allocate and copy hash for reading of filter files
fpga: zynq_fpga: Fix the wrong usage of dma_map_sgtable()
fs/buffer: fix use-after-free when call bh_read() helper
drm/amd/display: Fix fractional fb divider in set_pixel_clock_v3
media: venus: Add a check for packet size after reading from shared memory
media: ov2659: Fix memory leaks in ov2659_probe()
media: usbtv: Lock resolution while streaming
media: gspca: Add bounds checking to firmware parser
jbd2: prevent softlockup in jbd2_log_do_checkpoint()
PCI: endpoint: Fix configfs group removal on driver teardown
PCI: endpoint: Fix configfs group list head handling
mtd: rawnand: fsmc: Add missing check after DMA map
wifi: brcmsmac: Remove const from tbl_ptr parameter in wlc_lcnphy_common_read_table()
zynq_fpga: use sgtable-based scatterlist wrappers
ata: libata-scsi: Fix ata_to_sense_error() status handling
ext4: fix reserved gdt blocks handling in fsmap
ext4: fix fsmap end of range reporting with bigalloc
ext4: check fast symlink for ea_inode correctly
Revert "vgacon: Add check for vc_origin address range in vgacon_scroll()"
vt: defkeymap: Map keycodes above 127 to K_HOLE
usb: gadget: udc: renesas_usb3: fix device leak at unbind
usb: atm: cxacru: Merge cxacru_upload_firmware() into cxacru_heavy_init()
m68k: Fix lost column on framebuffer debug console
serial: 8250: fix panic due to PSLVERR
media: uvcvideo: Do not mark valid metadata as invalid
media: uvcvideo: Fix 1-byte out-of-bounds read in uvc_parse_format()
btrfs: fix log tree replay failure due to file with 0 links and extents
thunderbolt: Fix copy+paste error in match_service_id()
misc: rtsx: usb: Ensure mmc child device is active when card is present
scsi: lpfc: Remove redundant assignment to avoid memory leak
rtc: ds1307: remove clear of oscillator stop flag (OSF) in probe
pNFS: Fix uninited ptr deref in block/scsi layout
pNFS: Fix disk addr range check in block/scsi layout
pNFS: Fix stripe mapping in block/scsi layout
ipmi: Fix strcpy source and destination the same
kconfig: lxdialog: fix 'space' to (de)select options
kconfig: gconf: fix potential memory leak in renderer_edited()
kconfig: gconf: avoid hardcoding model2 in on_treeview2_cursor_changed()
scsi: aacraid: Stop using PCI_IRQ_AFFINITY
scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans
kconfig: nconf: Ensure null termination where strncpy is used
kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c
PCI: pnv_php: Work around switches with broken presence detection
media: uvcvideo: Fix bandwidth issue for Alcor camera
media: dvb-frontends: w7090p: fix null-ptr-deref in w7090p_tuner_write_serpar and w7090p_tuner_read_serpar
media: dvb-frontends: dib7090p: fix null-ptr-deref in dib7090p_rw_on_apb()
media: usb: hdpvr: disable zero-length read messages
media: tc358743: Increase FIFO trigger level to 374
media: tc358743: Return an appropriate colorspace from tc358743_set_fmt
media: tc358743: Check I2C succeeded during probe
pinctrl: stm32: Manage irq affinity settings
scsi: mpt3sas: Correctly handle ATA device errors
RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
MIPS: Don't crash in stack_top() for tasks without ABI or vDSO
jfs: upper bound check of tree index in dbAllocAG
jfs: Regular file corruption check
jfs: truncate good inode pages when hard link is 0
scsi: bfa: Double-free fix
MIPS: vpe-mt: add missing prototypes for vpe_{alloc,start,stop,free}
watchdog: dw_wdt: Fix default timeout
fs/orangefs: use snprintf() instead of sprintf()
scsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated
ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr
vhost: fail early when __vhost_add_used() fails
uapi: in6: restore visibility of most IPv6 socket options
net: ncsi: Fix buffer overflow in fetching version id
net: dsa: b53: fix b53_imp_vlan_setup for BCM5325
net: vlan: Replace BUG() with WARN_ON_ONCE() in vlan_dev_* stubs
wifi: iwlegacy: Check rate_idx range after addition
netmem: fix skb_frag_address_safe with unreadable skbs
wifi: rtlwifi: fix possible skb memory leak in `_rtl_pci_rx_interrupt()`.
wifi: iwlwifi: dvm: fix potential overflow in rs_fill_link_cmd()
net: fec: allow disable coalescing
(powerpc/512) Fix possible `dma_unmap_single()` on uninitialized pointer
s390/stp: Remove udelay from stp_sync_clock()
wifi: iwlwifi: mvm: fix scan request validation
net: thunderx: Fix format-truncation warning in bgx_acpi_match_id()
net: ipv4: fix incorrect MTU in broadcast routes
wifi: cfg80211: Fix interface type validation
et131x: Add missing check after DMA map
be2net: Use correct byte order and format string for TCP seq and ack_seq
s390/time: Use monotonic clock in get_cycles()
wifi: cfg80211: reject HTC bit for management frames
ktest.pl: Prevent recursion of default variable options
ASoC: codecs: rt5640: Retry DEVICE_ID verification
ALSA: usb-audio: Avoid precedence issues in mixer_quirks macros
ALSA: hda/ca0132: Fix buffer overflow in add_tuning_control
platform/x86: thinkpad_acpi: Handle KCOV __init vs inline mismatches
pm: cpupower: Fix the snapshot-order of tsc,mperf, clock in mperf_stop()
ALSA: intel8x0: Fix incorrect codec index usage in mixer for ICH4
ASoC: hdac_hdmi: Rate limit logging on connection and disconnection
mmc: rtsx_usb_sdmmc: Fix error-path in sd_set_power_mode()
ACPI: processor: fix acpi_object initialization
PM: sleep: console: Fix the black screen issue
thermal: sysfs: Return ENODATA instead of EAGAIN for reads
selftests: tracing: Use mutex_unlock for testing glob filter
ARM: tegra: Use I/O memcpy to write to IRAM
gpio: tps65912: check the return value of regmap_update_bits()
ASoC: soc-dapm: set bias_level if snd_soc_dapm_set_bias_level() was successed
cpufreq: Exit governor when failed to start old governor
usb: xhci: Avoid showing errors during surprise removal
usb: xhci: Set avg_trb_len = 8 for EP0 during Address Device Command
usb: xhci: Avoid showing warnings for dying controller
selftests/futex: Define SYS_futex on 32-bit architectures with 64-bit time_t
usb: xhci: print xhci->xhc_state when queue_command failed
securityfs: don't pin dentries twice, once is enough...
hfs: fix not erasing deleted b-tree node issue
drbd: add missing kref_get in handle_write_conflicts
arm64: Handle KCOV __init vs inline mismatches
hfsplus: don't use BUG_ON() in hfsplus_create_attributes_file()
hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()
hfsplus: fix slab-out-of-bounds in hfsplus_bnode_read()
hfs: fix slab-out-of-bounds in hfs_bnode_read()
sctp: linearize cloned gso packets in sctp_rcv
netfilter: ctnetlink: fix refcount leak on table dump
udp: also consider secpath when evaluating ipsec use for checksumming
fs: Prevent file descriptor table allocations exceeding INT_MAX
sunvdc: Balance device refcount in vdc_port_mpgroup_check
NFSD: detect mismatch of file handle and delegation stateid in OPEN op
net: dpaa: fix device leak when querying time stamp info
net: gianfar: fix device leak when querying time stamp info
netlink: avoid infinite retry looping in netlink_unicast()
ALSA: usb-audio: Validate UAC3 cluster segment descriptors
ALSA: usb-audio: Validate UAC3 power domain descriptors, too
usb: gadget : fix use-after-free in composite_dev_cleanup()
MIPS: mm: tlb-r4k: Uniquify TLB entries on init
USB: serial: option: add Foxconn T99W709
vsock: Do not allow binding to VMADDR_PORT_ANY
net/packet: fix a race in packet_set_ring() and packet_notifier()
perf/core: Prevent VMA split of buffer mappings
perf/core: Exit early on perf_mmap() fail
perf/core: Don't leak AUX buffer refcount on allocation failure
pptp: fix pptp_xmit() error path
smb: client: let recv_done() cleanup before notifying the callers.
benet: fix BUG when creating VFs
ipv6: reject malicious packets in ipv6_gso_segment()
pptp: ensure minimal skb length in pptp_xmit()
netpoll: prevent hanging NAPI when netcons gets enabled
NFS: Fix filehandle bounds checking in nfs_fh_to_dentry()
pci/hotplug/pnv-php: Wrap warnings in macro
pci/hotplug/pnv-php: Improve error msg on power state change failure
usb: chipidea: udc: fix sleeping function called from invalid context
f2fs: fix to avoid out-of-boundary access in devs.path
f2fs: fix to avoid UAF in f2fs_sync_inode_meta()
rtc: pcf8563: fix incorrect maximum clock rate handling
rtc: hym8563: fix incorrect maximum clock rate handling
rtc: ds1307: fix incorrect maximum clock rate handling
mtd: rawnand: atmel: set pmecc data setup time
mtd: rawnand: atmel: Fix dma_mapping_error() address
jfs: fix metapage reference count leak in dbAllocCtl
fbdev: imxfb: Check fb_add_videomode to prevent null-ptr-deref
crypto: qat - fix seq_file position update in adf_ring_next()
dmaengine: nbpfaxi: Add missing check after DMA map
dmaengine: mv_xor: Fix missing check after DMA map and missing unmap
fs/orangefs: Allow 2 more characters in do_c_string()
crypto: img-hash - Fix dma_unmap_sg() nents value
scsi: isci: Fix dma_unmap_sg() nents value
scsi: mvsas: Fix dma_unmap_sg() nents value
scsi: ibmvscsi_tgt: Fix dma_unmap_sg() nents value
perf tests bp_account: Fix leaked file descriptor
crypto: ccp - Fix crash when rebind ccp device for ccp.ko
pinctrl: sunxi: Fix memory leak on krealloc failure
power: supply: max14577: Handle NULL pdata when CONFIG_OF is not set
clk: davinci: Add NULL check in davinci_lpsc_clk_register()
mtd: fix possible integer overflow in erase_xfer()
crypto: marvell/cesa - Fix engine load inaccuracy
PCI: rockchip-host: Fix "Unexpected Completion" log message
vrf: Drop existing dst reference in vrf_ip6_input_dst
netfilter: xt_nfacct: don't assume acct name is null-terminated
can: kvaser_usb: Assign netdev.dev_port based on device channel index
wifi: brcmfmac: fix P2P discovery failure in P2P peer due to missing P2P IE
Reapply "wifi: mac80211: Update skb's control block key in ieee80211_tx_dequeue()"
mwl8k: Add missing check after DMA map
wifi: rtl8xxxu: Fix RX skb size for aggregation disabled
net/sched: Restrict conditions for adding duplicating netems to qdisc tree
arch: powerpc: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
netfilter: nf_tables: adjust lockdep assertions handling
drm/amd/pm/powerplay/hwmgr/smu_helper: fix order of mask and value
m68k: Don't unregister boot console needlessly
tcp: fix tcp_ofo_queue() to avoid including too much DUP SACK range
iwlwifi: Add missing check for alloc_ordered_workqueue
wifi: iwlwifi: Fix memory leak in iwl_mvm_init()
wifi: rtl818x: Kill URBs before clearing tx status queue
caif: reduce stack size, again
staging: nvec: Fix incorrect null termination of battery manufacturer
samples: mei: Fix building on musl libc
usb: early: xhci-dbc: Fix early_ioremap leak
Revert "vmci: Prevent the dispatching of uninitialized payloads"
pps: fix poll support
vmci: Prevent the dispatching of uninitialized payloads
staging: fbtft: fix potential memory leak in fbtft_framebuffer_alloc()
ARM: dts: vfxxx: Correctly use two tuples for timer address
ASoC: ops: dynamically allocate struct snd_ctl_elem_value
hfsplus: remove mutex_lock check in hfsplus_free_extents
ASoC: Intel: fix SND_SOC_SOF dependencies
ethernet: intel: fix building with large NR_CPUS
usb: phy: mxs: disconnect line when USB charger is attached
usb: chipidea: udc: protect usb interrupt enable
usb: chipidea: udc: add new API ci_hdrc_gadget_connect
comedi: comedi_test: Fix possible deletion of uninitialized timers
nilfs2: reject invalid file types when reading inodes
i2c: qup: jump out of the loop in case of timeout
net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class
net: appletalk: Fix use-after-free in AARP proxy probe
net: appletalk: fix kerneldoc warnings
RDMA/core: Rate limit GID cache warning messages
usb: hub: fix detection of high tier USB3 devices behind suspended hubs
net_sched: sch_sfq: reject invalid perturb period
net_sched: sch_sfq: move the limit validation
net_sched: sch_sfq: use a temporary work area for validating configuration
net_sched: sch_sfq: don't allow 1 packet limit
net_sched: sch_sfq: handle bigger packets
net_sched: sch_sfq: annotate data-races around q->perturb_period
power: supply: bq24190_charger: Fix runtime PM imbalance on error
xhci: Disable stream for xHC controller with XHCI_BROKEN_STREAMS
virtio-net: ensure the received length does not exceed allocated size
usb: dwc3: qcom: Don't leave BCR asserted
usb: musb: fix gadget state on disconnect
net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime
Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
Bluetooth: SMP: If an unallowed command is received consider it a failure
Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
usb: net: sierra: check for no status endpoint
net/sched: sch_qfq: Fix race condition on qfq_aggregate
net: emaclite: Fix missing pointer increment in aligned_read()
comedi: Fix use of uninitialized data in insn_rw_emulate_bits()
comedi: Fix some signed shift left operations
comedi: das6402: Fix bit shift out of bounds
comedi: das16m1: Fix bit shift out of bounds
comedi: aio_iiro_16: Fix bit shift out of bounds
comedi: pcl812: Fix bit shift out of bounds
iio: adc: max1363: Reorder mode_list[] entries
iio: adc: max1363: Fix MAX1363_4X_CHANS/MAX1363_8X_CHANS[]
soc: aspeed: lpc-snoop: Don't disable channels that aren't enabled
soc: aspeed: lpc-snoop: Cleanup resources in stack-order
mmc: sdhci-pci: Quirk for broken command queuing on Intel GLK-based Positivo models
memstick: core: Zero initialize id_reg in h_memstick_read_dev_id()
isofs: Verify inode mode when loading from disk
dmaengine: nbpfaxi: Fix memory corruption in probe()
af_packet: fix soft lockup issue caused by tpacket_snd()
af_packet: fix the SO_SNDTIMEO constraint not effective on tpacked_snd()
phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()
HID: core: do not bypass hid_hw_raw_request
HID: core: ensure __hid_request reserves the report ID as the first byte
HID: core: ensure the allocated report buffer can contain the reserved report ID
pch_uart: Fix dma_sync_sg_for_device() nents value
Input: xpad - set correct controller type for Acer NGR200
i2c: stm32: fix the device used for the DMA map
usb: gadget: configfs: Fix OOB read on empty string write
USB: serial: ftdi_sio: add support for NDI EMGUIDE GEMINI
USB: serial: option: add Foxconn T99W640
USB: serial: option: add Telit Cinterion FE910C04 (ECM) composition
dma-mapping: add generic helpers for mapping sgtable objects
usb: renesas_usbhs: Flush the notify_hotplug_work
gpio: rcar: Use raw_spinlock to protect register access
Conflicts:
Makefile
fs/f2fs/inode.c
mm/zsmalloc.c
Change-Id: If00246b113234f4ee7e5bb72cffd5d6f195de087
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().
This patch is generated using the following script:
EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"
git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do
	if [[ "$file" =~ $EXCLUDE_FILES ]]; then
		continue
	fi
	sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done
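For illustration, a small userspace sketch of what a sizeof_field() style
macro computes. The macro below follows the usual form of this idiom;
struct packet_hdr and packet_addr_len() are made-up names used only for
this example:

```c
#include <stddef.h>
#include <stdint.h>

/* Size of a struct member without needing an instance of the struct. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical structure used only for this example. */
struct packet_hdr {
	uint16_t type;
	uint8_t  addr[6];
	uint32_t seq;
};

/* Returns the size in bytes of the addr[] member. */
static size_t packet_addr_len(void)
{
	return sizeof_field(struct packet_hdr, addr);
}
```

The null pointer is never dereferenced, because sizeof only inspects the
type of its operand.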
Change-Id: I24296633f28fea05d12618c8e47dc8acb8df18d8
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net
[ Upstream commit 7a4b21250bf79eef26543d35bd390448646c536b ]
The stackmap code relies on roundup_pow_of_two() to compute the number
of hash buckets, and contains an overflow check by checking if the
resulting value is 0. However, on 32-bit arches, the roundup code itself
can overflow by doing a 32-bit left-shift of an unsigned long value,
which is undefined behaviour, so it is not guaranteed to truncate
neatly. This was triggered by syzbot on the DEVMAP_HASH type, which
contains the same check, copied from the hashtab code.
The commit in the fixes tag actually attempted to fix this, but the fix
did not account for the UB, so the fix only works on CPUs where an
overflow does result in a neat truncation to zero, which is not
guaranteed. Checking the value before rounding does not have this
problem.
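A minimal sketch of checking before rounding, written for userspace with
fixed 32-bit types. The helper names and the error codes chosen here are
assumptions for illustration, not the kernel code:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Round n up to the next power of two. Only valid for
 * 1 <= n <= 2^31; larger inputs have no representable result
 * in 32 bits, so callers must reject them first.
 */
static uint32_t roundup_pow_of_two_u32(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Validate max_entries before rounding, not after. Names are illustrative. */
static int compute_n_buckets(uint32_t max_entries, uint32_t *n_buckets)
{
	if (max_entries == 0)
		return -EINVAL;
	if (max_entries > (UINT32_C(1) << 31))
		return -E2BIG;
	*n_buckets = roundup_pow_of_two_u32(max_entries);
	return 0;
}
```

Because the range check happens first, the rounding helper is never
called with a value whose result would overflow, so correctness does not
depend on how an overflowing shift happens to behave on a given CPU.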
Fixes: 6183f4d3a0a2 ("bpf: Check for integer overflow when using roundup_pow_of_two()")
Change-Id: Id67d50b83af553ac5c1087ebded62c4526e95235
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com>
Message-ID: <20240307120340.99577-4-toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This vendor hook provides a point at which to check the currently
running task, validating its credentials and bpf operations.
Bug: 191291287
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: Ie4ed8df7ad66df2486fc7e52a26d9191fc0c176e
Header frame.h is getting more code annotations to help objtool analyze
object files.
Rename the file to objtool.h.
[ jpoimboe: add objtool.h to MAINTAINERS ]
Change-Id: I0b95a75fb3cfe673bf18d8d5a886b2809ea3b5f5
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
This reverts commit 788bbf4f261fc558b714bdedd4122d7115efc940.
Reason for revert: fixes a conflict with upcoming upstream BPF changes.
Bug: 145210207
Change-Id: I3bbc1279fc613be0d2e833008413ad3561b851df
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
commit aebfd12521d9c7d0b502cf6d06314cfbcdccfe3b upstream.
Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
with Intel IBT enabled the first instruction of a function will most
likely be ENDBR.
Change ftrace_location() to not only return the __fentry__ location
when called for the __fentry__ location, but also when called for the
sym+0 location.
Then audit/update all callsites of this function to consistently use
these new semantics.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Change-Id: I72966b96df528f86121b6f6c866b56bf09a4227f
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org
Stable-dep-of: e60b613df8b6 ("ftrace: Fix possible use-after-free issue in ftrace_location()")
[Shivani: Modified to apply on v5.10.y]
Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a direct ftrace callback is at a location that does not have any other
ftrace helpers attached to it, it is possible to simply just change the
text to call the new caller (if the architecture supports it). But this
requires special architecture code. Currently, modify_ftrace_direct() uses a
trick to add a stub ftrace callback to the location forcing it to call the
ftrace iterator. Then it can change the direct helper to call the new
function in C, and then remove the stub. Removing the stub will have the
location now call the new location that the direct helper is using.
The new helper function does the stub registration trick, but is a weak
function, allowing an architecture to override it to do something a bit more
direct.
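A small sketch of the weak-function pattern, assuming a GCC or Clang
toolchain. The function names and return value are hypothetical and do
not reflect the actual ftrace helper:

```c
/*
 * Generic fallback. The weak attribute (GCC/Clang) lets another object
 * file provide a strong definition with the same name, which the linker
 * then prefers. Name and return value are hypothetical.
 */
__attribute__((weak)) int arch_modify_direct_sketch(unsigned long ip,
						    unsigned long new_addr)
{
	(void)ip;
	(void)new_addr;
	return 0;	/* 0: the generic, stub-based path was used */
}

/* The caller does not know which definition it ends up linked against. */
static int modify_direct_sketch(unsigned long ip, unsigned long new_addr)
{
	return arch_modify_direct_sketch(ip, new_addr);
}
```

An architecture that can patch the call site directly would supply its
own non-weak definition of the same symbol; no change to the caller is
needed.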
Link: https://lore.kernel.org/r/20191115215125.mbqv7taqnx376yed@ast-mbp.dhcp.thefacebook.com
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Change-Id: I24ee1bebc80aa17ee382063b3cd7d58ea6126508
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
As function_graph tracer modifies the return address to insert a trampoline
to trace the return of a function, it must be aware of a direct caller, as
when it gets called, the function's return address may not be on the
stack where it expects. It may have to see if that return address points to
a direct caller and adjust if it does.
Change-Id: I80bd23932c426ec3b2db76538dacd283b972db0a
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Both unregister_ftrace_direct() and modify_ftrace_direct() need to
normalize the ip passed in to match the rec->ip, as it is acceptable to have
the ip on the ftrace call site but not the start. There are also common
validity checks on the record found by the ip; these should be done for
both unregister_ftrace_direct() and modify_ftrace_direct().
Change-Id: Ib74ac2ae2f0c9d6c261b409bd87ce9f908e7c8da
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add a vendor hook for bpf, so we can get the memory type and
use it to do a memory type check for architecture
dependent page table settings.
Bug: 181639260
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: Icac325a040fb88c7f6b04b2409029b623bd8515f
This change ensures that the set*uid family of syscalls in kernel/sys.c
(setreuid, setuid, setresuid, setfsuid) all call ns_capable_common with
the CAP_OPT_INSETID flag, so capability checks in the security_capable
hook can know whether they are being called from within a set*uid
syscall. This change is a no-op by itself, but is needed for the
proposed SafeSetID LSM.
Change-Id: Ie661692d340f57b74c5cd6623159c028795d481f
Signed-off-by: Micah Morton <mortonm@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
Fix NULL pointer dereference when adding a new psi monitor to the root
cgroup. PSI files for the root cgroup were introduced in df5ba5be742 by
using the system wide psi struct when reading, but file write/monitor was not
properly fixed. Since the PSI config for the root cgroup isn't
initialized, the current implementation tries to lock a NULL ptr,
resulting in a crash.
Can be triggered by running this as root:
$ tee /sys/fs/cgroup/cpu.pressure <<< "some 10000 1000000"
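A heavily simplified sketch of the idea, assuming the write/monitor path
is made to select the system wide psi struct for the root cgroup in the
same way the read path does. All types and names here are hypothetical
stand-ins, not the kernel's definitions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct psi_group {
	int nr_triggers;
};

struct cgroup {
	bool is_root;
	struct psi_group *psi;	/* never initialized (NULL) for the root */
};

/* System wide PSI state, assumed here to back the root cgroup. */
static struct psi_group psi_system;

/* Pick the psi_group for reads and for writes alike. */
static struct psi_group *cgroup_psi_sketch(struct cgroup *cgrp)
{
	return cgrp->is_root ? &psi_system : cgrp->psi;
}

/* Adding a trigger goes through the same selection, so root is safe. */
static int psi_add_trigger_sketch(struct cgroup *cgrp)
{
	struct psi_group *group = cgroup_psi_sketch(cgrp);

	if (!group)
		return -1;
	return ++group->nr_triggers;
}
```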
Change-Id: Id2137e41eab9efa13f52ac2afa937f66b69847e1
Signed-off-by: Odin Ugedal <odin@uged.al>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Dan Schatzberg <dschatzberg@fb.com>
Fixes: df5ba5be7425 ("kernel/sched/psi.c: expose pressure metrics on root cgroup")
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org # 5.2+
Signed-off-by: Tejun Heo <tj@kernel.org>
Similar to the commit d7495343228f ("cgroup: fix incorrect
WARN_ON_ONCE() in cgroup_setup_root()"), cgroup_id(root_cgrp) does not
equal 1 on 32bit ino archs, which triggers all sorts of issues with
psi_show() on s390x. For example,
BUG: KASAN: slab-out-of-bounds in collect_percpu_times+0x2d0/
Read of size 4 at addr 000000001e0ce000 by task read_all/3667
collect_percpu_times+0x2d0/0x798
psi_show+0x7c/0x2a8
seq_read+0x2ac/0x830
vfs_read+0x92/0x150
ksys_read+0xe2/0x188
system_call+0xd8/0x2b4
Fix it by using cgroup_ino().
Fixes: 743210386c03 ("cgroup: use cgrp->kn->id as the cgroup ID")
Change-Id: Iefbc4965c651541e1bb1b23ac67c991d913ed5a7
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v5.5
Introduce XDP_REDIRECT support for eBPF programs attached to cpumap
entries.
This patch has been tested on Marvell ESPRESSObin using a modified
version of xdp_redirect_cpu sample in order to attach a XDP program
to CPUMAP entries to perform a redirect on the mvneta interface.
In particular the following scenario has been tested:
rq (cpu0) --> mvneta - XDP_REDIRECT (cpu0) --> CPUMAP - XDP_REDIRECT (cpu1) --> mvneta
$./xdp_redirect_cpu -p xdp_cpu_map0 -d eth0 -c 1 -e xdp_redirect \
-f xdp_redirect_kern.o -m tx_port -r eth0
tx: 285.2 Kpps rx: 285.2 Kpps
Attaching a simple XDP program on eth0 to perform XDP_TX gives
comparable results:
tx: 288.4 Kpps rx: 288.4 Kpps
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Change-Id: I485e490be5fc60d91fb37a4e9b694b72ef83c627
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/2cf8373a731867af302b00c4ff16c122630c4980.1594734381.git.lorenzo@kernel.org
[ Upstream commit ff40e51043af63715ab413995ff46996ecf9583f ]
Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
added an implementation of the locked_down LSM hook to SELinux, with the aim
to restrict which domains are allowed to perform operations that would breach
lockdown. This also indirectly gets the audit subsystem involved to report
events. The latter is problematic, as reported by Ondrej and Serhei, since it
can bring down the whole system via audit:
1) The audit events that are triggered due to calls to security_locked_down()
can OOM kill a machine, see below details [0].
2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit()
when trying to wake up kauditd, for example, when using trace_sched_switch()
tracepoint, see details in [1]. Triggering this was not via some hypothetical
corner case, but with existing tools like runqlat & runqslower from bcc, for
example, which make use of this tracepoint. Rough call sequence goes like:
rq_lock(rq) -> -------------------------+
trace_sched_switch() -> |
bpf_prog_xyz() -> +-> deadlock
selinux_lockdown() -> |
audit_log_end() -> |
wake_up_interruptible() -> |
try_to_wake_up() -> |
rq_lock(rq) --------------+
What's worse is that the intention of 59438b46471a to further restrict lockdown
settings for specific applications in respect to the global lockdown policy is
completely broken for BPF. The SELinux policy rule for the current lockdown check
looks something like this:
allow <who> <who> : lockdown { <reason> };
However, this doesn't match the 'current' task where security_locked_down()
is executed. For example: httpd does a syscall. There is a tracing program attached
to the syscall which triggers a BPF program to run, which ends up doing a
bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does
the permission check against 'current', that is, httpd in this example. httpd
has literally zero relation to this tracing program, and it would be nonsensical
having to write an SELinux policy rule against httpd to let the tracing helper
pass. The policy in this case needs to be against the entity that is installing
the BPF program. For example, if bpftrace would generate a histogram of syscall
counts by user space application:
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
bpftrace would then go and generate a BPF program from this internally. One way
of doing it [for the sake of the example] could be to call bpf_get_current_task()
helper and then access current->comm via one of bpf_probe_read_kernel{,_str}()
helpers. So the program itself has nothing to do with httpd or any other random
app doing a syscall here. The BPF program _explicitly initiated_ the lockdown
check. The allow/deny policy belongs in the context of bpftrace: meaning, you
want to grant bpftrace access to use these helpers, but other tracers on the
system like my_random_tracer _not_.
Therefore fix all three issues at the same time by taking a completely different
approach for the security_locked_down() hook, that is, move the check into the
program verification phase where we actually retrieve the BPF func proto. This
also reliably gets the task (current) that is trying to install the BPF tracing
program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since
we're moving this out of the BPF helper's fast-path which can be called several
millions of times per second.
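To illustrate the shape of this change (not the actual patch), here is a minimal user-space sketch. It assumes a hypothetical helper table in which the lockdown check runs once, when the loader asks for the helper, while the helper's fast path carries no check. The names model_locked_down(), get_func_proto_model() and probe_read_helper_model() are made up for this example and are not kernel APIs.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
static int lockdown_active;  /* 1 = confidentiality lockdown in force */
static long lockdown_checks; /* how often the (audited) check ran */

static int model_locked_down(void)
{
	lockdown_checks++;
	return lockdown_active ? -1 : 0;
}

typedef long (*helper_fn)(long arg);

/* Fast path: may run millions of times per second, so no check here. */
static long probe_read_helper_model(long arg)
{
	return arg;
}

/* Load time: runs once per program, in the context of the loading task. */
static helper_fn get_func_proto_model(void)
{
	if (model_locked_down() < 0)
		return NULL; /* the program is rejected at load time */
	return probe_read_helper_model;
}
```

In this model the number of lockdown checks depends on how many programs are loaded, not on how often the helper is called.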
The check is then also in line with other security_locked_down() hooks in the
system where the enforcement is performed at open/load time, for example,
open_kcore() for /proc/kcore access or module_sig_check() for module signatures
just to pick a few random ones. What's out of scope in the fix as well as in
other security_locked_down() hook locations /outside/ of BPF subsystem is that
if the lockdown policy changes on the fly there is no retrospective action.
This requires a different discussion, potentially complex infrastructure, and
it's also not clear whether this can be solved generically. Either way, it is
out of scope for a suitable stable fix which this one is targeting. Note that
the breakage is specifically on 59438b46471a where it started to rely on 'current'
as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5cf42 ("bpf:
Restrict bpf when kernel lockdown is in confidentiality mode").
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says:
I starting seeing this with F-34. When I run a container that is traced with
BPF to record the syscalls it is doing, auditd is flooded with messages like:
type=AVC msg=audit(1619784520.593:282387): avc: denied { confidentiality }
for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM"
scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0
tclass=lockdown permissive=0
This seems to be leading to auditd running out of space in the backlog buffer
and eventually OOMs the machine.
[...]
auditd running at 99% CPU presumably processing all the messages, eventually I get:
Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64
Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64
Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64
Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64
Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1
Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[...]
[1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/,
Serhei Makarov says:
Upstream kernel 5.11.0-rc7 and later was found to deadlock during a
bpf_probe_read_compat() call within a sched_switch tracepoint. The problem
is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend
testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on
ppc64le. Example stack trace:
[...]
[ 730.868702] stack backtrace:
[ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1
[ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
[ 730.873278] Call Trace:
[ 730.873770] dump_stack+0x7f/0xa1
[ 730.874433] check_noncircular+0xdf/0x100
[ 730.875232] __lock_acquire+0x1202/0x1e10
[ 730.876031] ? __lock_acquire+0xfc0/0x1e10
[ 730.876844] lock_acquire+0xc2/0x3a0
[ 730.877551] ? __wake_up_common_lock+0x52/0x90
[ 730.878434] ? lock_acquire+0xc2/0x3a0
[ 730.879186] ? lock_is_held_type+0xa7/0x120
[ 730.880044] ? skb_queue_tail+0x1b/0x50
[ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90
[ 730.881656] ? __wake_up_common_lock+0x52/0x90
[ 730.882532] __wake_up_common_lock+0x52/0x90
[ 730.883375] audit_log_end+0x5b/0x100
[ 730.884104] slow_avc_audit+0x69/0x90
[ 730.884836] avc_has_perm+0x8b/0xb0
[ 730.885532] selinux_lockdown+0xa5/0xd0
[ 730.886297] security_locked_down+0x20/0x40
[ 730.887133] bpf_probe_read_compat+0x66/0xd0
[ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820
[ 730.888917] trace_call_bpf+0xe9/0x240
[ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0
[ 730.890579] perf_trace_sched_switch+0x142/0x180
[ 730.891485] ? __schedule+0x6d8/0xb20
[ 730.892209] __schedule+0x6d8/0xb20
[ 730.892899] schedule+0x5b/0xc0
[ 730.893522] exit_to_user_mode_prepare+0x11d/0x240
[ 730.894457] syscall_exit_to_user_mode+0x27/0x70
[ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae
[...]
Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Reported-by: Jakub Hrozek <jhrozek@redhat.com>
Reported-by: Serhei Makarov <smakarov@redhat.com>
Reported-by: Jiri Olsa <jolsa@redhat.com>
Change-Id: Ie9ec178238bbe62a9e284d52eecff1ee569b307d
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jamorris@linux.microsoft.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Frank Eigler <fche@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e21aa341785c679dd409c8cb71f864c00fe6c463 ]
The fexit/fmod_ret programs can be attached to kernel functions that can sleep.
The synchronize_rcu_tasks() will not wait for such tasks to complete.
In such a case, the trampoline image will be freed, and when the task
wakes up the return IP will point to freed memory causing the crash.
Solve this by adding percpu_ref_get/put for the duration of trampoline
and separate trampoline vs its image life times.
The "half page" optimization has to be removed, since
first_half->second_half->first_half transition cannot be guaranteed to
complete in deterministic time. Every trampoline update becomes a new image.
The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and
call_rcu_tasks. Together they will wait for the original function and
trampoline asm to complete. The trampoline is patched from nop to jmp to skip
fexit progs. They are freed independently from the trampoline. The image with
fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which
will wait for both sleepable and non-sleepable progs to complete.
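The separation of trampoline and image lifetimes can be pictured with a simplified model. The sketch below uses a plain integer counter instead of percpu_ref and leaves out RCU entirely, so it only shows the ownership idea: each update creates a new image, and an old image is released only after the last task running inside it has left. All type and function names here are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical model of one trampoline image; not a kernel structure. */
struct tramp_image {
	int refs;      /* one owner reference plus one per task inside the image */
	int *released; /* set to 1 on release so a caller can observe it */
};

static struct tramp_image *image_alloc(int *released)
{
	struct tramp_image *im = malloc(sizeof(*im));

	if (!im)
		return NULL;
	im->refs = 1; /* owner reference held by the trampoline */
	im->released = released;
	return im;
}

/* A task enters the image (for example, before a sleepable function). */
static void image_get(struct tramp_image *im)
{
	im->refs++;
}

/* A task leaves the image, or the owner drops its reference. */
static void image_put(struct tramp_image *im)
{
	if (--im->refs == 0) {
		*im->released = 1;
		free(im);
	}
}

struct trampoline_model {
	struct tramp_image *cur;
};

/* Every update installs a fresh image and drops the owner reference
 * on the old one; the old image survives while tasks still use it. */
static int trampoline_update(struct trampoline_model *tr, int *released)
{
	struct tramp_image *new_im = image_alloc(released);
	struct tramp_image *old = tr->cur;

	if (!new_im)
		return -1;
	tr->cur = new_im;
	if (old)
		image_put(old);
	return 0;
}
```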
Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Change-Id: I7a377e1e19cf91b796e56159b875524154c49c57
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Paul E. McKenney <paulmck@kernel.org> # for RCU
Link: https://lore.kernel.org/bpf/20210316210007.38949-1-alexei.starovoitov@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4013aef2ced9b756a410f50d12df9ebe6a883e4a ]
When calling ftrace_dump_one() concurrently with reading trace_pipe,
a WARN_ON_ONCE() in trace_printk_seq() can be triggered due to a race
condition.
The issue occurs because:
 CPU0 (ftrace_dump)                              CPU1 (reader)
 echo z > /proc/sysrq-trigger
 !trace_empty(&iter)
 trace_iterator_reset(&iter) <- len = size = 0
                                                 cat /sys/kernel/tracing/trace_pipe
 trace_find_next_entry_inc(&iter)
   __find_next_entry
     ring_buffer_empty_cpu <- all empty
   return NULL
 trace_printk_seq(&iter.seq)
   WARN_ON_ONCE(s->seq.len >= s->seq.size)
In the window between trace_empty() and trace_find_next_entry_inc()
during ftrace_dump, the ring buffer data was consumed by other readers.
This caused trace_find_next_entry_inc to return NULL, failing to populate
`iter.seq`. At this point, due to the prior trace_iterator_reset, both
`iter.seq.len` and `iter.seq.size` were set to 0. Since they are equal,
the WARN_ON_ONCE condition is triggered.
Move the trace_printk_seq() into the if block that checks to make sure the
return value of trace_find_next_entry_inc() is non-NULL in
ftrace_dump_one(), ensuring the 'iter.seq' is properly populated before
subsequent operations.
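The effect of moving the print into the guarded branch can be sketched with a small user-space model. It assumes simplified stand-ins for the iterator and seq buffer, and the names model_dump_step(), model_printk_seq() and warn_count are hypothetical. The flag guarded selects between the old unconditional print and the fixed behaviour.

```c
#include <stddef.h>

struct seq_model {
	size_t len;
	size_t size;
};

struct iter_model {
	struct seq_model seq;
	int entries_left; /* entries still in the ring buffer */
};

static int warn_count; /* counts would-be WARN_ON_ONCE hits */

/* Models the reset: len = size = 0 until an entry is formatted. */
static void model_iterator_reset(struct iter_model *it)
{
	it->seq.len = 0;
	it->seq.size = 0;
}

/* Returns 1 and populates seq if an entry exists, 0 if the buffer is empty. */
static int model_find_next_entry_inc(struct iter_model *it)
{
	if (it->entries_left == 0)
		return 0;
	it->entries_left--;
	it->seq.size = 4096;
	it->seq.len = 16;
	return 1;
}

/* Stand-in for the print step: complains when seq was never populated. */
static void model_printk_seq(struct seq_model *s)
{
	if (s->len >= s->size)
		warn_count++;
}

static void model_dump_step(struct iter_model *it, int guarded)
{
	model_iterator_reset(it);
	if (model_find_next_entry_inc(it)) {
		model_printk_seq(&it->seq);
	} else if (!guarded) {
		/* old behaviour: print even though nothing was formatted */
		model_printk_seq(&it->seq);
	}
}
```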
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ingo Molnar <mingo@elte.hu>
Link: https://lore.kernel.org/20250822033343.3000289-1-wutengda@huaweicloud.com
Fixes: d769041f86 ("ring_buffer: implement new locking")
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>
[ Upstream commit 4266e8fa56d3d982bf451d382a410b9db432015c ]
When the computer enters a sleep state without a monitor
connected, the system switches the console to the virtual
terminal tty63 (SUSPEND_CONSOLE).
If a monitor is subsequently connected before waking up,
the system skips the required VT restoration process
during wake-up, leaving the console on tty63 instead of
switching back to tty1.
To fix this issue, a global flag vt_switch_done is introduced
to record whether the system has successfully switched to
the suspend console via vt_move_to_console() during suspend.
If the switch was completed, vt_switch_done is set to 1.
Later during resume, this flag is checked to ensure that
the original console is restored properly by calling
vt_move_to_console(orig_fgconsole, 0).
This prevents scenarios where the resume logic skips console
restoration due to incorrect detection of the console state,
especially when a monitor is reconnected before waking up.
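A minimal model of the flag logic follows, assuming a hypothetical switch_required variable that stands in for the kernel's decision on whether a VT switch is needed (which, per the description, can change between suspend and resume). The function names are illustrative and do not match the kernel's.

```c
#define SUSPEND_CONSOLE_M 63 /* tty63, as in the description */

static int fg_console = 1;      /* current foreground VT (model) */
static int orig_fgconsole = -1; /* VT to return to after resume */
static int vt_switch_done;      /* set when suspend really switched VTs */
static int switch_required;     /* hypothetical: may change while asleep */

/* Switch the foreground VT and report the previous one. */
static int move_to_console_model(int vt)
{
	int old = fg_console;

	fg_console = vt;
	return old;
}

static void prepare_console_model(void)
{
	if (!switch_required)
		return;
	orig_fgconsole = move_to_console_model(SUSPEND_CONSOLE_M);
	vt_switch_done = 1;
}

static void restore_console_model(void)
{
	/* Restore if a switch is required now OR one was done at suspend. */
	if (!switch_required && !vt_switch_done)
		return;
	if (orig_fgconsole >= 0)
		move_to_console_model(orig_fgconsole);
	vt_switch_done = 0;
}
```

Without the vt_switch_done term in the resume check, a monitor plugged in during sleep would clear switch_required and leave the console on tty63.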
Signed-off-by: tuhaowen <tuhaowen@uniontech.com>
Link: https://patch.msgid.link/20250611032345.29962-1-tuhaowen@uniontech.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>
commit b024d7b56c77191cde544f838debb7f8451cd0d6 upstream.
The perf mmap code is careful about mmap()'ing the user page with the
ringbuffer and additionally the auxiliary buffer, when the event supports
it. Once the first mapping is established, subsequent mappings have to use
the same offset and the same size in both cases. The reference counting for
the ringbuffer and the auxiliary buffer depends on this being correct.
However, perf does not prevent a related mapping from being split via mmap(2),
munmap(2) or mremap(2). A split of a VMA results in perf_mmap_open() calls,
which take reference counts, but then the subsequent perf_mmap_close()
calls no longer fulfill the offset and size checks. This leads to
reference count leaks.
As perf already has the requirement for subsequent mappings to match the
initial mapping, the obvious consequence is that VMA splits, caused by
resizing of a mapping or partial unmapping, have to be prevented.
Implement the vm_operations_struct::may_split() callback and return
unconditionally -EINVAL.
That ensures that the mapping offsets and sizes cannot be changed after the
fact. Remapping to a different fixed address with the same size is still
possible as it takes the references for the new mapping and drops those of
the old mapping.
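The veto mechanism can be sketched in user space as follows. The structures below are simplified models, not the kernel's vm_area_struct or vm_operations_struct; only the callback name may_split and the -EINVAL return value come from the description above.

```c
#include <errno.h>
#include <stddef.h>

struct vma_model;

struct vm_ops_model {
	/* Return 0 to allow a split at addr, or a negative errno to veto it. */
	int (*may_split)(struct vma_model *vma, unsigned long addr);
};

struct vma_model {
	unsigned long start;
	unsigned long end;
	const struct vm_ops_model *ops;
};

/* Model of the perf callback: splitting is never allowed. */
static int perf_model_may_split(struct vma_model *vma, unsigned long addr)
{
	(void)vma;
	(void)addr;
	return -EINVAL;
}

static const struct vm_ops_model perf_model_ops = {
	.may_split = perf_model_may_split,
};

/* Generic split path: consult the callback before touching the mapping. */
static int split_vma_model(struct vma_model *vma, unsigned long addr,
			   struct vma_model *tail)
{
	if (addr <= vma->start || addr >= vma->end)
		return -EINVAL;
	if (vma->ops && vma->ops->may_split) {
		int err = vma->ops->may_split(vma, addr);

		if (err)
			return err;
	}
	*tail = *vma;
	tail->start = addr;
	vma->end = addr;
	return 0;
}
```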
Fixes: 45bfb2e504 ("perf/core: Add AUX area to ring buffer for raw data streams")
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27504
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>
commit 07091aade394f690e7b655578140ef84d0e8d7b0 upstream.
When perf_mmap() fails to allocate a buffer, it still invokes the
event_mapped() callback of the related event. On X86 this might increase
the perf_rdpmc_allowed reference counter. But nothing undoes this as
perf_mmap_close() is never called in this case, which causes another
reference count leak.
Return early on failure to prevent that.
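As a rough illustration of the early return, the following model assumes a hypothetical counter rdpmc_allowed_model in place of perf_rdpmc_allowed and a flag that forces the buffer allocation to fail. The function names are made up for this sketch.

```c
#include <errno.h>

static int rdpmc_allowed_model; /* stands in for perf_rdpmc_allowed */

static void event_mapped_model(void)
{
	rdpmc_allowed_model++;
}

static void event_unmapped_model(void)
{
	rdpmc_allowed_model--;
}

static int mmap_model(int alloc_fails)
{
	int ret = alloc_fails ? -ENOMEM : 0;

	if (ret)
		return ret; /* nothing was mapped, so no mapped callback */
	event_mapped_model();
	return 0;
}

/* Only reached for mappings that were actually established. */
static void munmap_model(void)
{
	event_unmapped_model();
}
```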
Fixes: 1e0fb9ec67 ("perf/core: Add pmu callbacks to track event mapping and unmapping")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>
commit 5468c0fbccbb9d156522c50832244a8b722374fb upstream.
Failure of the AUX buffer allocation leaks the reference count.
Set the reference count to 1 only when the allocation succeeds.
Fixes: 45bfb2e504 ("perf/core: Add AUX area to ring buffer for raw data streams")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>
version 4.19.325-cip123
* tag 'v4.19.325-cip123' of https://git.kernel.org/pub/scm/linux/kernel/git/cip/linux-cip: (2182 commits)
CIP: Bump version suffix to -cip123 after merge from cip/linux-4.19.y-st tree
Update localversion-st, tree is up-to-date with 5.4.296.
emulex/benet: Fix build by return mismatch in be_cmd_unlock()
net/sched: Abort __tc_modify_qdisc if parent class does not exist
mtk-sd: Prevent memory corruption from DMA map failure
mmc: mediatek: use data instead of mrq parameter from msdc_{un}prepare_data()
scsi: qla4xxx: Fix missing DMA mapping error in qla4xxx_alloc_pdu()
btrfs: don't abort filesystem when attempting to snapshot deleted subvolume
VMCI: fix race between vmci_host_setup_notify and vmci_ctx_unset_notify
net: ipv6: Discard next-hop MTU less than minimum link MTU
Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID
HID: quirks: Add quirk for 2 Chicony Electronics HP 5MP Cameras
HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY
vt: add missing notification when switching back to text mode
net: usb: qmi_wwan: add SIMCom 8230C composition
atm: idt77252: Add missing `dma_map_error()`
bnxt_en: Fix DCB ETS validation
can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level
net: appletalk: Fix device refcount leak in atrtr_create()
md/raid1: Fix stack memory use after return in raid1_reshape
...
Conflicts:
Documentation/devicetree/bindings/arm/shmobile.txt
Documentation/devicetree/bindings/ata/sata_rcar.txt
Documentation/devicetree/bindings/clock/renesas,cpg-mssr.txt
Documentation/devicetree/bindings/display/bridge/renesas,dw-hdmi.txt
Documentation/devicetree/bindings/display/bridge/renesas,lvds.txt
Documentation/devicetree/bindings/display/renesas,du.txt
Documentation/devicetree/bindings/dma/renesas,rcar-dmac.txt
Documentation/devicetree/bindings/dma/renesas,usb-dmac.txt
Documentation/devicetree/bindings/gpio/renesas,gpio-rcar.txt
Documentation/devicetree/bindings/i2c/renesas,i2c.txt
Documentation/devicetree/bindings/i2c/renesas,iic.txt
Documentation/devicetree/bindings/interrupt-controller/renesas,irqc.txt
Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt
Documentation/devicetree/bindings/media/rcar_vin.txt
Documentation/devicetree/bindings/media/renesas,fcp.txt
Documentation/devicetree/bindings/media/renesas,rcar-csi2.txt
Documentation/devicetree/bindings/media/renesas,vsp1.txt
Documentation/devicetree/bindings/mmc/tmio_mmc.txt
Documentation/devicetree/bindings/net/can/rcar_can.txt
Documentation/devicetree/bindings/net/can/rcar_canfd.txt
Documentation/devicetree/bindings/net/renesas,ravb.txt
Documentation/devicetree/bindings/pci/rcar-pci.txt
Documentation/devicetree/bindings/phy/rcar-gen3-phy-usb2.txt
Documentation/devicetree/bindings/phy/rcar-gen3-phy-usb3.txt
Documentation/devicetree/bindings/pinctrl/renesas,pfc-pinctrl.txt
Documentation/devicetree/bindings/power/renesas,rcar-sysc.txt
Documentation/devicetree/bindings/pwm/renesas,pwm-rcar.txt
Documentation/devicetree/bindings/reset/renesas,rst.txt
Documentation/devicetree/bindings/serial/renesas,sci-serial.txt
Documentation/devicetree/bindings/sound/renesas,rsnd.txt
Documentation/devicetree/bindings/spi/sh-msiof.txt
Documentation/devicetree/bindings/thermal/rcar-gen3-thermal.txt
Documentation/devicetree/bindings/thermal/rcar-thermal.txt
Documentation/devicetree/bindings/timer/renesas,cmt.txt
Documentation/devicetree/bindings/timer/renesas,tmu.txt
Documentation/devicetree/bindings/trivial-devices.txt
Documentation/devicetree/bindings/usb/renesas,usb3-peri.txt
Documentation/devicetree/bindings/usb/renesas,usbhs.txt
Documentation/devicetree/bindings/usb/usb-xhci.txt
Documentation/devicetree/bindings/vendor-prefixes.txt
Documentation/devicetree/bindings/watchdog/renesas,wdt.txt
drivers/clk/qcom/clk-alpha-pll.c
drivers/hid/hid-ids.h
drivers/irqchip/irq-gic-v3.c
drivers/mmc/host/sdhci.h
drivers/platform/x86/intel_cht_int33fe.c
drivers/usb/dwc3/core.c
drivers/usb/dwc3/gadget.c
drivers/usb/gadget/function/f_fs.c
drivers/usb/typec/hd3ss3220.c
drivers/usb/typec/mux.c
kernel/time/posix-timers.c
mm/oom_kill.c
Change-Id: I9d0df93b34a99e5a7071f60b3dfd2fb5d943e6c7
Open code it in __bpf_map_area_alloc, which is the only caller. Also
clean up __bpf_map_area_alloc to have a single vmalloc call with slightly
different flags instead of the current two different calls.
For this to compile for the nommu case add a __vmalloc_node_range stub to
nommu.c.
[akpm@linux-foundation.org: fix nommu.c build]
Change-Id: Ic44eb2e1ae04523a6d32533c8d5bb15a281f1914
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200414131348.444715-27-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Just use __vmalloc_node instead, which gets an extra argument. To be able
to use __vmalloc_node in all callers, make it available outside of
vmalloc and implement it in nommu.c.
[akpm@linux-foundation.org: fix nommu build]
Change-Id: Ie9839e4ad6222037f0e9697e77e0395f92a8ccef
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200414131348.444715-25-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_NET is set but CONFIG_INET isn't, build fails with:
ld: kernel/bpf/net_namespace.o: in function `netns_bpf_attach_type_unneed':
kernel/bpf/net_namespace.c:32: undefined reference to `bpf_sk_lookup_enabled'
ld: kernel/bpf/net_namespace.o: in function `netns_bpf_attach_type_need':
kernel/bpf/net_namespace.c:43: undefined reference to `bpf_sk_lookup_enabled'
This is because without CONFIG_INET bpf_sk_lookup_enabled symbol is not
available. Wrap references to bpf_sk_lookup_enabled with preprocessor
conditionals.
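The pattern can be sketched as below. CONFIG_INET is intentionally left undefined to model the failing configuration, and the enum, counter and function names are hypothetical stand-ins for the real symbols.

```c
/* CONFIG_INET is intentionally left undefined here. */

enum attach_type_model {
	ATTACH_FLOW_DISSECTOR_M,
	ATTACH_SK_LOOKUP_M,
};

#ifdef CONFIG_INET
static int sk_lookup_enabled_m; /* only exists when INET support is built */
#endif

/* Returns 1 if the enable counter was bumped, 0 if nothing needed doing. */
static int attach_type_need_model(enum attach_type_model type)
{
	switch (type) {
#ifdef CONFIG_INET
	case ATTACH_SK_LOOKUP_M:
		sk_lookup_enabled_m++;
		return 1;
#endif
	default:
		return 0;
	}
}
```

With the case label compiled out, no reference to the missing symbol remains and the attach type falls through to the default branch.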
Fixes: 1559b4aa1db4 ("inet: Run SK_LOOKUP BPF program on socket lookup")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Change-Id: I9d13ff7b307142e5aa0d5e7ff6adbca6e55dab95
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/bpf/20200721100716.720477-1-jakub@cloudflare.com
Convert the bpf filesystem to the new internal mount API as the old
one will be obsoleted and removed. This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.
See Documentation/filesystems/mount_api.txt for more information.
Change-Id: Ia3063e999314c1b2c0d3f1f659e0e31141490c95
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Alexei Starovoitov <ast@kernel.org>
cc: Daniel Borkmann <daniel@iogearbox.net>
cc: Martin KaFai Lau <kafai@fb.com>
cc: Song Liu <songliubraving@fb.com>
cc: Yonghong Song <yhs@fb.com>
cc: netdev@vger.kernel.org
cc: bpf@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ Upstream commit d203b0fd863a2261e5d00b97f3d060c4c2a6db71 ]
Instead of relying on current env->pass_cnt, use the seen count from the
old aux data in adjust_insn_aux_data(), and expand it to the new range of
patched instructions. This change is valid given we always expand 1:n
with n>=1, so what applies to the old/original instruction needs to apply
for the replacement as well.
Not relying on env->pass_cnt is a prerequisite for a later change where we
want to avoid marking an instruction seen when verified under speculative
execution path.
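A small model of the 1:n expansion follows. It assumes a stripped-down aux record with only a seen field; the helper name expand_aux_model() and its signature are hypothetical. The caller owns the returned array and frees it.

```c
#include <stdlib.h>
#include <string.h>

struct aux_model {
	int seen;
};

/* Replace the single entry at off with cnt entries (cnt >= 1). The new
 * array has old_len + cnt - 1 entries, and every replacement entry
 * inherits the seen value of the original instruction. */
static struct aux_model *expand_aux_model(const struct aux_model *old,
					  size_t old_len, size_t off,
					  size_t cnt)
{
	struct aux_model *new_data;
	size_t i;

	if (cnt < 1 || off >= old_len)
		return NULL;
	new_data = calloc(old_len + cnt - 1, sizeof(*new_data));
	if (!new_data)
		return NULL;
	/* entries before the patched instruction are copied unchanged */
	memcpy(new_data, old, off * sizeof(*old));
	/* the patched range takes over the old seen count */
	for (i = off; i < off + cnt; i++)
		new_data[i].seen = old[off].seen;
	/* entries after the patched instruction shift by cnt - 1 */
	memcpy(new_data + off + cnt, old + off + 1,
	       (old_len - off - 1) * sizeof(*old));
	return new_data;
}
```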
Change-Id: Ia1a3e3c49c1c3154b56a67134aca79a1e2e106d7
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 4bb7ea946a370707315ab774432963ce47291946 upstream.
Fix an edge case in __mark_chain_precision() which prematurely stops
backtracking instructions in a state if it happens that state's first
and last instruction indexes are the same. This situation doesn't
necessarily mean that there were no instructions simulated in a state,
but rather that we started from the instruction, jumped around a bit,
and then ended up at the same instruction before checkpointing or
marking precision.
To distinguish between these two possible situations, we need to consult
jump history. If it's empty or contains a single record "bridging" parent
state and first instruction of processed state, then we indeed
backtracked all instructions in this state. But if history is not empty,
we are definitely not done yet.
Move this logic inside get_prev_insn_idx() to contain it more nicely.
Use -ENOENT return code to denote "we are out of instructions"
situation.
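Based only on the description above, such a helper could look roughly like the following user-space sketch. The structures are simplified models and the name prev_insn_idx_model() is hypothetical; the history argument holds the number of jump records not yet consumed.

```c
#include <errno.h>
#include <stddef.h>

struct jmp_rec_model {
	int idx;      /* instruction that was jumped to */
	int prev_idx; /* instruction the jump came from */
};

struct state_model {
	int first_insn_idx;
	const struct jmp_rec_model *jmp_history;
};

/* Returns the previous instruction index, or -ENOENT when all
 * instructions of this state have been backtracked. */
static int prev_insn_idx_model(const struct state_model *st, int insn_idx,
			       size_t *history)
{
	size_t cnt = *history;

	if (insn_idx == st->first_insn_idx) {
		/* no history left: nothing more was simulated in this state */
		if (cnt == 0)
			return -ENOENT;
		/* a single record bridging the parent state and our first insn */
		if (cnt == 1 && st->jmp_history[0].idx == insn_idx)
			return -ENOENT;
	}
	/* a jump landed here: follow it back and consume the record */
	if (cnt && st->jmp_history[cnt - 1].idx == insn_idx) {
		(*history)--;
		return st->jmp_history[cnt - 1].prev_idx;
	}
	/* otherwise the previous instruction is simply the preceding one */
	return insn_idx - 1;
}
```

For a state that starts at instruction 5, runs 5, 6, 7 and jumps back to 5, backtracking from 5 yields 7, 6, 5 and only then -ENOENT, rather than stopping at the first step.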
This bug was exposed by verifier_loop1.c's bounded_recursion subtest, once
the next fix in this patch set is applied.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking")
Change-Id: Ib2ea37a2973a2fbd9455c1e749b4b477d74a8a29
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231110002638.4168352-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
Reported-by: Wei Wei <weiwei.danny@bytedance.com>
Closes: https://lore.kernel.org/all/20250605070921.GA3795@bytedance/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Upstream commit 973c7a0d8a38 ("bpf: fix precision backtracking
instruction iteration") slightly changes the logic in the verifier which
results in the verifier log growing. This results in the log being too
small when loading the filterPowerSupplyEvents BPF program in Android,
and therefore causing the program loading to fail. Because this
program is labeled 'critical', a load failure forces a boot loop.
This BPF program exists on the vendor partition, and therefore we must
maintain the GRF/Treble boundary and modify the kernel logic.
The kernel's bpf log logic is refactored in the 6.4 kernel and
acknowledges the shortcomings of the existing approach which causes the
program load to fail. Instead of backporting the significant changes,
this change simply ignores the fact that the log is full.
For more information see commit 121664093803 ("bpf: Switch BPF verifier
log to be a rotating log by default")
Bug: 432207940
Bug: 433641053
Test: verify pixel 6 boots on a 5.10 kernel including commit
973c7a0d8a38
Change-Id: I35c3d2074dd9b39e44bfdbaf66fa56ec917df0a6
Signed-off-by: Neill Kapron <nkapron@google.com>
The SELinux specific credential poisoning only makes sense
if SELinux is managing the credentials. As the intent of this
patch set is to move the blob management out of the modules
and into the infrastructure, the SELinux specific code has
to go. The poisoning could be introduced into the infrastructure
at some later date.
Change-Id: I815715bc05f62f5011f269b7e10c3059697a47a2
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
This step is already done in rebind_subsystems().
Not necessary to do it again.
Change-Id: Ia4b551c06aec63d33e9253ca5a6691993c8a43ae
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
[ Upstream commit 2190df6c91373fdec6db9fc07e427084f232f57e ]
Only cgroup v2 can be attached by bpf programs, so this patch ensures
that cgroup_bpf_inherit and cgroup_bpf_offline are only called for
cgroup v2, and this can fix the memleak mentioned by commit 04f8ef5643bc
("cgroup: Fix memory leak caused by missing cgroup_bpf_offline"), which
has been reverted.
Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Link: https://lore.kernel.org/cgroups/aka2hk5jsel5zomucpwlxsej6iwnfw4qu5jkrmjhyfhesjlfdw@46zxhg5bdnr7/
Change-Id: Icf3748467371377f3e2c7635925daeb66b99ad36
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
743210386c03 ("cgroup: use cgrp->kn->id as the cgroup ID") added WARN
which triggers if cgroup_id(root_cgrp) is not 1. This is fine on
64bit ino archs but on 32bit archs cgroup ID is ((gen << 32) | ino)
and gen starts at 1, so the root id is 0x1_0000_0001 instead of 1
always triggering the WARN.
What we want to make sure is that the ino part is 1. Fix it.
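The arithmetic can be checked with a short sketch. The helper names below are hypothetical, and the ID layout ((gen << 32) | ino) is taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compose an ID the way the text describes for 32bit ino archs. */
static uint64_t make_cgroup_id_32bit(uint32_t gen, uint32_t ino)
{
	return ((uint64_t)gen << 32) | ino;
}

/* The ino part is the low 32 bits of the ID. */
static uint32_t cgroup_id_ino_model(uint64_t id)
{
	return (uint32_t)id;
}

/* Old check: compares the whole ID against 1. */
static bool root_id_ok_old(uint64_t id)
{
	return id == 1;
}

/* Fixed check: only the ino part has to be 1. */
static bool root_id_ok_fixed(uint64_t id)
{
	return cgroup_id_ino_model(id) == 1;
}
```

With gen = 1 and ino = 1 the composed ID is 0x1_0000_0001, which fails the old check but passes the fixed one.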
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 743210386c03 ("cgroup: use cgrp->kn->id as the cgroup ID")
Change-Id: Icae86c7960754c68b10a4b6fe4c8869f0f906fee
Signed-off-by: Tejun Heo <tj@kernel.org>
Run a BPF program before looking up a listening socket on the receive path.
Program selects a listening socket to yield as result of socket lookup by
calling bpf_sk_assign() helper and returning SK_PASS code. Program can
revert its decision by assigning a NULL socket with bpf_sk_assign().
Alternatively, BPF program can also fail the lookup by returning with
SK_DROP, or let the lookup continue as usual with SK_PASS on return, when
no socket has been selected with bpf_sk_assign().
This lets the user match packets with listening sockets freely at the last
possible point on the receive path, where we know that packets are destined
for local delivery after undergoing policing, filtering, and routing.
With BPF code selecting the socket, directing packets destined to an IP
range or to a port range to a single socket becomes possible.
In case multiple programs are attached, they are run in series in the order
in which they were attached. The end result is determined from return codes
of all the programs according to the following rules:
1. If any program returned SK_PASS and selected a valid socket, the socket
is used as result of socket lookup.
2. If more than one program returned SK_PASS and selected a socket,
last selection takes effect.
3. If any program returned SK_DROP, and no program returned SK_PASS and
selected a socket, socket lookup fails with -ECONNREFUSED.
4. If all programs returned SK_PASS and none of them selected a socket,
socket lookup continues to htable-based lookup.
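The four rules above can be sketched as a function that folds per-program results into one outcome. This is a user-space model under stated assumptions: the enum values, struct prog_result_model and combine_sk_lookup_results() are hypothetical, and a program's selection only counts when it also returned the pass code.

```c
#include <errno.h>
#include <stddef.h>

enum sk_action_model { SK_DROP_M = 0, SK_PASS_M = 1 };

struct prog_result_model {
	enum sk_action_model ret;
	const void *selected_sk; /* NULL if the program selected nothing */
};

/* Returns 0 with *out set to the chosen socket, 0 with *out == NULL to
 * continue with the regular lookup, or -ECONNREFUSED if the lookup fails. */
static int combine_sk_lookup_results(const struct prog_result_model *res,
				     size_t n, const void **out)
{
	const void *selected = NULL;
	int any_drop = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (res[i].ret == SK_PASS_M && res[i].selected_sk)
			selected = res[i].selected_sk; /* rule 2: last one wins */
		else if (res[i].ret == SK_DROP_M)
			any_drop = 1;
	}
	*out = selected;
	if (selected)
		return 0; /* rule 1 */
	if (any_drop)
		return -ECONNREFUSED; /* rule 3 */
	return 0; /* rule 4: fall back to the regular lookup */
}
```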
Suggested-by: Marek Majkowski <marek@cloudflare.com>
Change-Id: I4ce282a5e57c7d59a4f119b972d3d8809a4c1ce3
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-5-jakub@cloudflare.com