Commit Graph

90 Commits

Author: Greg Kroah-Hartman
SHA1: 8fd5b33ea5
Message: Merge 5.15.61 into android13-5.15-lts
Changes in 5.15.61
	Makefile: link with -z noexecstack --no-warn-rwx-segments
	x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments
	Revert "pNFS: nfs3_set_ds_client should set NFS_CS_NOPING"
	scsi: Revert "scsi: qla2xxx: Fix disk failure to rediscover"
	pNFS/flexfiles: Report RDMA connection errors to the server
	NFSD: Clean up the show_nf_flags() macro
	nfsd: eliminate the NFSD_FILE_BREAK_* flags
	ALSA: usb-audio: Add quirk for Behringer UMC202HD
	ALSA: bcd2000: Fix a UAF bug on the error path of probing
	ALSA: hda/realtek: Add quirk for Clevo NV45PZ
	ALSA: hda/realtek: Add quirk for HP Spectre x360 15-eb0xxx
	wifi: mac80211_hwsim: fix race condition in pending packet
	wifi: mac80211_hwsim: add back erroneously removed cast
	wifi: mac80211_hwsim: use 32-bit skb cookie
	add barriers to buffer_uptodate and set_buffer_uptodate
	lockd: detect and reject lock arguments that overflow
	HID: hid-input: add Surface Go battery quirk
	HID: wacom: Only report rotation for art pen
	HID: wacom: Don't register pad_input for touch switch
	KVM: nVMX: Snapshot pre-VM-Enter BNDCFGS for !nested_run_pending case
	KVM: nVMX: Snapshot pre-VM-Enter DEBUGCTL for !nested_run_pending case
	KVM: SVM: Don't BUG if userspace injects an interrupt with GIF=0
	KVM: s390: pv: don't present the ecall interrupt twice
	KVM: x86: Split kvm_is_valid_cr4() and export only the non-vendor bits
	KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value
	KVM: nVMX: Account for KVM reserved CR4 bits in consistency checks
	KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4
	KVM: x86: Mark TSS busy during LTR emulation _after_ all fault checks
	KVM: x86: Set error code to segment selector on LLDT/LTR non-canonical #GP
	KVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1
	KVM: x86: Tag kvm_mmu_x86_module_init() with __init
	KVM: x86: do not report preemption if the steal time cache is stale
	KVM: x86: revalidate steal time cache if MSR value changes
	riscv: set default pm_power_off to NULL
	ALSA: hda/conexant: Add quirk for LENOVO 20149 Notebook model
	ALSA: hda/cirrus - support for iMac 12,1 model
	ALSA: hda/realtek: Add quirk for another Asus K42JZ model
	ALSA: hda/realtek: Add a quirk for HP OMEN 15 (8786) mute LED
	tty: vt: initialize unicode screen buffer
	vfs: Check the truncate maximum size in inode_newsize_ok()
	fs: Add missing umask strip in vfs_tmpfile
	thermal: sysfs: Fix cooling_device_stats_setup() error code path
	fbcon: Fix boundary checks for fbcon=vc:n1-n2 parameters
	fbcon: Fix accelerated fbdev scrolling while logo is still shown
	usbnet: Fix linkwatch use-after-free on disconnect
	fix short copy handling in copy_mc_pipe_to_iter()
	crypto: ccp - Use kzalloc for sev ioctl interfaces to prevent kernel memory leak
	ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh()
	parisc: Fix device names in /proc/iomem
	parisc: Drop pa_swapper_pg_lock spinlock
	parisc: Check the return value of ioremap() in lba_driver_probe()
	parisc: io_pgetevents_time64() needs compat syscall in 32-bit compat mode
	riscv:uprobe fix SR_SPIE set/clear handling
	dt-bindings: riscv: fix SiFive l2-cache's cache-sets
	RISC-V: kexec: Fixup use of smp_processor_id() in preemptible context
	RISC-V: Fixup get incorrect user mode PC for kernel mode regs
	RISC-V: Fixup schedule out issue in machine_crash_shutdown()
	RISC-V: Add modules to virtual kernel memory layout dump
	rtc: rx8025: fix 12/24 hour mode detection on RX-8035
	drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
	drm/shmem-helper: Add missing vunmap on error
	drm/vc4: hdmi: Disable audio if dmas property is present but empty
	drm/hyperv-drm: Include framebuffer and EDID headers
	drm/nouveau: fix another off-by-one in nvbios_addr
	drm/nouveau: Don't pm_runtime_put_sync(), only pm_runtime_put_autosuspend()
	drm/nouveau/acpi: Don't print error when we get -EINPROGRESS from pm_runtime
	drm/nouveau/kms: Fix failure path for creating DP connectors
	drm/amdgpu: Check BO's requested pinning domains against its preferred_domains
	drm/amdgpu: fix check in fbdev init
	bpf: Fix KASAN use-after-free Read in compute_effective_progs
	btrfs: reject log replay if there is unsupported RO compat flag
	mtd: rawnand: arasan: Fix clock rate in NV-DDR
	mtd: rawnand: arasan: Update NAND bus clock instead of system clock
	um: Remove straying parenthesis
	um: seed rng using host OS rng
	iio: fix iio_format_avail_range() printing for none IIO_VAL_INT
	iio: light: isl29028: Fix the warning in isl29028_remove()
	scsi: sg: Allow waiting for commands to complete on removed device
	scsi: qla2xxx: Fix incorrect display of max frame size
	scsi: qla2xxx: Zero undefined mailbox IN registers
	soundwire: qcom: Check device status before reading devid
	ksmbd: fix memory leak in smb2_handle_negotiate
	ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT
	ksmbd: fix use-after-free bug in smb2_tree_disconect
	fuse: limit nsec
	fuse: ioctl: translate ENOSYS
	serial: mvebu-uart: uart2 error bits clearing
	md-raid: destroy the bitmap after destroying the thread
	md-raid10: fix KASAN warning
	mbcache: don't reclaim used entries
	mbcache: add functions to delete entry if unused
	media: [PATCH] pci: atomisp_cmd: fix three missing checks on list iterator
	ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr()
	PCI: Add defines for normal and subtractive PCI bridges
	powerpc/fsl-pci: Fix Class Code of PCIe Root Port
	powerpc/ptdump: Fix display of RW pages on FSL_BOOK3E
	powerpc/powernv: Avoid crashing if rng is NULL
	MIPS: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
	coresight: Clear the connection field properly
	usb: typec: ucsi: Acknowledge the GET_ERROR_STATUS command completion
	USB: HCD: Fix URB giveback issue in tasklet function
	ARM: dts: uniphier: Fix USB interrupts for PXs2 SoC
	arm64: dts: uniphier: Fix USB interrupts for PXs3 SoC
	usb: dwc3: gadget: refactor dwc3_repare_one_trb
	usb: dwc3: gadget: fix high speed multiplier setting
	netfilter: nf_tables: do not allow SET_ID to refer to another table
	netfilter: nf_tables: do not allow CHAIN_ID to refer to another table
	netfilter: nf_tables: do not allow RULE_ID to refer to another chain
	netfilter: nf_tables: fix null deref due to zeroed list head
	epoll: autoremove wakers even more aggressively
	x86: Handle idle=nomwait cmdline properly for x86_idle
	arch: make TRACE_IRQFLAGS_NMI_SUPPORT generic
	arm64: Do not forget syscall when starting a new thread.
	arm64: fix oops in concurrently setting insn_emulation sysctls
	arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
	ext2: Add more validity checks for inode counts
	sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg
	genirq: Don't return error on missing optional irq_request_resources()
	irqchip/mips-gic: Only register IPI domain when SMP is enabled
	genirq: GENERIC_IRQ_IPI depends on SMP
	sched/core: Always flush pending blk_plug
	irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
	wait: Fix __wait_event_hrtimeout for RT/DL tasks
	ARM: dts: imx6ul: add missing properties for sram
	ARM: dts: imx6ul: change operating-points to uint32-matrix
	ARM: dts: imx6ul: fix keypad compatible
	ARM: dts: imx6ul: fix csi node compatible
	ARM: dts: imx6ul: fix lcdif node compatible
	ARM: dts: imx6ul: fix qspi node compatible
	ARM: dts: BCM5301X: Add DT for Meraki MR26
	ARM: dts: ux500: Fix Codina accelerometer mounting matrix
	ARM: dts: ux500: Fix Gavini accelerometer mounting matrix
	spi: synquacer: Add missing clk_disable_unprepare()
	ARM: OMAP2+: display: Fix refcount leak bug
	ARM: OMAP2+: pdata-quirks: Fix refcount leak bug
	ACPI: EC: Remove duplicate ThinkPad X1 Carbon 6th entry from DMI quirks
	ACPI: EC: Drop the EC_FLAGS_IGNORE_DSDT_GPE quirk
	ACPI: PM: save NVS memory for Lenovo G40-45
	ACPI: LPSS: Fix missing check in register_device_clock()
	ARM: dts: qcom: sdx55: Fix the IRQ trigger type for UART
	arm64: dts: qcom: ipq8074: fix NAND node name
	arm64: dts: allwinner: a64: orangepi-win: Fix LED node name
	ARM: shmobile: rcar-gen2: Increase refcount for new reference
	firmware: tegra: Fix error check return value of debugfs_create_file()
	hwmon: (dell-smm) Add Dell XPS 13 7390 to fan control whitelist
	hwmon: (sht15) Fix wrong assumptions in device remove callback
	PM: hibernate: defer device probing when resuming from hibernation
	selinux: fix memleak in security_read_state_kernel()
	selinux: Add boundary check in put_entry()
	kasan: test: Silence GCC 12 warnings
	drm/amdgpu: Remove one duplicated ef removal
	powerpc/64s: Disable stack variable initialisation for prom_init
	spi: spi-rspi: Fix PIO fallback on RZ platforms
	ARM: findbit: fix overflowing offset
	meson-mx-socinfo: Fix refcount leak in meson_mx_socinfo_init
	arm64: dts: renesas: beacon: Fix regulator node names
	spi: spi-altera-dfl: Fix an error handling path
	ARM: bcm: Fix refcount leak in bcm_kona_smc_init
	ACPI: processor/idle: Annotate more functions to live in cpuidle section
	ARM: dts: imx7d-colibri-emmc: add cpu1 supply
	soc: renesas: r8a779a0-sysc: Fix A2DP1 and A2CV[2357] PDR values
	scsi: hisi_sas: Use managed PCI functions
	dt-bindings: iio: accel: Add DT binding doc for ADXL355
	soc: amlogic: Fix refcount leak in meson-secure-pwrc.c
	arm64: dts: renesas: Fix thermal-sensors on single-zone sensors
	x86/pmem: Fix platform-device leak in error path
	ARM: dts: ast2500-evb: fix board compatible
	ARM: dts: ast2600-evb: fix board compatible
	ARM: dts: ast2600-evb-a1: fix board compatible
	arm64: dts: mt8192: Fix idle-states nodes naming scheme
	arm64: dts: mt8192: Fix idle-states entry-method
	arm64: select TRACE_IRQFLAGS_NMI_SUPPORT
	arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
	locking/lockdep: Fix lockdep_init_map_*() confusion
	arm64: dts: qcom: sc7180: Remove ipa_fw_mem node on trogdor
	soc: fsl: guts: machine variable might be unset
	block: fix infinite loop for invalid zone append
	ARM: dts: qcom: mdm9615: add missing PMIC GPIO reg
	ARM: OMAP2+: Fix refcount leak in omapdss_init_of
	ARM: OMAP2+: Fix refcount leak in omap3xxx_prm_late_init
	arm64: dts: qcom: sdm630: disable GPU by default
	arm64: dts: qcom: sdm630: fix the qusb2phy ref clock
	arm64: dts: qcom: sdm630: fix gpu's interconnect path
	arm64: dts: qcom: sdm636-sony-xperia-ganges-mermaid: correct sdc2 pinconf
	cpufreq: zynq: Fix refcount leak in zynq_get_revision
	regulator: qcom_smd: Fix pm8916_pldo range
	ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP
	ARM: dts: qcom-msm8974: fix irq type on blsp2_uart1
	soc: qcom: ocmem: Fix refcount leak in of_get_ocmem
	soc: qcom: aoss: Fix refcount leak in qmp_cooling_devices_register
	ARM: dts: qcom: pm8841: add required thermal-sensor-cells
	bus: hisi_lpc: fix missing platform_device_put() in hisi_lpc_acpi_probe()
	stack: Declare {randomize_,}kstack_offset to fix Sparse warnings
	arm64: dts: qcom: msm8916: Fix typo in pronto remoteproc node
	ACPI: APEI: explicit init of HEST and GHES in apci_init()
	drivers/iio: Remove all strcpy() uses
	ACPI: VIOT: Fix ACS setup
	arm64: dts: qcom: sm6125: Move sdc2 pinctrl from seine-pdx201 to sm6125
	arm64: dts: qcom: sm6125: Append -state suffix to pinctrl nodes
	arm64: dts: qcom: sm8250: add missing PCIe PHY clock-cells
	arm64: dts: mt7622: fix BPI-R64 WPS button
	arm64: tegra: Fixup SYSRAM references
	arm64: tegra: Update Tegra234 BPMP channel addresses
	arm64: tegra: Mark BPMP channels as no-memory-wc
	arm64: tegra: Fix SDMMC1 CD on P2888
	erofs: avoid consecutive detection for Highmem memory
	blk-mq: don't create hctx debugfs dir until q->debugfs_dir is created
	spi: Fix simplification of devm_spi_register_controller
	spi: tegra20-slink: fix UAF in tegra_slink_remove()
	hwmon: (drivetemp) Add module alias
	blktrace: Trace remapped requests correctly
	PM: domains: Ensure genpd_debugfs_dir exists before remove
	dm writecache: return void from functions
	dm writecache: count number of blocks read, not number of read bios
	dm writecache: count number of blocks written, not number of write bios
	dm writecache: count number of blocks discarded, not number of discard bios
	regulator: of: Fix refcount leak bug in of_get_regulation_constraints()
	soc: qcom: Make QCOM_RPMPD depend on PM
	arm64: dts: qcom: qcs404: Fix incorrect USB2 PHYs assignment
	irqdomain: Report irq number for NOMAP domains
	drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
	nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()
	x86/extable: Fix ex_handler_msr() print condition
	selftests/seccomp: Fix compile warning when CC=clang
	thermal/tools/tmon: Include pthread and time headers in tmon.h
	dm: return early from dm_pr_call() if DM device is suspended
	pwm: sifive: Simplify offset calculation for PWMCMP registers
	pwm: sifive: Ensure the clk is enabled exactly once per running PWM
	pwm: sifive: Shut down hardware only after pwmchip_remove() completed
	pwm: lpc18xx-sct: Reduce number of devm memory allocations
	pwm: lpc18xx-sct: Simplify driver by not using pwm_[gs]et_chip_data()
	pwm: lpc18xx: Fix period handling
	drm/dp: Export symbol / kerneldoc fixes for DP AUX bus
	drm/bridge: tc358767: Move (e)DP bridge endpoint parsing into dedicated function
	ath10k: do not enforce interrupt trigger type
	drm/st7735r: Fix module autoloading for Okaya RH128128T
	drm/panel: Fix build error when CONFIG_DRM_PANEL_SAMSUNG_ATNA33XC20=y && CONFIG_DRM_DISPLAY_HELPER=m
	wifi: rtlwifi: fix error codes in rtl_debugfs_set_write_h2c()
	ath11k: fix netdev open race
	drm/mipi-dbi: align max_chunk to 2 in spi_transfer
	ath11k: Fix incorrect debug_mask mappings
	drm/radeon: fix potential buffer overflow in ni_set_mc_special_registers()
	drm/mediatek: Modify dsi funcs to atomic operations
	drm/mediatek: Separate poweron/poweroff from enable/disable and define new funcs
	drm/mediatek: Add pull-down MIPI operation in mtk_dsi_poweroff function
	drm/meson: encoder_hdmi: switch to bridge DRM_BRIDGE_ATTACH_NO_CONNECTOR
	drm/meson: encoder_hdmi: Fix refcount leak in meson_encoder_hdmi_init
	drm/bridge: lt9611uxc: Cancel only driver's work
	i2c: npcm: Remove own slave addresses 2:10
	i2c: npcm: Correct slave role behavior
	i2c: mxs: Silence a clang warning
	virtio-gpu: fix a missing check to avoid NULL dereference
	drm/shmem-helper: Unexport drm_gem_shmem_create_with_handle()
	drm/shmem-helper: Export dedicated wrappers for GEM object functions
	drm/shmem-helper: Pass GEM shmem object in public interfaces
	drm/virtio: Fix NULL vs IS_ERR checking in virtio_gpu_object_shmem_init
	drm: adv7511: override i2c address of cec before accessing it
	crypto: sun8i-ss - do not allocate memory when handling hash requests
	crypto: sun8i-ss - fix error codes in allocate_flows()
	net: fix sk_wmem_schedule() and sk_rmem_schedule() errors
	can: netlink: allow configuring of fixed bit rates without need for do_set_bittiming callback
	can: netlink: allow configuring of fixed data bit rates without need for do_set_data_bittiming callback
	i2c: Fix a potential use after free
	crypto: sun8i-ss - fix infinite loop in sun8i_ss_setup_ivs()
	media: atmel: atmel-sama7g5-isc: fix warning in configs without OF
	media: tw686x: Register the irq at the end of probe
	media: imx-jpeg: Correct some definition according specification
	media: imx-jpeg: Leave a blank space before the configuration data
	media: imx-jpeg: Add pm-runtime support for imx-jpeg
	media: imx-jpeg: use NV12M to represent non contiguous NV12
	media: imx-jpeg: Set V4L2_BUF_FLAG_LAST at eos
	media: imx-jpeg: Refactor function mxc_jpeg_parse
	media: imx-jpeg: Identify and handle precision correctly
	media: imx-jpeg: Handle source change in a function
	media: imx-jpeg: Support dynamic resolution change
	media: imx-jpeg: Align upwards buffer size
	media: imx-jpeg: Implement drain using v4l2-mem2mem helpers
	ath9k: fix use-after-free in ath9k_hif_usb_rx_cb
	wifi: iwlegacy: 4965: fix potential off-by-one overflow in il4965_rs_fill_link_cmd()
	drm/radeon: fix incorrrect SPDX-License-Identifiers
	rcutorture: Warn on individual rcu_torture_init() error conditions
	rcutorture: Don't cpuhp_remove_state() if cpuhp_setup_state() failed
	rcutorture: Fix ksoftirqd boosting timing and iteration
	test_bpf: fix incorrect netdev features
	crypto: ccp - During shutdown, check SEV data pointer before using
	drm: bridge: adv7511: Add check for mipi_dsi_driver_register
	media: imx-jpeg: Disable slot interrupt when frame done
	drm/mcde: Fix refcount leak in mcde_dsi_bind
	media: hdpvr: fix error value returns in hdpvr_read
	media: v4l2-mem2mem: prevent pollerr when last_buffer_dequeued is set
	media: driver/nxp/imx-jpeg: fix a unexpected return value problem
	media: tw686x: Fix memory leak in tw686x_video_init
	drm/vc4: plane: Remove subpixel positioning check
	drm/vc4: plane: Fix margin calculations for the right/bottom edges
	drm/bridge: Add a function to abstract away panels
	drm/vc4: dsi: Switch to devm_drm_of_get_bridge
	drm/vc4: Use of_device_get_match_data()
	drm/vc4: dsi: Release workaround buffer and DMA
	drm/vc4: dsi: Correct DSI divider calculations
	drm/vc4: dsi: Correct pixel order for DSI0
	drm/vc4: dsi: Register dsi0 as the correct vc4 encoder type
	drm/vc4: dsi: Fix dsi0 interrupt support
	drm/vc4: dsi: Add correct stop condition to vc4_dsi_encoder_disable iteration
	drm/vc4: hdmi: Fix HPD GPIO detection
	drm/vc4: hdmi: Avoid full hdmi audio fifo writes
	drm/vc4: hdmi: Reset HDMI MISC_CONTROL register
	drm/vc4: hdmi: Fix timings for interlaced modes
	drm/vc4: hdmi: Correct HDMI timing registers for interlaced modes
	crypto: arm64/gcm - Select AEAD for GHASH_ARM64_CE
	selftests/xsk: Destroy BPF resources only when ctx refcount drops to 0
	drm/rockchip: vop: Don't crash for invalid duplicate_state()
	drm/rockchip: Fix an error handling path rockchip_dp_probe()
	drm/mediatek: dpi: Remove output format of YUV
	drm/mediatek: dpi: Only enable dpi after the bridge is enabled
	drm: bridge: sii8620: fix possible off-by-one
	hinic: Use the bitmap API when applicable
	net: hinic: fix bug that ethtool get wrong stats
	net: hinic: avoid kernel hung in hinic_get_stats64()
	drm/msm/mdp5: Fix global state lock backoff
	crypto: hisilicon/sec - don't sleep when in softirq
	crypto: hisilicon - Kunpeng916 crypto driver don't sleep when in softirq
	media: platform: mtk-mdp: Fix mdp_ipi_comm structure alignment
	drm/msm: Avoid dirtyfb stalls on video mode displays (v2)
	drm/msm/dpu: Fix for non-visible planes
	mt76: mt76x02u: fix possible memory leak in __mt76x02u_mcu_send_msg
	mt76: mt7615: do not update pm stats in case of error
	ieee80211: add EHT 1K aggregation definitions
	mt76: mt7921: fix aggregation subframes setting to HE max
	mt76: mt7921: enlarge maximum VHT MPDU length to 11454
	mediatek: mt76: mac80211: Fix missing of_node_put() in mt76_led_init()
	mediatek: mt76: eeprom: fix missing of_node_put() in mt76_find_power_limits_node()
	skmsg: Fix invalid last sg check in sk_msg_recvmsg()
	drm/exynos/exynos7_drm_decon: free resources when clk_set_parent() failed.
	tcp: make retransmitted SKB fit into the send window
	libbpf: Fix the name of a reused map
	selftests: timers: valid-adjtimex: build fix for newer toolchains
	selftests: timers: clocksource-switch: fix passing errors from child
	bpf: Fix subprog names in stack traces.
	fs: check FMODE_LSEEK to control internal pipe splicing
	media: cedrus: h265: Fix flag name
	media: hantro: postproc: Fix motion vector space size
	media: hantro: Simplify postprocessor
	media: hevc: Embedded indexes in RPS
	media: staging: media: hantro: Fix typos
	wifi: wil6210: debugfs: fix info leak in wil_write_file_wmi()
	wifi: p54: Fix an error handling path in p54spi_probe()
	wifi: p54: add missing parentheses in p54_flush()
	selftests/bpf: fix a test for snprintf() overflow
	libbpf: fix an snprintf() overflow check
	can: pch_can: do not report txerr and rxerr during bus-off
	can: rcar_can: do not report txerr and rxerr during bus-off
	can: sja1000: do not report txerr and rxerr during bus-off
	can: hi311x: do not report txerr and rxerr during bus-off
	can: sun4i_can: do not report txerr and rxerr during bus-off
	can: kvaser_usb_hydra: do not report txerr and rxerr during bus-off
	can: kvaser_usb_leaf: do not report txerr and rxerr during bus-off
	can: usb_8dev: do not report txerr and rxerr during bus-off
	can: error: specify the values of data[5..7] of CAN error frames
	can: pch_can: pch_can_error(): initialize errc before using it
	Bluetooth: hci_intel: Add check for platform_driver_register
	i2c: cadence: Support PEC for SMBus block read
	i2c: mux-gpmux: Add of_node_put() when breaking out of loop
	wifi: wil6210: debugfs: fix uninitialized variable use in `wil_write_file_wmi()`
	wifi: iwlwifi: mvm: fix double list_add at iwl_mvm_mac_wake_tx_queue
	wifi: libertas: Fix possible refcount leak in if_usb_probe()
	media: cedrus: hevc: Add check for invalid timestamp
	net/mlx5e: Remove WARN_ON when trying to offload an unsupported TLS cipher/version
	net/mlx5e: Fix the value of MLX5E_MAX_RQ_NUM_MTTS
	net/mlx5: Adjust log_max_qp to be 18 at most
	crypto: hisilicon/hpre - don't use GFP_KERNEL to alloc mem during softirq
	crypto: inside-secure - Add missing MODULE_DEVICE_TABLE for of
	crypto: hisilicon/sec - fix auth key size error
	inet: add READ_ONCE(sk->sk_bound_dev_if) in INET_MATCH()
	ipv6: add READ_ONCE(sk->sk_bound_dev_if) in INET6_MATCH()
	net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set
	netdevsim: fib: Fix reference count leak on route deletion failure
	wifi: rtw88: check the return value of alloc_workqueue()
	iavf: Fix max_rate limiting
	iavf: Fix 'tc qdisc show' listing too many queues
	netdevsim: Avoid allocation warnings triggered from user space
	net: rose: fix netdev reference changes
	net: ionic: fix error check for vlan flags in ionic_set_nic_features()
	dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
	net: usb: make USB_RTL8153_ECM non user configurable
	wireguard: ratelimiter: use hrtimer in selftest
	wireguard: allowedips: don't corrupt stack when detecting overflow
	HID: amd_sfh: Don't show client init failed as error when discovery fails
	clk: renesas: r9a06g032: Fix UART clkgrp bitsel
	mtd: maps: Fix refcount leak in of_flash_probe_versatile
	mtd: maps: Fix refcount leak in ap_flash_init
	mtd: rawnand: meson: Fix a potential double free issue
	of: check previous kernel's ima-kexec-buffer against memory bounds
	scsi: qla2xxx: edif: Reduce Initiator-Initiator thrashing
	scsi: qla2xxx: edif: Fix potential stuck session in sa update
	scsi: qla2xxx: edif: Reduce connection thrash
	scsi: qla2xxx: edif: Fix inconsistent check of db_flags
	scsi: qla2xxx: edif: Synchronize NPIV deletion with authentication application
	scsi: qla2xxx: edif: Add retry for ELS passthrough
	scsi: qla2xxx: edif: Fix n2n discovery issue with secure target
	scsi: qla2xxx: edif: Fix n2n login retry for secure device
	KVM: SVM: Unwind "speculative" RIP advancement if INTn injection "fails"
	KVM: SVM: Stuff next_rip on emulated INT3 injection if NRIPS is supported
	phy: samsung: exynosautov9-ufs: correct TSRV register configurations
	PCI: microchip: Fix refcount leak in mc_pcie_init_irq_domains()
	PCI: tegra194: Fix PM error handling in tegra_pcie_config_ep()
	HID: cp2112: prevent a buffer overflow in cp2112_xfer()
	mtd: sm_ftl: Fix deadlock caused by cancel_work_sync in sm_release
	mtd: partitions: Fix refcount leak in parse_redboot_of
	mtd: parsers: ofpart: Fix refcount leak in bcm4908_partitions_fw_offset
	mtd: st_spi_fsm: Add a clk_disable_unprepare() in .probe()'s error path
	PCI: mediatek-gen3: Fix refcount leak in mtk_pcie_init_irq_domains()
	fpga: altera-pr-ip: fix unsigned comparison with less than zero
	usb: host: Fix refcount leak in ehci_hcd_ppc_of_probe
	usb: ohci-nxp: Fix refcount leak in ohci_hcd_nxp_probe
	usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()
	usb: xhci: tegra: Fix error check
	netfilter: xtables: Bring SPDX identifier back
	scsi: qla2xxx: edif: Send LOGO for unexpected IKE message
	scsi: qla2xxx: edif: Reduce disruption due to multiple app start
	scsi: qla2xxx: edif: Fix no login after app start
	scsi: qla2xxx: edif: Tear down session if keys have been removed
	scsi: qla2xxx: edif: Fix session thrash
	scsi: qla2xxx: edif: Fix no logout on delete for N2N
	iio: accel: bma400: Fix the scale min and max macro values
	platform/chrome: cros_ec: Always expose last resume result
	iio: accel: bma400: Reordering of header files
	clk: mediatek: reset: Fix written reset bit offset
	lib/test_hmm: avoid accessing uninitialized pages
	memremap: remove support for external pgmap refcounts
	mm/memremap: fix memunmap_pages() race with get_dev_pagemap()
	KVM: Don't set Accessed/Dirty bits for ZERO_PAGE
	mwifiex: Ignore BTCOEX events from the 88W8897 firmware
	mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv
	scsi: iscsi: Allow iscsi_if_stop_conn() to be called from kernel
	scsi: iscsi: Add helper to remove a session from the kernel
	scsi: iscsi: Fix session removal on shutdown
	dmaengine: dw-edma: Fix eDMA Rd/Wr-channels and DMA-direction semantics
	mtd: dataflash: Add SPI ID table
	clk: qcom: camcc-sm8250: Fix halt on boot by reducing driver's init level
	misc: rtsx: Fix an error handling path in rtsx_pci_probe()
	driver core: fix potential deadlock in __driver_attach
	clk: qcom: clk-krait: unlock spin after mux completion
	clk: qcom: gcc-msm8939: Add missing SYSTEM_MM_NOC_BFDCD_CLK_SRC
	clk: qcom: gcc-msm8939: Fix bimc_ddr_clk_src rcgr base address
	clk: qcom: gcc-msm8939: Add missing system_mm_noc_bfdcd_clk_src
	clk: qcom: gcc-msm8939: Point MM peripherals to system_mm_noc clock
	usb: host: xhci: use snprintf() in xhci_decode_trb()
	RDMA/rxe: Fix deadlock in rxe_do_local_ops()
	clk: qcom: ipq8074: fix NSS core PLL-s
	clk: qcom: ipq8074: SW workaround for UBI32 PLL lock
	clk: qcom: ipq8074: fix NSS port frequency tables
	clk: qcom: ipq8074: set BRANCH_HALT_DELAY flag for UBI clocks
	clk: qcom: camcc-sdm845: Fix topology around titan_top power domain
	clk: qcom: camcc-sm8250: Fix topology around titan_top power domain
	clk: qcom: clk-rcg2: Fail Duty-Cycle configuration if MND divider is not enabled.
	clk: qcom: clk-rcg2: Make sure to not write d=0 to the NMD register
	mm/mempolicy: fix get_nodes out of bound access
	PCI: dwc: Stop link on host_init errors and de-initialization
	PCI: dwc: Add unroll iATU space support to dw_pcie_disable_atu()
	PCI: dwc: Disable outbound windows only for controllers using iATU
	PCI: dwc: Set INCREASE_REGION_SIZE flag based on limit address
	PCI: dwc: Deallocate EPC memory on dw_pcie_ep_init() errors
	PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists
	soundwire: bus_type: fix remove and shutdown support
	soundwire: revisit driver bind/unbind and callbacks
	KVM: arm64: Don't return from void function
	dmaengine: sf-pdma: Add multithread support for a DMA channel
	PCI: endpoint: Don't stop controller when unbinding endpoint function
	scsi: qla2xxx: Check correct variable in qla24xx_async_gffid()
	intel_th: Fix a resource leak in an error handling path
	intel_th: msu-sink: Potential dereference of null pointer
	intel_th: msu: Fix vmalloced buffers
	binder: fix redefinition of seq_file attributes
	staging: rtl8192u: Fix sleep in atomic context bug in dm_fsync_timer_callback
	mmc: sdhci-of-esdhc: Fix refcount leak in esdhc_signal_voltage_switch
	mmc: mxcmmc: Silence a clang warning
	mmc: renesas_sdhi: Get the reset handle early in the probe
	memstick/ms_block: Fix some incorrect memory allocation
	memstick/ms_block: Fix a memory leak
	mmc: sdhci-of-at91: fix set_uhs_signaling rewriting of MC1R
	of: device: Fix missing of_node_put() in of_dma_set_restricted_buffer
	mmc: block: Add single read for 4k sector cards
	KVM: s390: pv: leak the topmost page table when destroy fails
	PCI/portdrv: Don't disable AER reporting in get_port_device_capability()
	PCI: qcom: Set up rev 2.1.0 PARF_PHY before enabling clocks
	scsi: smartpqi: Fix DMA direction for RAID requests
	xtensa: iss/network: provide release() callback
	xtensa: iss: fix handling error cases in iss_net_configure()
	usb: gadget: udc: amd5536 depends on HAS_DMA
	usb: aspeed-vhub: Fix refcount leak bug in ast_vhub_init_desc()
	usb: dwc3: core: Deprecate GCTL.CORESOFTRESET
	usb: dwc3: core: Do not perform GCTL_CORE_SOFTRESET during bootup
	usb: dwc3: qcom: fix missing optional irq warnings
	eeprom: idt_89hpesx: uninitialized data in idt_dbgfs_csr_write()
	phy: stm32: fix error return in stm32_usbphyc_phy_init
	interconnect: imx: fix max_node_id
	um: random: Don't initialise hwrng struct with zero
	RDMA/irdma: Fix a window for use-after-free
	RDMA/irdma: Fix VLAN connection with wildcard address
	RDMA/irdma: Fix setting of QP context err_rq_idx_valid field
	RDMA/rtrs-srv: Fix modinfo output for stringify
	RDMA/rtrs: Fix warning when use poll mode on client side.
	RDMA/rtrs: Replace duplicate check with is_pollqueue helper
	RDMA/rtrs: Introduce destroy_cq helper
	RDMA/rtrs: Do not allow sessname to contain special symbols / and .
	RDMA/rtrs: Rename rtrs_sess to rtrs_path
	RDMA/rtrs-srv: Rename rtrs_srv_sess to rtrs_srv_path
	RDMA/rtrs-clt: Rename rtrs_clt_sess to rtrs_clt_path
	RDMA/rtrs-clt: Replace list_next_or_null_rr_rcu with an inline function
	RDMA/qedr: Fix potential memory leak in __qedr_alloc_mr()
	RDMA/hns: Fix incorrect clearing of interrupt status register
	RDMA/siw: Fix duplicated reported IW_CM_EVENT_CONNECT_REPLY event
	iio: cros: Register FIFO callback after sensor is registered
	clk: qcom: gcc-msm8939: Fix weird field spacing in ftbl_gcc_camss_cci_clk
	RDMA/hfi1: fix potential memory leak in setup_base_ctxt()
	gpio: gpiolib-of: Fix refcount bugs in of_mm_gpiochip_add_data()
	HID: mcp2221: prevent a buffer overflow in mcp_smbus_write()
	HID: amd_sfh: Add NULL check for hid device
	dmaengine: imx-dma: Cast of_device_get_match_data() with (uintptr_t)
	scripts/gdb: lx-dmesg: read records individually
	scripts/gdb: fix 'lx-dmesg' on 32 bits arch
	RDMA/rxe: Fix mw bind to allow any consumer key portion
	mmc: cavium-octeon: Add of_node_put() when breaking out of loop
	mmc: cavium-thunderx: Add of_node_put() when breaking out of loop
	HID: alps: Declare U1_UNICORN_LEGACY support
	RDMA/rxe: For invalidate compare according to set keys in mr
	PCI: tegra194: Fix Root Port interrupt handling
	PCI: tegra194: Fix link up retry sequence
	HID: amd_sfh: Handle condition of "no sensors"
	USB: serial: fix tty-port initialized comments
	usb: cdns3: change place of 'priv_ep' assignment in cdns3_gadget_ep_dequeue(), cdns3_gadget_ep_enable()
	mtd: spi-nor: fix spi_nor_spimem_setup_op() call in spi_nor_erase_{sector,chip}()
	KVM: nVMX: Set UMIP bit CR4_FIXED1 MSR when emulating UMIP
	platform/olpc: Fix uninitialized data in debugfs write
	RDMA/srpt: Duplicate port name members
	RDMA/srpt: Introduce a reference count in struct srpt_device
	RDMA/srpt: Fix a use-after-free
	android: binder: stop saving a pointer to the VMA
	mm/mmap.c: fix missing call to vm_unacct_memory in mmap_region
	selftests: kvm: set rax before vmcall
	of/fdt: declared return type does not match actual return type
	RDMA/mlx5: Add missing check for return value in get namespace flow
	RDMA/rxe: Add memory barriers to kernel queues
	RDMA/rxe: Remove the is_user members of struct rxe_sq/rxe_rq/rxe_srq
	RDMA/rxe: Fix error unwind in rxe_create_qp()
	block/rnbd-srv: Set keep_id to true after mutex_trylock
	null_blk: fix ida error handling in null_add_dev()
	nvme: use command_id instead of req->tag in trace_nvme_complete_rq()
	nvme: define compat_ioctl again to unbreak 32-bit userspace.
	nvme: disable namespace access for unsupported metadata
	nvme: don't return an error from nvme_configure_metadata
	nvme: catch -ENODEV from nvme_revalidate_zones again
	block/bio: remove duplicate append pages code
	block: ensure iov_iter advances for added pages
	jbd2: fix outstanding credits assert in jbd2_journal_commit_transaction()
	ext4: recover csum seed of tmp_inode after migrating to extents
	jbd2: fix assertion 'jh->b_frozen_data == NULL' failure when journal aborted
	usb: cdns3: Don't use priv_dev uninitialized in cdns3_gadget_ep_enable()
	opp: Fix error check in dev_pm_opp_attach_genpd()
	ASoC: cros_ec_codec: Fix refcount leak in cros_ec_codec_platform_probe
	ASoC: samsung: Fix error handling in aries_audio_probe
	ASoC: imx-audmux: Silence a clang warning
	ASoC: mediatek: mt8173: Fix refcount leak in mt8173_rt5650_rt5676_dev_probe
	ASoC: mt6797-mt6351: Fix refcount leak in mt6797_mt6351_dev_probe
	ASoC: codecs: da7210: add check for i2c_add_driver
	ASoC: mediatek: mt8173-rt5650: Fix refcount leak in mt8173_rt5650_dev_probe
	serial: 8250: Export ICR access helpers for internal use
	serial: 8250: dma: Allow driver operations before starting DMA transfers
	serial: 8250_dw: Store LSR into lsr_saved_flags in dw8250_tx_wait_empty()
	ASoC: codecs: msm8916-wcd-digital: move gains from SX_TLV to S8_TLV
	ASoC: codecs: wcd9335: move gains from SX_TLV to S8_TLV
	rpmsg: char: Add mutex protection for rpmsg_eptdev_open()
	rpmsg: mtk_rpmsg: Fix circular locking dependency
	remoteproc: k3-r5: Fix refcount leak in k3_r5_cluster_of_init
	selftests/livepatch: better synchronize test_klp_callbacks_busy
	profiling: fix shift too large makes kernel panic
	remoteproc: imx_rproc: Fix refcount leak in imx_rproc_addr_init
	ASoC: samsung: h1940_uda1380: include proper GPIO consumer header
	powerpc/perf: Optimize clearing the pending PMI and remove WARN_ON for PMI check in power_pmu_disable
	ASoC: samsung: change gpiod_speaker_power and rx1950_audio from global to static variables
	tty: n_gsm: Delete gsmtty open SABM frame when config requester
	tty: n_gsm: fix user open not possible at responder until initiator open
	tty: n_gsm: fix tty registration before control channel open
	tty: n_gsm: fix wrong queuing behavior in gsm_dlci_data_output()
	tty: n_gsm: fix missing timer to handle stalled links
	tty: n_gsm: fix non flow control frames during mux flow off
	tty: n_gsm: fix packet re-transmission without open control channel
	tty: n_gsm: fix race condition in gsmld_write()
	tty: n_gsm: fix resource allocation order in gsm_activate_mux()
	ASoC: qcom: Fix missing of_node_put() in asoc_qcom_lpass_cpu_platform_probe()
	ASoC: imx-card: Fix DSD/PDM mclk frequency
	remoteproc: qcom: wcnss: Fix handling of IRQs
	vfio/ccw: Do not change FSM state in subchannel event
	serial: 8250_fsl: Don't report FE, PE and OE twice
	tty: n_gsm: fix wrong T1 retry count handling
	tty: n_gsm: fix DM command
	tty: n_gsm: fix missing corner cases in gsmld_poll()
	MIPS: vdso: Utilize __pa() for gic_pfn
	swiotlb: fail map correctly with failed io_tlb_default_mem
	ASoC: mt6359: Fix refcount leak bug
	serial: 8250_bcm7271: Save/restore RTS in suspend/resume
	iommu/exynos: Handle failed IOMMU device registration properly
	9p: fix a bunch of checkpatch warnings
	9p: Drop kref usage
	9p: Add client parameter to p9_req_put()
	net: 9p: fix refcount leak in p9_read_work() error handling
	MIPS: Fixed __debug_virt_addr_valid()
	rpmsg: qcom_smd: Fix refcount leak in qcom_smd_parse_edge
	kfifo: fix kfifo_to_user() return type
	lib/smp_processor_id: fix imbalanced instrumentation_end() call
	proc: fix a dentry lock race between release_task and lookup
	remoteproc: qcom: pas: Check if coredump is enabled
	remoteproc: sysmon: Wait for SSCTL service to come up
	mfd: t7l66xb: Drop platform disable callback
	mfd: max77620: Fix refcount leak in max77620_initialise_fps
	iommu/arm-smmu: qcom_iommu: Add of_node_put() when breaking out of loop
	perf tools: Fix dso_id inode generation comparison
	s390/dump: fix old lowcore virtual vs physical address confusion
	s390/maccess: fix semantics of memcpy_real() and its callers
	s390/crash: fix incorrect number of bytes to copy to user space
	s390/zcore: fix race when reading from hardware system area
	ASoC: fsl_asrc: force cast the asrc_format type
	ASoC: fsl-asoc-card: force cast the asrc_format type
	ASoC: fsl_easrc: use snd_pcm_format_t type for sample_format
	ASoC: imx-card: use snd_pcm_format_t type for asrc_format
	ASoC: qcom: q6dsp: Fix an off-by-one in q6adm_alloc_copp()
	fuse: Remove the control interface for virtio-fs
	ASoC: audio-graph-card: Add of_node_put() in fail path
	watchdog: sp5100_tco: Fix a memory leak of EFCH MMIO resource
	watchdog: armada_37xx_wdt: check the return value of devm_ioremap() in armada_37xx_wdt_probe()
	video: fbdev: amba-clcd: Fix refcount leak bugs
	video: fbdev: sis: fix typos in SiS_GetModeID()
	ASoC: mchp-spdifrx: disable end of block interrupt on failures
	powerpc/32: Call mmu_mark_initmem_nx() regardless of data block mapping.
	powerpc/32: Do not allow selection of e5500 or e6500 CPUs on PPC32
	powerpc/iommu: Fix iommu_table_in_use for a small default DMA window case
	powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias
	tty: serial: fsl_lpuart: correct the count of break characters
	s390/dump: fix os_info virtual vs physical address confusion
	s390/smp: cleanup target CPU callback starting
	s390/smp: cleanup control register update routines
	s390/maccess: rework absolute lowcore accessors
	s390/smp: enforce lowcore protection on CPU restart
	f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time
	powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
	powerpc/xive: Fix refcount leak in xive_get_max_prio
	powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
	perf symbol: Fail to read phdr workaround
	kprobes: Forbid probing on trampoline and BPF code areas
	x86/bus_lock: Don't assume the init value of DEBUGCTLMSR.BUS_LOCK_DETECT to be zero
	powerpc/pci: Fix PHB numbering when using opal-phbid
	genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
	scripts/faddr2line: Fix vmlinux detection on arm64
	sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
	sched, cpuset: Fix dl_cpu_busy() panic due to empty cs->cpus_allowed
	x86/numa: Use cpumask_available instead of hardcoded NULL check
	video: fbdev: arkfb: Fix a divide-by-zero bug in ark_set_pixclock()
	tools/thermal: Fix possible path truncations
	sched: Fix the check of nr_running at queue wakelist
	sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle
	sched/core: Do not requeue task on CPU excluded from cpus_mask
	x86/entry: Build thunk_$(BITS) only if CONFIG_PREEMPTION=y
	f2fs: allow compression for mmap files in compress_mode=user
	f2fs: do not allow to decompress files have FI_COMPRESS_RELEASED
	video: fbdev: vt8623fb: Check the size of screen before memset_io()
	video: fbdev: arkfb: Check the size of screen before memset_io()
	video: fbdev: s3fb: Check the size of screen before memset_io()
	scsi: ufs: core: Correct ufshcd_shutdown() flow
	scsi: zfcp: Fix missing auto port scan and thus missing target ports
	scsi: qla2xxx: Fix imbalance vha->vref_count
	scsi: qla2xxx: Fix discovery issues in FC-AL topology
	scsi: qla2xxx: Turn off multi-queue for 8G adapters
	scsi: qla2xxx: Fix crash due to stale SRB access around I/O timeouts
	scsi: qla2xxx: Fix excessive I/O error messages by default
	scsi: qla2xxx: Fix erroneous mailbox timeout after PCI error injection
	scsi: qla2xxx: Wind down adapter after PCIe error
	scsi: qla2xxx: Fix losing FCP-2 targets on long port disable with I/Os
	scsi: qla2xxx: Fix losing target when it reappears during delete
	scsi: qla2xxx: Fix losing FCP-2 targets during port perturbation tests
	x86/bugs: Enable STIBP for IBPB mitigated RETBleed
	ftrace/x86: Add back ftrace_expected assignment
	x86/kprobes: Update kcb status flag after singlestepping
	x86/olpc: fix 'logical not is only applied to the left hand side'
	SMB3: fix lease break timeout when multiple deferred close handles for the same file.
	posix-cpu-timers: Cleanup CPU timers before freeing them during exec
	Input: gscps2 - check return value of ioremap() in gscps2_probe()
	__follow_mount_rcu(): verify that mount_lock remains unchanged
	spmi: trace: fix stack-out-of-bound access in SPMI tracing functions
	drm/mediatek: Allow commands to be sent during video mode
	drm/mediatek: Keep dsi as LP00 before dcs cmds transfer
	crypto: blake2s - remove shash module
	drm/dp/mst: Read the extended DPCD capabilities during system resume
	drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component
	usbnet: smsc95xx: Don't clear read-only PHY interrupt
	usbnet: smsc95xx: Avoid link settings race on interrupt reception
	usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
	usbnet: smsc95xx: Fix deadlock on runtime resume
	firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails
	scsi: lpfc: Fix EEH support for NVMe I/O
	scsi: lpfc: SLI path split: Refactor lpfc_iocbq
	scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4
	scsi: lpfc: SLI path split: Refactor SCSI paths
	scsi: lpfc: Remove extra atomic_inc on cmd_pending in queuecommand after VMID
	intel_th: pci: Add Meteor Lake-P support
	intel_th: pci: Add Raptor Lake-S PCH support
	intel_th: pci: Add Raptor Lake-S CPU support
	KVM: set_msr_mce: Permit guests to ignore single-bit ECC errors
	KVM: x86: Signal #GP, not -EPERM, on bad WRMSR(MCi_CTL/STATUS)
	iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
	PCI/AER: Iterate over error counters instead of error strings
	PCI: qcom: Power on PHY before IPQ8074 DBI register accesses
	serial: 8250_pci: Refactor the loop in pci_ite887x_init()
	serial: 8250_pci: Replace dev_*() by pci_*() macros
	serial: 8250: Fold EndRun device support into OxSemi Tornado code
	serial: 8250: Add proper clock handling for OxSemi PCIe devices
	tty: 8250: Add support for Brainboxes PX cards.
	dm writecache: set a default MAX_WRITEBACK_JOBS
	kexec, KEYS, s390: Make use of built-in and secondary keyring for signature verification
	dm thin: fix use-after-free crash in dm_sm_register_threshold_callback
	net/9p: Initialize the iounit field during fid creation
	ARM: remove some dead code
	timekeeping: contribute wall clock to rng on time change
	locking/csd_lock: Change csdlock_debug from early_param to __setup
	block: remove the struct blk_queue_ctx forward declaration
	block: don't allow the same type rq_qos add more than once
	btrfs: ensure pages are unlocked on cow_file_range() failure
	btrfs: reset block group chunk force if we have to wait
	btrfs: properly flag filesystem with BTRFS_FEATURE_INCOMPAT_BIG_METADATA
	ACPI: CPPC: Do not prevent CPPC from working in the future
	powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
	KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
	KVM: VMX: Mark all PERF_GLOBAL_(OVF)_CTRL bits reserved if there's no vPMU
	KVM: x86/pmu: Ignore pmu->global_ctrl check if vPMU doesn't support global_ctrl
	KVM: VMX: Add helper to check if the guest PMU has PERF_GLOBAL_CTRL
	KVM: nVMX: Attempt to load PERF_GLOBAL_CTRL on nVMX xfer iff it exists
	dm raid: fix address sanitizer warning in raid_status
	dm raid: fix address sanitizer warning in raid_resume
	tracing: Add '__rel_loc' using trace event macros
	tracing: Avoid -Warray-bounds warning for __rel_loc macro
	ext4: update s_overhead_clusters in the superblock during an on-line resize
	ext4: fix extent status tree race in writeback error recovery path
	ext4: add EXT4_INODE_HAS_XATTR_SPACE macro in xattr.h
	ext4: fix use-after-free in ext4_xattr_set_entry
	ext4: correct max_inline_xattr_value_size computing
	ext4: correct the misjudgment in ext4_iget_extra_inode
	ext4: fix warning in ext4_iomap_begin as race between bmap and write
	ext4: check if directory block is within i_size
	ext4: make sure ext4_append() always allocates new block
	ext4: remove EA inode entry from mbcache on inode eviction
	ext4: use kmemdup() to replace kmalloc + memcpy
	ext4: unindent codeblock in ext4_xattr_block_set()
	ext4: fix race when reusing xattr blocks
	KEYS: asymmetric: enforce SM2 signature use pkey algo
	tpm: eventlog: Fix section mismatch for DEBUG_SECTION_MISMATCH
	xen-blkback: fix persistent grants negotiation
	xen-blkback: Apply 'feature_persistent' parameter when connect
	xen-blkfront: Apply 'feature_persistent' parameter when connect
	powerpc: Fix eh field when calling lwarx on PPC32
	tracing: Use a struct alignof to determine trace event field alignment
	net_sched: cls_route: remove from list when handle is 0
	mac80211: fix a memory leak where sta_info is not freed
	tcp: fix over estimation in sk_forced_mem_schedule()
	crypto: lib/blake2s - reduce stack frame usage in self test
	Revert "mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv"
	Revert "s390/smp: enforce lowcore protection on CPU restart"
	drm/bridge: tc358767: Fix (e)DP bridge endpoint parsing in dedicated function
	net: phy: smsc: Disable Energy Detect Power-Down in interrupt mode
	drm/vc4: change vc4_dma_range_matches from a global to static
	tracing/perf: Avoid -Warray-bounds warning for __rel_loc macro
	drm/msm: Fix dirtyfb refcounting
	drm/meson: Fix refcount leak in meson_encoder_hdmi_init
	io_uring: mem-account pbuf buckets
	Revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP"
	Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression
	drm/bridge: Move devm_drm_of_get_bridge to bridge/panel.c
	scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup()
	scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4()
	scsi: lpfc: Resolve some cleanup issues following SLI path refactoring
	Linux 5.15.61

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0849e49fb265651bf6277e4ead9c440d50ed7536
2022-08-22 14:29:30 +02:00
Al Viro
4228c037f8 fix short copy handling in copy_mc_pipe_to_iter()
commit c3497fd009ef2c59eea60d21c3ac22de3585ed7d upstream.

Unlike other copying operations on ITER_PIPE, copy_mc_to_iter() can
result in a short copy.  In that case we need to trim the unused
buffers, as well as the length of partially filled one - it's not
enough to set ->head, ->iov_offset and ->count to reflect how
much we had copied.  Not hard to fix, fortunately...

I'd put a helper (pipe_discard_from(pipe, head)) into pipe_fs_i.h,
rather than iov_iter.c - it has nothing to do with iov_iter and
having it will allow us to avoid an ugly kludge in fs/splice.c.
We could put it into lib/iov_iter.c for now and move it later,
but I don't see the point going that way...

Cc: stable@kernel.org # 4.19+
Fixes: ca146f6f09 "lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()"
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17 14:22:51 +02:00
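The trimming described in the commit above can be sketched as a small userspace model. This is not the kernel code: struct model_pipe, model_discard_from() and model_trim_after_short_copy() are hypothetical names chosen for illustration, and the ring is reduced to a fixed array of lengths. The idea is that after a short copy the partially filled buffer has its length cut down and every buffer after it is released.

```c
#include <stddef.h>

#define MODEL_SLOTS 8u  /* ring size, a power of two (assumption) */

struct model_buf {
    size_t len;     /* bytes of valid data in this slot */
    int in_use;     /* non-zero while the slot holds a buffer */
};

struct model_pipe {
    struct model_buf bufs[MODEL_SLOTS];
    unsigned int head;  /* next slot to fill; counts up without wrapping */
};

/* Release every buffer from new_head up to the current head. */
static void model_discard_from(struct model_pipe *p, unsigned int new_head)
{
    while (p->head != new_head) {
        p->head--;
        struct model_buf *b = &p->bufs[p->head & (MODEL_SLOTS - 1)];
        b->len = 0;
        b->in_use = 0;
    }
}

/*
 * Buffers [first, p->head) were set up for a copy, but only `copied`
 * bytes actually arrived. Keep the full buffers, shorten the partially
 * filled one, and discard everything after it.
 */
static void model_trim_after_short_copy(struct model_pipe *p,
                                        unsigned int first, size_t copied)
{
    unsigned int i = first;

    while (i != p->head && copied > 0) {
        struct model_buf *b = &p->bufs[i & (MODEL_SLOTS - 1)];

        if (copied < b->len)
            b->len = copied;    /* partially filled buffer */
        copied -= b->len;
        i++;
    }
    model_discard_from(p, i);
}
```

With three 4-byte buffers and a copy that stops after 5 bytes, the model keeps the first buffer, shortens the second to 1 byte and releases the third.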
Greg Kroah-Hartman
6ebdc9fb8c ANDROID: GKI: fix up abi breakage in struct pipe_inode_info
In commit e6acf868ff ("pipe: make poll_usage boolean and annotate its
access") a field was changed from 'unsigned int' to 'bool' in struct
pipe_inode_info when it was determined that the kernel was only
checking true/false for it.  This breaks the internal abi so put the
type back, while still keeping the original change that resolved a race
condition.

Bug: 161946584
Fixes: e6acf868ff ("pipe: make poll_usage boolean and annotate its access")
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: If5ebc17e4a198801e49257abef0f5e2ead94e668
2022-08-04 11:09:53 +02:00
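A rough illustration of why the type change matters to an ABI check even though the code only tests the field for true/false. The two structs below are hypothetical stand-ins, not the real struct pipe_inode_info, and MEMBER_SIZE is a helper macro introduced only for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the layout after the upstream change. */
struct pipe_info_bool {
    unsigned int r_counter;
    unsigned int w_counter;
    bool poll_usage;
};

/* Hypothetical stand-in for the layout the frozen ABI expects. */
struct pipe_info_uint {
    unsigned int r_counter;
    unsigned int w_counter;
    unsigned int poll_usage;
};

/* Size of one struct member, without needing an instance. */
#define MEMBER_SIZE(type, member) sizeof(((type *)0)->member)

/* The flag is only tested for truth, so both variants behave the same. */
static int polled_bool(const struct pipe_info_bool *p)
{
    return p->poll_usage ? 1 : 0;
}

static int polled_uint(const struct pipe_info_uint *p)
{
    return p->poll_usage ? 1 : 0;
}
```

On common ABIs the member shrinks from sizeof(unsigned int) to sizeof(bool), so the member's type description changes even where padding keeps the overall struct size the same. Restoring the original type while keeping the annotated accesses avoids that difference.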
Kuniyuki Iwashima
e6acf868ff pipe: make poll_usage boolean and annotate its access
commit f485922d8fe4e44f6d52a5bb95a603b7c65554bb upstream.

Patch series "Fix data-races around epoll reported by KCSAN."

This series suppresses a false-positive KCSAN message and fixes a real
data-race.


This patch (of 2):

pipe_poll() runs locklessly and assigns 1 to poll_usage.  Once poll_usage
is set to 1, it never changes in other places.  However, concurrent writes
of a value trigger KCSAN, so let's make KCSAN happy.

BUG: KCSAN: data-race in pipe_poll / pipe_poll

write to 0xffff8880042f6678 of 4 bytes by task 174 on cpu 3:
 pipe_poll (fs/pipe.c:656)
 ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853)
 do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234)
 __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241)
 do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)

write to 0xffff8880042f6678 of 4 bytes by task 177 on cpu 1:
 pipe_poll (fs/pipe.c:656)
 ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853)
 do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234)
 __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241)
 do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 177 Comm: epoll_race Not tainted 5.17.0-58927-gf443e374ae13 #6
Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014

Link: https://lkml.kernel.org/r/20220322002653.33865-1-kuniyu@amazon.co.jp
Link: https://lkml.kernel.org/r/20220322002653.33865-2-kuniyu@amazon.co.jp
Fixes: 3b844826b6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Kuniyuki Iwashima <kuni1840@gmail.com>
Cc: "Soheil Hassas Yeganeh" <soheil@google.com>
Cc: "Sridhar Samudrala" <sridhar.samudrala@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-06 08:43:37 +02:00
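A userspace analogue of the pattern in the commit above, assuming C11 relaxed atomics as a substitute for the kernel's own access annotations. struct model_pipe_flags, model_mark_polled() and model_was_polled() are hypothetical names. The flag is set once and never cleared, so concurrent callers all store the same value; marking the access tells the tooling that the race is intended.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct model_pipe_flags {
    atomic_bool poll_usage;     /* set once, never cleared */
};

/* Lockless store: every concurrent caller writes the same value. */
static void model_mark_polled(struct model_pipe_flags *p)
{
    atomic_store_explicit(&p->poll_usage, true, memory_order_relaxed);
}

/* Lockless load paired with the store above. */
static bool model_was_polled(struct model_pipe_flags *p)
{
    return atomic_load_explicit(&p->poll_usage, memory_order_relaxed);
}
```

Relaxed ordering is enough for this sketch because the flag carries no data dependency: readers only care whether polling has ever happened.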
Linus Torvalds
3b844826b6 pipe: avoid unnecessary EPOLLET wakeups under normal loads
I had forgotten just how sensitive hackbench is to extra pipe wakeups,
and commit 3a34b13a88 ("pipe: make pipe writes always wake up
readers") ended up causing a quite noticeable regression on larger
machines.

Now, hackbench isn't necessarily a hugely meaningful benchmark, and it's
not clear that this matters in real life all that much, but as Mel
points out, it's used often enough when comparing kernels and so the
performance regression shows up like a sore thumb.

It's easy enough to fix at least for the common cases where pipes are
used purely for data transfer, and you never have any exciting poll
usage at all.  So set a special 'poll_usage' flag when there is polling
activity, and make the ugly "EPOLLET has crazy legacy expectations"
semantics explicit to only that case.

I would love to limit it to just the broken EPOLLET case, but the pipe
code can't see the difference between epoll and regular select/poll, so
any non-read/write waiting will trigger the extra wakeup behavior.  That
is sufficient for at least the hackbench case.

Apart from making the odd extra wakeup cases more explicitly about
EPOLLET, this also makes the extra wakeup be at the _end_ of the pipe
write, not at the first write chunk.  That is actually much saner
semantics (as much as you can call any of the legacy edge-triggered
expectations for EPOLLET "sane") since it means that you know the wakeup
will happen once the write is done, rather than possibly in the middle
of one.

[ For stable people: I'm putting a "Fixes" tag on this, but I leave it
  up to you to decide whether you actually want to backport it or not.
  It likely has no impact outside of synthetic benchmarks  - Linus ]

Link: https://lore.kernel.org/lkml/20210802024945.GA8372@xsang-OptiPlex-9020/
Fixes: 3a34b13a88 ("pipe: make pipe writes always wake up readers")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Sandeep Patil <sspatil@android.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-18 11:39:46 -07:00
Linus Torvalds
472e5b056f pipe: remove pipe_wait() and fix wakeup race with splice
The pipe splice code still used the old model of waiting for pipe IO by
using a non-specific "pipe_wait()" that waited for any pipe event to
happen, which depended on all pipe IO being entirely serialized by the
pipe lock.  So by checking the state you were waiting for, and then
adding yourself to the wait queue before dropping the lock, you were
guaranteed to see all the wakeups.

Strictly speaking, the actual wakeups were not done under the lock, but
the pipe_wait() model still worked, because since the waiter held the
lock when checking whether it should sleep, it would always see the
current state, and the wakeup was always done after updating the state.

However, commit 0ddad21d3e ("pipe: use exclusive waits when reading or
writing") split the single wait-queue into two, and in the process also
made the "wait for event" code wait for _two_ wait queues, and that then
showed a race with the wakers that were not serialized by the pipe lock.

It's only splice that used that "pipe_wait()" model, so the problem
wasn't obvious, but Josef Bacik reports:

 "I hit a hang with fstest btrfs/187, which does a btrfs send into
  /dev/null. This works by creating a pipe, the write side is given to
  the kernel to write into, and the read side is handed to a thread that
  splices into a file, in this case /dev/null.

  The box that was hung had the write side stuck here [pipe_write] and
  the read side stuck here [splice_from_pipe_next -> pipe_wait].

  [ more details about pipe_wait() scenario ]

  The problem is we're doing the prepare_to_wait, which sets our state
  each time, however we can be woken up either with reads or writes. In
  the case above we race with the WRITER waking us up, and re-set our
  state to INTERRUPTIBLE, and thus never break out of schedule"

Josef had a patch that avoided the issue in pipe_wait() by just making
it set the state only once, but the deeper problem is that pipe_wait()
depends on a level of synchronization by the pipe mutex that it really
shouldn't.  And the whole "wait for any pipe state change" model really
isn't very good to begin with.

So rather than trying to work around things in pipe_wait(), remove that
legacy model of "wait for arbitrary pipe event" entirely, and actually
create functions that wait for the pipe actually being readable or
writable, and can do so without depending on the pipe lock serializing
everything.

Fixes: 0ddad21d3e ("pipe: use exclusive waits when reading or writing")
Link: https://lore.kernel.org/linux-fsdevel/bfa88b5ad6f069b2b679316b9e495a970130416c.1601567868.git.josef@toxicpanda.com/
Reported-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-and-tested-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-01 19:14:36 -07:00
Linus Torvalds
6c32978414 Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull notification queue from David Howells:
 "This adds a general notification queue concept and adds an event
  source for keys/keyrings, such as linking and unlinking keys and
  changing their attributes.

  Thanks to Debarshi Ray, we do have a pull request to use this to fix a
  problem with gnome-online-accounts - as mentioned last time:

     https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

  Without this, g-o-a has to constantly poll a keyring-based kerberos
  cache to find out if kinit has changed anything.

  [ There are other notifications pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

  LSM hooks are included:

   - A set of hooks are provided that allow an LSM to rule on whether or
     not a watch may be set. Each of these hooks takes a different
     "watched object" parameter, so they're not really shareable. The
     LSM should use current's credentials. [Wanted by SELinux & Smack]

   - A hook is provided to allow an LSM to rule on whether or not a
     particular message may be posted to a particular queue. This is
     given the credentials from the event generator (which may be the
     system) and the watch setter. [Wanted by Smack]

  I've provided SELinux and Smack with implementations of some of these
  hooks.

  WHY
  ===

  Key/keyring notifications are desirable because if you have your
  kerberos tickets in a file/directory, your Gnome desktop will monitor
  that using something like fanotify and tell you if your credentials
  cache changes.

  However, we also have the ability to cache your kerberos tickets in
  the session, user or persistent keyring so that it isn't left around
  on disk across a reboot or logout. Keyrings, however, cannot currently
  be monitored asynchronously, so the desktop has to poll for it - not
  so good on a laptop. This facility will allow the desktop to avoid the
  need to poll.

  DESIGN DECISIONS
  ================

   - The notification queue is built on top of a standard pipe. Messages
     are effectively spliced in. The pipe is opened with a special flag:

        pipe2(fds, O_NOTIFICATION_PIPE);

     The special flag has the same value as O_EXCL (which doesn't seem
     like it will ever be applicable in this context)[?]. It is given up
     front to make it a lot easier to prohibit splice&co from accessing
     the pipe.

     [?] Should this be done some other way?  I'd rather not use up a new
         O_* flag if I can avoid it - should I add a pipe3() system call
         instead?

     The pipe is then configured:

        ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
        ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

     Messages are then read out of the pipe using read().

   - It should be possible to allow write() to insert data into the
     notification pipes too, but this is currently disabled as the
     kernel has to be able to insert messages into the pipe *without*
     holding pipe->mutex and the code to make this work needs careful
     auditing.

   - sendfile(), splice() and vmsplice() are disabled on notification
     pipes because of the pipe->mutex issue and also because they
     sometimes want to revert what they just did - but one or more
     notification messages might've been interleaved in the ring.

   - The kernel inserts messages with the wait queue spinlock held. This
     means that pipe_read() and pipe_write() have to take the spinlock
     to update the queue pointers.

   - Records in the buffer are binary, typed and have a length so that
     they can be of varying size.

     This allows multiple heterogeneous sources to share a common
     buffer; there are 16 million types available, of which I've used
     just a few, so there is scope for others to be used. Tags may be
     specified when a watchpoint is created to help distinguish the
     sources.

   - Records are filterable as types have up to 256 subtypes that can be
     individually filtered. Other filtration is also available.

   - Notification pipes don't interfere with each other; each may be
     bound to a different set of watches. Any particular notification
     will be copied to all the queues that are currently watching for it
     - and only those that are watching for it.

   - When recording a notification, the kernel will not sleep, but will
     rather mark a queue as having lost a message if there's
     insufficient space. read() will fabricate a loss notification
     message at an appropriate point later.

   - The notification pipe is created and then watchpoints are attached
     to it, using one of:

        keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
        watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
        watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

     where in each case, fd indicates the queue and the number after is
     a tag between 0 and 255.

   - Watches are removed if either the notification pipe is destroyed or
     the watched object is destroyed. In the latter case, a message will
     be generated indicating the enforced watch removal.

  Things I want to avoid:

   - Introducing features that make the core VFS dependent on the
     network stack or networking namespaces (ie. usage of netlink).

   - Dumping all this stuff into dmesg and having a daemon that sits
     there parsing the output and distributing it as this then puts the
     responsibility for security into userspace and makes handling
     namespaces tricky. Further, dmesg might not exist or might be
     inaccessible inside a container.

   - Letting users see events they shouldn't be able to see.

  TESTING AND MANPAGES
  ====================

   - The keyutils tree has a pipe-watch branch that has keyctl commands
     for making use of notifications. Proposed manual pages can also be
     found on this branch, though a couple of them really need to go to
     the main manpages repository instead.

     If the kernel supports the watching of keys, then running "make
     test" on that branch will cause the testing infrastructure to spawn
     a monitoring process on the side that monitors a notifications pipe
     for all the key/keyring changes induced by the tests and they'll
     all be checked off to make sure they happened.

        https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

   - A test program is provided (samples/watch_queue/watch_test) that
     can be used to monitor for keyrings, mount and superblock events.
     Information on the notifications is simply logged to stdout"

* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  smack: Implement the watch_key and post_notification hooks
  selinux: Implement the watch_key security hook
  keys: Make the KEY_NEED_* perms an enum rather than a mask
  pipe: Add notification lossage handling
  pipe: Allow buffers to be marked read-whole-or-error for notifications
  Add sample notification program
  watch_queue: Add a key/keyring notification facility
  security: Add hooks to rule on setting a watch
  pipe: Add general notification queue support
  pipe: Add O_NOTIFICATION_PIPE
  security: Add a hook for the point of notification insertion
  uapi: General notification queue definitions
2020-06-13 09:56:21 -07:00
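The pull message above describes records that are binary, typed, carry a length, and have about 16 million types with up to 256 subtypes each. A minimal sketch of how such a record stream could be walked follows. The layout is an assumption made for illustration (24-bit type and 8-bit subtype packed into one 32-bit word, followed by a 32-bit length); struct demo_record and the demo_* helpers are hypothetical, and the real definitions live in the kernel's uapi headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed packing: low 24 bits = type, high 8 bits = subtype. */
#define DEMO_TYPE_BITS 24u
#define DEMO_TYPE_MASK ((1u << DEMO_TYPE_BITS) - 1u)

struct demo_record {
    uint32_t type_subtype;  /* packed type and subtype */
    uint32_t length;        /* record length in bytes, header included */
};

static uint32_t demo_pack(uint32_t type, uint32_t subtype)
{
    return (type & DEMO_TYPE_MASK) | ((subtype & 0xffu) << DEMO_TYPE_BITS);
}

static uint32_t demo_type(const struct demo_record *r)
{
    return r->type_subtype & DEMO_TYPE_MASK;
}

static uint32_t demo_subtype(const struct demo_record *r)
{
    return r->type_subtype >> DEMO_TYPE_BITS;
}

/* Count complete records in buf, stopping at a malformed or truncated one. */
static size_t demo_count_records(const unsigned char *buf, size_t size)
{
    size_t off = 0, n = 0;

    while (size - off >= sizeof(struct demo_record)) {
        struct demo_record r;

        memcpy(&r, buf + off, sizeof(r));   /* avoid unaligned access */
        if (r.length < sizeof(r) || r.length > size - off)
            break;
        off += r.length;    /* the length field lets records vary in size */
        n++;
    }
    return n;
}
```

Because each record states its own length, a reader can skip record types it does not understand, which is what lets heterogeneous sources share one buffer.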
Christoph Hellwig
c928f642c2 fs: rename pipe_buf ->steal to ->try_steal
And replace the arcane return value convention with a simple bool
where true means success and false means failure.

[AV: braino fix folded in]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 12:14:10 -04:00
Christoph Hellwig
b8d9e7f241 fs: make the pipe_buf_operations ->confirm operation optional
Just return 0 for success if it is not present.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 12:11:26 -04:00
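The "optional operation" convention from the commit above can be sketched in userspace as follows. The struct and function names (demo_buf, demo_buf_ops, demo_buf_confirm, demo_always_fails) are hypothetical and much simpler than the kernel's pipe_buf_operations; the point is only that a missing ->confirm is treated as success.

```c
#include <stddef.h>

struct demo_buf;

struct demo_buf_ops {
    int (*confirm)(struct demo_buf *buf);   /* optional, may be NULL */
};

struct demo_buf {
    const struct demo_buf_ops *ops;
};

/* A missing ->confirm means there is nothing to check: return 0. */
static int demo_buf_confirm(struct demo_buf *buf)
{
    if (!buf->ops->confirm)
        return 0;
    return buf->ops->confirm(buf);
}

/* Example implementation that always reports an error. */
static int demo_always_fails(struct demo_buf *buf)
{
    (void)buf;
    return -1;
}
```

Callers go through the wrapper rather than the function pointer, so buffer types with no confirmation step no longer need a stub that just returns 0.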
Christoph Hellwig
76887c2567 fs: make the pipe_buf_operations ->steal operation optional
Just return 1 for failure if it is not present.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 12:11:26 -04:00
Christoph Hellwig
f6dd975583 pipe: merge anon_pipe_buf*_ops
All the op vectors are exactly the same, they are just used to encode
packet or nomerge behavior.  There already is a flag for the packet
behavior, so just add a new one to allow for merging.  Inverting it vs
the previous nomerge special casing actually allows for much nicer code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-20 12:11:26 -04:00
David Howells
e7d553d69c pipe: Add notification lossage handling
Add handling for loss of notifications by having read() insert a
loss-notification message after it has read the pipe buffer that was last
in the ring when the loss occurred.

Lossage can come about either by running out of notification descriptors or
by running out of space in the pipe ring.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:40:28 +01:00
David Howells
8cfba76383 pipe: Allow buffers to be marked read-whole-or-error for notifications
Allow a buffer to be marked such that read() must return the entire buffer
in one go or return ENOBUFS.  Multiple buffers can be amalgamated into a
single read, but a short read will occur if the next "whole" buffer won't
fit.

This is useful for watch queue notifications to make sure we don't split a
notification across multiple reads, especially given that we need to
fabricate an overrun record under some circumstances - and that isn't in
the buffers.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:38:18 +01:00
David Howells
c73be61ced pipe: Add general notification queue support
Make it possible to have a general notification queue built on top of a
standard pipe.  Notifications are 'spliced' into the pipe and then read
out.  splice(), vmsplice() and sendfile() are forbidden on pipes used for
notifications as post_one_notification() cannot take pipe->mutex.  This
means that notifications could be posted in between individual pipe
buffers, making iov_iter_revert() difficult to effect.

The way the notification queue is used is:

 (1) An application opens a pipe with a special flag and indicates the
     number of messages it wishes to be able to queue at once (this can
     only be set once):

	pipe2(fds, O_NOTIFICATION_PIPE);
	ioctl(fds[0], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);

 (2) The application then uses poll() and read() as normal to extract data
     from the pipe.  read() will return multiple notifications if the
     buffer is big enough, but it will not split a notification across
     buffers - rather it will return a short read or EMSGSIZE.

     Notification messages include a length in the header so that the
     caller can split them up.

Each message has a header that describes it:

	struct watch_notification {
		__u32	type:24;
		__u32	subtype:8;
		__u32	info;
	};

The type indicates the source (eg. mount tree changes, superblock events,
keyring changes, block layer events) and the subtype indicates the event
type (eg. mount, unmount; EIO, EDQUOT; link, unlink).  The info field
indicates a number of things, including the entry length, an ID assigned to
a watchpoint contributing to this buffer and type-specific flags.

Supplementary data, such as the key ID that generated an event, can be
attached in additional slots.  The maximum message size is 127 bytes.
Messages may not be padded or aligned, so there is no guarantee, for
example, that the notification type will be on a 4-byte boundary.
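
A rough user-space sketch of how a reader might walk the records returned
by one read() on the notification pipe.  The struct mirrors the header
quoted above; the assumption that the entry length occupies the low 7 bits
of info is made for illustration (it matches the 127-byte maximum stated
above), and every sketch_* name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the header quoted above, using stdint types for user space. */
struct sketch_notification {
	uint32_t type:24;
	uint32_t subtype:8;
	uint32_t info;
};

/* Assumption: the entry length lives in the low 7 bits of info. */
#define SKETCH_INFO_LENGTH 0x7fu

/*
 * Count the complete notifications in buf[0..len).  Returns -1 if a record
 * is shorter than its header or runs past the end of the buffer.
 */
static inline int sketch_count_notifications(const unsigned char *buf,
					     size_t len)
{
	size_t off = 0;
	int count = 0;

	while (off < len) {
		struct sketch_notification n;
		size_t rec_len;

		if (len - off < sizeof(n))
			return -1;
		/* Records may be unaligned, so copy the header out first. */
		memcpy(&n, buf + off, sizeof(n));
		rec_len = n.info & SKETCH_INFO_LENGTH;
		if (rec_len < sizeof(n) || rec_len > len - off)
			return -1;
		off += rec_len;
		count++;
	}
	return count;
}
```

Copying the header with memcpy() rather than casting the buffer pointer is
a deliberate choice here, since the records carry no alignment guarantee.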

Signed-off-by: David Howells <dhowells@redhat.com>
2020-05-19 15:08:24 +01:00
Randy Dunlap
0bf999f9c5 linux/pipe_fs_i.h: fix kernel-doc warnings after @wait was split
Fix kernel-doc warnings in struct pipe_inode_info after @wait was
split into @rd_wait and @wr_wait.

  include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'rd_wait' not described in 'pipe_inode_info'
  include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'wr_wait' not described in 'pipe_inode_info'

Fixes: 0ddad21d3e ("pipe: use exclusive waits when reading or writing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-12 11:54:08 -08:00
Linus Torvalds
0ddad21d3e pipe: use exclusive waits when reading or writing
This makes the pipe code use separate wait-queues and exclusive waiting
for readers and writers, avoiding a nasty thundering herd problem when
there are lots of readers waiting for data on a pipe (or, less commonly,
lots of writers waiting for a pipe to have space).

While this isn't a common occurrence in the traditional "use a pipe as a
data transport" case, where you typically only have a single reader and
a single writer process, there is one common special case: using a pipe
as a source of "locking tokens" rather than for data communication.

In particular, the GNU make jobserver code ends up using a pipe as a way
to limit parallelism, where each job consumes a token by reading a byte
from the jobserver pipe, and releases the token by writing a byte back
to the pipe.
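
The token scheme can be sketched in a few lines of user-space C.  This is a
simplified illustration and not GNU make's code: the sketch_* helpers are
hypothetical, the token byte value is arbitrary, and error handling such as
EINTR retries is left out.

```c
#include <unistd.h>

/* Take one token: blocks until a byte is available.  Returns 0 on success. */
static inline int sketch_token_acquire(int read_fd, char *token)
{
	return read(read_fd, token, 1) == 1 ? 0 : -1;
}

/* Give the token back so another waiter can run.  Returns 0 on success. */
static inline int sketch_token_release(int write_fd, char token)
{
	return write(write_fd, &token, 1) == 1 ? 0 : -1;
}

/* Preload the pipe with njobs tokens; this caps concurrent jobs at njobs. */
static inline int sketch_token_pool_init(int fd[2], int njobs)
{
	if (pipe(fd) != 0)
		return -1;
	for (int i = 0; i < njobs; i++)
		if (sketch_token_release(fd[1], '+') != 0)
			return -1;
	return 0;
}
```

Every process blocked in sketch_token_acquire() is a reader waiting on the
same pipe, which is the many-readers case the exclusive wait targets.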

This pattern is fairly traditional on Unix, and works very well, but
will waste a lot of time waking up a lot of processes when only a single
reader needs to be woken up when a writer releases a new token.

A simplified test-case of just this pipe interaction is to create 64
processes, and then pass a single token around between them (this
test-case also intentionally passes another token that gets ignored to
test the "wake up next" logic too, in case anybody wonders about it):

    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int fd[2], counters[2];

        pipe(fd);
        counters[0] = 0;
        counters[1] = -1;
        write(fd[1], counters, sizeof(counters));

        /* 64 processes */
        fork(); fork(); fork(); fork(); fork(); fork();

        do {
                int i;
                read(fd[0], &i, sizeof(i));
                if (i < 0)
                        continue;
                counters[0] = i+1;
                write(fd[1], counters, (1+(i & 1)) *sizeof(int));
        } while (counters[0] < 1000000);
        return 0;
    }

and in a perfect world, passing that token around should only cause one
context switch per transfer, when the writer of a token causes a
directed wakeup of just a single reader.

But with the "writer wakes all readers" model we traditionally had, on
my test box the above case causes more than an order of magnitude more
scheduling: instead of the expected ~1M context switches, "perf stat"
shows

        231,852.37 msec task-clock                #   15.857 CPUs utilized
        11,250,961      context-switches          #    0.049 M/sec
           616,304      cpu-migrations            #    0.003 M/sec
             1,648      page-faults               #    0.007 K/sec
 1,097,903,998,514      cycles                    #    4.735 GHz
   120,781,778,352      instructions              #    0.11  insn per cycle
    27,997,056,043      branches                  #  120.754 M/sec
       283,581,233      branch-misses             #    1.01% of all branches

      14.621273891 seconds time elapsed

       0.018243000 seconds user
       3.611468000 seconds sys

before this commit.

After this commit, I get

          5,229.55 msec task-clock                #    3.072 CPUs utilized
         1,212,233      context-switches          #    0.232 M/sec
           103,951      cpu-migrations            #    0.020 M/sec
             1,328      page-faults               #    0.254 K/sec
    21,307,456,166      cycles                    #    4.074 GHz
    12,947,819,999      instructions              #    0.61  insn per cycle
     2,881,985,678      branches                  #  551.096 M/sec
        64,267,015      branch-misses             #    2.23% of all branches

       1.702148350 seconds time elapsed

       0.004868000 seconds user
       0.110786000 seconds sys

instead. Much better.

[ Note! This kernel improvement seems to be very good at triggering a
  race condition in the make jobserver (in GNU make 4.2.1) for me. It's
  a long known bug that was fixed back in June 2017 by GNU make commit
  b552b0525198 ("[SV 51159] Use a non-blocking read with pselect to
  avoid hangs.").

  But there wasn't a new release of GNU make until 4.3 on Jan 19 2020,
  so a number of distributions may still have the buggy version. Some
  have backported the fix to their 4.2.1 release, though, and even
  without the fix it's quite timing-dependent whether the bug actually
  is hit. ]

Josh Triplett says:
 "I've been hammering on your pipe fix patch (switching to exclusive
  wait queues) for a month or so, on several different systems, and I've
  run into no issues with it. The patch *substantially* improves
  parallel build times on large (~100 CPU) systems, both with parallel
  make and with other things that use make's pipe-based jobserver.

  All current distributions (including stable and long-term stable
  distributions) have versions of GNU make that no longer have the
  jobserver bug"

Tested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08 11:39:19 -08:00
Linus Torvalds
a28c8b9db8 pipe: remove 'waiting_writers' merging logic
This code is ancient, and goes back to when we only had a single page
for the pipe buffers.  The exact history is hidden in the mists of time
(ie "before git", and in fact predates the BK repository too).

At that long-ago point in time, it actually helped to try to merge big
back-and-forth pipe reads and writes, and not limit pipe reads to the
single pipe buffer in length just because that was all we had at a time.

However, since then we've expanded the pipe buffers to multiple pages,
and this logic really doesn't seem to make sense.  And a lot of it is
somewhat questionable (ie "hmm, the user asked for a non-blocking read,
but we see that there's a writer pending, so let's wait anyway to get
the extra data that the writer will have").

But more importantly, it makes the "go to sleep" logic much less
obvious, and considering the wakeup issues we've had, I want to make for
less of those kinds of things.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-07 13:21:01 -08:00
David Howells
6718b6f855 pipe: Allow pipes to have kernel-reserved slots
Split pipe->ring_size into two numbers:

 (1) pipe->ring_size - indicates the hard size of the pipe ring.

 (2) pipe->max_usage - indicates the maximum number of pipe ring slots that
     userspace orchestrated events can fill.

This allows for a pipe that is both writable by the general kernel
notification facility and by userspace, allowing plenty of ring space for
notifications to be added whilst preventing userspace from being able to
pin too much unswappable kernel space.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-11-15 16:22:54 +00:00
David Howells
8cefc107ca pipe: Use head and tail pointers for the ring, not cursor and length
Convert pipes to use head and tail pointers for the buffer ring rather than
pointer and length as the latter requires two atomic ops to update (or a
combined op) whereas the former only requires one.

 (1) The head pointer is the point at which production occurs and points to
     the slot in which the next buffer will be placed.  This is equivalent
     to pipe->curbuf + pipe->nrbufs.

     The head pointer belongs to the write-side.

 (2) The tail pointer is the point at which consumption occurs.  It points
     to the next slot to be consumed.  This is equivalent to pipe->curbuf.

     The tail pointer belongs to the read-side.

 (3) head and tail are allowed to run to UINT_MAX and wrap naturally.  They
     are only masked off when the array is being accessed, e.g.:

	pipe->bufs[head & mask]

     This means that it is not necessary to have a dead slot in the ring as
     head == tail isn't ambiguous.

 (4) The ring is empty if "head == tail".

     A helper, pipe_empty(), is provided for this.

 (5) The occupancy of the ring is "head - tail".

     A helper, pipe_occupancy(), is provided for this.

 (6) The number of free slots in the ring is "pipe->ring_size - occupancy".

     A helper, pipe_space_for_user() is provided to indicate how many slots
     userspace may use.

 (7) The ring is full if "head - tail >= pipe->ring_size".

     A helper, pipe_full(), is provided for this.
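
The index arithmetic in points (3) to (7) can be tried out in user space.
The sketch below uses a hypothetical struct and sketch_* helpers rather
than the kernel's pipe_inode_info, and assumes ring_size is a power of two
so that masking works.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ring fields described above. */
struct sketch_ring {
	unsigned int head;      /* next slot to fill; owned by the writer */
	unsigned int tail;      /* next slot to consume; owned by the reader */
	unsigned int ring_size; /* number of slots; must be a power of two */
};

static inline bool sketch_ring_empty(const struct sketch_ring *r)
{
	return r->head == r->tail;
}

/* Unsigned subtraction stays correct across wraparound at UINT_MAX. */
static inline unsigned int sketch_ring_occupancy(const struct sketch_ring *r)
{
	return r->head - r->tail;
}

static inline bool sketch_ring_full(const struct sketch_ring *r)
{
	return sketch_ring_occupancy(r) >= r->ring_size;
}

/* Mask only at the point of array access. */
static inline unsigned int sketch_ring_slot(const struct sketch_ring *r,
					    unsigned int idx)
{
	return idx & (r->ring_size - 1);
}
```

For instance, with ring_size 4, tail at UINT_MAX - 1 and head at 2, the
occupancy is still 4 and the ring reads as full, even though head has
wrapped past zero.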

Signed-off-by: David Howells <dhowells@redhat.com>
2019-10-31 15:12:34 +00:00
Linus Torvalds
e9e1a2e7b4 Merge tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Three tracing fixes:

   - Use "nosteal" for ring buffer splice pages

   - Memory leak fix in error path of trace_pid_write()

   - Fix preempt_enable_no_resched() (use preempt_enable()) in ring
     buffer code"

* tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  trace: Fix preempt_enable_no_resched() abuse
  tracing: Fix a memory leak by early error exit in trace_pid_write()
  tracing: Fix buffer_ref pipe ops
2019-04-26 11:09:55 -07:00
Jann Horn
b987222654 tracing: Fix buffer_ref pipe ops
This fixes multiple issues in buffer_pipe_buf_ops:

 - The ->steal() handler must not return zero unless the pipe buffer has
   the only reference to the page. But generic_pipe_buf_steal() assumes
   that every reference to the pipe is tracked by the page's refcount,
   which isn't true for these buffers - buffer_pipe_buf_get(), which
   duplicates a buffer, doesn't touch the page's refcount.
   Fix it by using generic_pipe_buf_nosteal(), which refuses every
   attempted theft. It should be easy to actually support ->steal, but the
   only current users of pipe_buf_steal() are the virtio console and FUSE,
   and they also only use it as an optimization. So it's probably not worth
   the effort.
 - The ->get() and ->release() handlers can be invoked concurrently on pipe
   buffers backed by the same struct buffer_ref. Make them safe against
   concurrency by using refcount_t.
 - The pointers stored in ->private were only zeroed out when the last
   reference to the buffer_ref was dropped. As far as I know, this
   shouldn't be necessary anyway, but if we do it, let's always do it.

Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Fixes: 73a757e631 ("ring-buffer: Return reader page back into existing ring buffer")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-26 11:44:39 -04:00
Linus Torvalds
6b3a707736 Merge branch 'page-refs' (page ref overflow)
Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit
2019-04-14 15:09:40 -07:00
Matthew Wilcox
15fab63e1e fs: prevent page refcount overflow in pipe_buf_get
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount.  All
callers converted to handle a failure.
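
A simplified illustration of the "return a bool and let the caller fail"
pattern.  The types and the exact overflow test below are hypothetical and
are not the kernel's check; a real implementation would also need atomic
operations on the count.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical page stand-in with a plain signed reference count. */
struct sketch_page {
	int refcount;
};

/*
 * Take a reference only if doing so cannot overflow.  A count that is
 * already zero or negative is treated as corrupt and also refused.
 */
static inline bool sketch_try_get(struct sketch_page *p)
{
	if (p->refcount <= 0 || p->refcount == INT_MAX)
		return false;
	p->refcount++;
	return true;
}

/* A caller in the style of a buffer-duplicating path: fail, don't wrap. */
static inline int sketch_dup_buffer(struct sketch_page *p)
{
	if (!sketch_try_get(p))
		return -1; /* caller reports an error instead of overflowing */
	return 0;
}
```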

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14 10:00:04 -07:00
Jann Horn
01e7187b41 pipe: stop using ->can_merge
Al Viro pointed out that since there is only one pipe buffer type to which
new data can be appended, it isn't necessary to have a ->can_merge field in
struct pipe_buf_operations, we can just check for a magic type.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-01 02:01:45 -05:00
Jann Horn
a0ce2f0aa6 splice: don't merge into linked buffers
Before this patch, it was possible for two pipes to affect each other after
data had been transferred between them with tee():

============
$ cat tee_test.c
#define _GNU_SOURCE
#include <unistd.h>
#include <fcntl.h>
#include <err.h>
#include <stdio.h>

int main(void) {
  int pipe_a[2];
  if (pipe(pipe_a)) err(1, "pipe");
  int pipe_b[2];
  if (pipe(pipe_b)) err(1, "pipe");
  if (write(pipe_a[1], "abcd", 4) != 4) err(1, "write");
  if (tee(pipe_a[0], pipe_b[1], 2, 0) != 2) err(1, "tee");
  if (write(pipe_b[1], "xx", 2) != 2) err(1, "write");

  char buf[5];
  if (read(pipe_a[0], buf, 4) != 4) err(1, "read");
  buf[4] = 0;
  printf("got back: '%s'\n", buf);
}
$ gcc -o tee_test tee_test.c
$ ./tee_test
got back: 'abxx'
$
============

As suggested by Al Viro, fix it by creating a separate type for
non-mergeable pipe buffers, then changing the types of buffers in
splice_pipe_to_pipe() and link_pipe().

Cc: <stable@vger.kernel.org>
Fixes: 7c77f0b3f9 ("splice: implement pipe to pipe splicing")
Fixes: 70524490ee ("[PATCH] splice: add support for sys_tee()")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-01 02:01:45 -05:00
Eric Biggers
96e99be40e pipe: reject F_SETPIPE_SZ with size over UINT_MAX
A pipe's size is represented as an 'unsigned int'.  As expected, writing a
value greater than UINT_MAX to /proc/sys/fs/pipe-max-size fails with
EINVAL.  However, the F_SETPIPE_SZ fcntl silently truncates such values to
32 bits, rather than failing with EINVAL as expected.  (It *does* fail
with EINVAL for values above (1 << 31) but <= UINT_MAX.)

Fix this by moving the check against UINT_MAX into round_pipe_size() which
is called in both cases.
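
A user-space sketch of the idea of doing the range check inside the
rounding helper, so that every caller gets it.  The function name, the 4 kB
page size and the "return 0 on failure" convention are assumptions for
illustration.  The sketch rejects anything above 1 << 31 because the next
power of two would no longer fit in a 32-bit unsigned int, which also
covers every value over UINT_MAX.

```c
#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 kB pages */

/*
 * Round a requested pipe size up to a power of two, at least one page.
 * Returns 0 when the result cannot be represented, so that callers can
 * map that to EINVAL.
 */
static inline unsigned int sketch_round_pipe_size(unsigned long long size)
{
	unsigned long long rounded = SKETCH_PAGE_SIZE;

	if (size > (1ULL << 31))
		return 0;
	while (rounded < size)
		rounded <<= 1;
	return (unsigned int)rounded;
}
```

Taking the argument in a type wider than the return type is what lets the
helper see an out-of-range request before any truncation happens.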

Link: http://lkml.kernel.org/r/20180111052902.14409-6-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Eric Biggers
319e0a21bb pipe, sysctl: remove pipe_proc_fn()
pipe_proc_fn() is no longer needed, as it only calls through to
proc_dopipe_max_size().  Just put proc_dopipe_max_size() in the ctl_table
entry directly, and remove the unneeded EXPORT_SYMBOL() and the ENOSYS
stub for it.

(The reason the ENOSYS stub isn't needed is that the pipe-max-size
ctl_table entry is located directly in 'kern_table' rather than being
registered separately.  Therefore, the entry is already only defined when
the kernel is built with sysctl support.)

Link: http://lkml.kernel.org/r/20180111052902.14409-3-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Eric Biggers
4c2e4befb3 pipe, sysctl: drop 'min' parameter from pipe-max-size converter
Patch series "pipe: buffer limits fixes and cleanups", v2.

This series simplifies the sysctl handler for pipe-max-size and fixes
another set of bugs related to the pipe buffer limits:

- The root user wasn't allowed to exceed the limits when creating new
  pipes.

- There was an off-by-one error when checking the limits, so a limit of
  N was actually treated as N - 1.

- F_SETPIPE_SZ accepted values over UINT_MAX.

- Reading the pipe buffer limits could be racy.

This patch (of 7):

Before validating the given value against pipe_min_size,
do_proc_dopipe_max_size_conv() calls round_pipe_size(), which rounds the
value up to pipe_min_size.  Therefore, the second check against
pipe_min_size is redundant.  Remove it.

Link: http://lkml.kernel.org/r/20180111052902.14409-2-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Joe Lawrence
7a8d181949 pipe: add proc_dopipe_max_size() to safely assign pipe_max_size
pipe_max_size is assigned directly via procfs sysctl:

  static struct ctl_table fs_table[] = {
          ...
          {
                  .procname       = "pipe-max-size",
                  .data           = &pipe_max_size,
                  .maxlen         = sizeof(int),
                  .mode           = 0644,
                  .proc_handler   = &pipe_proc_fn,
                  .extra1         = &pipe_min_size,
          },
          ...

  int pipe_proc_fn(struct ctl_table *table, int write, void __user *buf,
                   size_t *lenp, loff_t *ppos)
  {
          ...
          ret = proc_dointvec_minmax(table, write, buf, lenp, ppos)
          ...

and then later rounded in-place a few statements later:

          ...
          pipe_max_size = round_pipe_size(pipe_max_size);
          ...

This leaves a window of time between initial assignment and rounding
that may be visible to other threads.  (For example, one thread sets a
non-rounded value to pipe_max_size while another reads its value.)

Similar reads of pipe_max_size are potentially racy:

  pipe.c :: alloc_pipe_info()
  pipe.c :: pipe_set_size()

Add a new proc_dopipe_max_size() that consolidates reading the new value
from the user buffer, verifying bounds, and calling round_pipe_size()
with a single assignment to pipe_max_size.

Link: http://lkml.kernel.org/r/1507658689-11669-4-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:03 -08:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information in it,
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  The results were:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Miklos Szeredi
a949e63992 pipe: fix comment in pipe_buf_operations
Map and unmap ops no longer exist.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-05 18:24:00 -04:00
Miklos Szeredi
ca76f5b6bd pipe: add pipe_buf_steal() helper
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-05 18:23:59 -04:00
Miklos Szeredi
fba597db42 pipe: add pipe_buf_confirm() helper
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-05 18:23:59 -04:00
Miklos Szeredi
a779638cf6 pipe: add pipe_buf_release() helper
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-05 18:23:58 -04:00
Miklos Szeredi
7bf2d1df80 pipe: add pipe_buf_get() helper
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-05 18:23:57 -04:00
Willy Tarreau
759c01142a pipe: limit the per-user amount of pages allocated in pipes
On not-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limits are controlled by two new sysctls: pipe-user-pages-soft and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).
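
The worst-case figure quoted above can be re-derived with a short C sketch. The function name and its parameters are hypothetical and exist only for illustration; they are not part of the patch. The sketch follows the message's own simplification of one pipe per file descriptor.

```c
#include <assert.h>

/*
 * Worst-case pipe memory (in kB) for one user when only the soft limit
 * applies: the first `full_pipes` pipes get the default size, and every
 * pipe created after that is limited to a single page.
 * All names here are illustrative assumptions, not kernel identifiers.
 */
static unsigned long pipe_worst_case_kb(unsigned long procs,
                                        unsigned long fds_per_proc,
                                        unsigned long full_pipes,
                                        unsigned long default_kb,
                                        unsigned long page_kb)
{
    unsigned long total_pipes = procs * fds_per_proc;

    /* The soft limit cannot cover more pipes than the user can open. */
    assert(total_pipes >= full_pipes);

    return full_pipes * default_kb +
           (total_pipes - full_pipes) * page_kb;
}
```

Calling pipe_worst_case_kb(256, 1024, 1024, 64, 4) and dividing by 1024 gives the 1084 MB stated in the message.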

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-19 19:25:21 -05:00
Linus Torvalds
5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencies between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
Al Viro
fbb32750a6 pipe: kill ->map() and ->unmap()
all pipe_buffer_operations have the same instances of those...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01 23:19:19 -04:00
Jiri Kosina
d4263348f7 Merge branch 'master' into for-next 2014-02-20 14:54:28 +01:00
Masanari Iida
e227867f12 treewide: Fix typo in Documentation/DocBook
This patch fixes spelling typos in Documentation/DocBook.
Because the .html and .xml files are generated by make htmldocs,
the typos have to be fixed in the source files.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-19 14:58:17 +01:00
Miklos Szeredi
28a625cbc2 fuse: fix pipe_buf_operations
Having this struct in module memory could Oops if the module is
unloaded while the buffer still persists in a pipe.

Since sock_pipe_buf_ops is essentially the same as fuse_dev_pipe_buf_steal,
merge them into nosteal_pipe_buf_ops (this is the same as
default_pipe_buf_ops except stealing the page from the buffer is not
allowed).

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org
2014-01-22 19:36:57 +01:00
Al Viro
4b8a8f1e4f get rid of the last free_pipe_info() callers
and rename __free_pipe_info() to free_pipe_info()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:02 -04:00
Al Viro
7bee130e22 get rid of alloc_pipe_info() argument
not used anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:01 -04:00
Al Viro
6447a3cf19 get rid of pipe->inode
it's used only as a flag to distinguish normal pipes/FIFOs from the
internal per-task one used by file-to-file splice.  And pipe->files
would work just as well for that purpose...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:01 -04:00
Al Viro
72b0d9aacb pipe: don't use ->i_mutex
now it can be done - put mutex into pipe_inode_info, use it instead
of ->i_mutex

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:00 -04:00
Al Viro
ba5bb14733 pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex
* new field - pipe->files; number of struct file over that pipe (all
  sharing the same inode, of course); protected by inode->i_lock.
* pipe_release() decrements pipe->files, clears inode->i_pipe if the
  counter has reached 0 (all under ->i_lock) and, in that case,
  frees pipe after having done pipe_unlock()
* fifo_open() starts with grabbing ->i_lock, and either bumps pipe->files
  if ->i_pipe was non-NULL or allocates a new pipe (dropping and regaining
  ->i_lock) and rechecks ->i_pipe; if it's still NULL, inserts new pipe
  there, otherwise bumps ->i_pipe->files and frees the one we'd allocated.
  At that point we know that ->i_pipe is non-NULL and won't go away, so
  we can do pipe_lock() on it and proceed as we used to.  If we end up
  failing, decrement pipe->files and if it reaches 0 clear ->i_pipe and
  free the sucker after pipe_unlock().
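
The allocate-outside-the-lock-and-recheck scheme described in the list above can be sketched in userspace C. This is only an analogy under stated assumptions: a pthread mutex stands in for ->i_lock, and struct fake_inode, struct pipe_info, pipe_acquire() and pipe_put() are hypothetical names, not the kernel's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for pipe_inode_info and struct inode. */
struct pipe_info { unsigned int files; };
struct fake_inode {
    pthread_mutex_t lock;      /* plays the role of ->i_lock */
    struct pipe_info *pipe;    /* plays the role of ->i_pipe */
};

/* Take a reference on the inode's pipe, allocating one if needed. */
static struct pipe_info *pipe_acquire(struct fake_inode *inode)
{
    struct pipe_info *pipe, *fresh;

    pthread_mutex_lock(&inode->lock);
    pipe = inode->pipe;
    if (pipe) {
        pipe->files++;                     /* existing pipe: bump count */
        pthread_mutex_unlock(&inode->lock);
        return pipe;
    }
    pthread_mutex_unlock(&inode->lock);    /* never allocate under the lock */

    fresh = calloc(1, sizeof(*fresh));
    if (!fresh)
        return NULL;

    pthread_mutex_lock(&inode->lock);
    if (inode->pipe) {                     /* recheck: someone beat us to it */
        pipe = inode->pipe;
        pipe->files++;
        pthread_mutex_unlock(&inode->lock);
        free(fresh);                       /* drop the copy we allocated */
        return pipe;
    }
    fresh->files = 1;
    inode->pipe = fresh;
    pthread_mutex_unlock(&inode->lock);
    return fresh;
}

/* Drop a reference; free the pipe once the last user is gone. */
static void pipe_put(struct fake_inode *inode)
{
    struct pipe_info *dead = NULL;

    pthread_mutex_lock(&inode->lock);
    assert(inode->pipe && inode->pipe->files > 0);
    if (--inode->pipe->files == 0) {
        dead = inode->pipe;
        inode->pipe = NULL;                /* cleared under the lock */
    }
    pthread_mutex_unlock(&inode->lock);
    free(dead);                            /* freed outside the lock */
}
```

The sketch leaves out pipe_lock()/pipe_unlock() and the error path of fifo_open(); it only illustrates how the counter and the pointer stay consistent when the allocation happens with the lock dropped.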

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:12:59 -04:00
Linus Torvalds
a0e881b7c1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull second vfs pile from Al Viro:
 "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
  deadlock reproduced by xfstests 068), symlink and hardlink restriction
  patches, plus assorted cleanups and fixes.

  Note that another fsfreeze deadlock (emergency thaw one) is *not*
  dealt with - the series by Fernando conflicts a lot with Jan's, breaks
  userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
  for massive vfsmount leak; this is going to be handled next cycle.
  There probably will be another pull request, but that stuff won't be
  in it."

Fix up trivial conflicts due to unrelated changes next to each other in
drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
  delousing target_core_file a bit
  Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
  fs: Remove old freezing mechanism
  ext2: Implement freezing
  btrfs: Convert to new freezing mechanism
  nilfs2: Convert to new freezing mechanism
  ntfs: Convert to new freezing mechanism
  fuse: Convert to new freezing mechanism
  gfs2: Convert to new freezing mechanism
  ocfs2: Convert to new freezing mechanism
  xfs: Convert to new freezing code
  ext4: Convert to new freezing mechanism
  fs: Protect write paths by sb_start_write - sb_end_write
  fs: Skip atime update on frozen filesystem
  fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
  fs: Improve filesystem freezing handling
  switch the protection of percpu_counter list to spinlock
  nfsd: Push mnt_want_write() outside of i_mutex
  btrfs: Push mnt_want_write() outside of i_mutex
  fat: Push mnt_want_write() outside of i_mutex
  ...
2012-08-01 10:26:23 -07:00
Al Viro
e4fad8e5d2 consolidate pipe file creation
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-29 21:24:19 +04:00
Cong Wang
2164d33446 pipe: remove KM_USER0 from comments
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-07-24 15:27:34 +08:00
Linus Torvalds
9883035ae7 pipes: add a "packetized pipe" mode for writing
The actual internal pipe implementation is already really about
individual packets (called "pipe buffers"), and this simply exposes that
as a special packetized mode.

When we are in the packetized mode (marked by O_DIRECT as suggested by
Alan Cox), a write() on a pipe will not merge the new data with previous
writes, so each write will get a pipe buffer of its own.  The pipe
buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
will tell the reader side to break the read at that boundary (and throw
away any partial packet contents that do not fit in the read buffer).

End result: as long as you do writes less than PIPE_BUF in size (so that
the pipe doesn't have to split them up), you can now treat the pipe as a
packet interface, where each read() system call will read one packet at
a time.  You can just use a sufficiently big read buffer (PIPE_BUF is
sufficient, since bigger than that doesn't guarantee atomicity anyway),
and the return value of the read() will naturally give you the size of
the packet.

NOTE! We do not support zero-sized packets, and zero-sized reads and
writes to a pipe continue to be no-ops.  Also note that big packets will
currently be split at write time, but that the size at which that
happens is not really specified (except that it's bigger than PIPE_BUF).
Currently that limit is the system page size, but we might want to
explicitly support bigger packets some day.

The main user for this is going to be the autofs packet interface,
allowing us to stop having to care so deeply about exact packet sizes
(which have had bugs with 32/64-bit compatibility modes).  But user
space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
fail with an EINVAL on kernels that do not support this interface.
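
A minimal userspace sketch of the packet interface described above, assuming Linux with glibc and a kernel that supports O_DIRECT on pipes. The helper name packet_pipe_demo() is hypothetical. It writes two small packets and records what each read() returns.

```c
#define _GNU_SOURCE            /* exposes pipe2() and O_DIRECT on glibc */
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>

/*
 * Write two packets into a packetized pipe and read them back.
 * On success returns 0 and stores the two read() return values in
 * sizes[]; returns -1 if the pipe cannot be created (for example
 * EINVAL on a kernel without packet mode) or a write falls short.
 */
static int packet_pipe_demo(ssize_t sizes[2])
{
    int fd[2];
    char buf[PIPE_BUF];        /* big enough for any atomic packet */
    int ok;

    assert(sizes != NULL);

    if (pipe2(fd, O_DIRECT) == -1)
        return -1;

    /* Each write() below PIPE_BUF in size becomes its own packet. */
    ok = write(fd[1], "hello", 5) == 5 &&
         write(fd[1], "packet", 6) == 6;

    if (ok) {
        /* Each read() stops at the packet boundary. */
        sizes[0] = read(fd[0], buf, sizeof(buf));
        sizes[1] = read(fd[0], buf, sizeof(buf));
    }

    close(fd[0]);
    close(fd[1]);
    return ok ? 0 : -1;
}
```

Going by the description above, the two reads should report 5 and 6 bytes rather than a single merged 11-byte read, which is what an ordinary pipe would normally return.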

Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Ian Kent <raven@themaw.net>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: stable@kernel.org  # needed for systemd/autofs interaction fix
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-29 13:12:42 -07:00