182 Commits

Author SHA1 Message Date
Max Wang
e36db2ad7a Revert "hrtimers: Handle CPU state correctly on hotplug"
This reverts commit 15b453db41d36184cf0ccc21e7df624014ab6a1a.

Fix missing struct hrtimer_cpu_base initialize in CPU hotplug Online process when the device is awakened from a deep state by reverting
hrtimer referenced modifies in android13-5.15-2025-03_r1.

Bug:407861080
Bug:412524055
Change-Id: I8eebcdc59c1ae2a61a5032e07da98326a9484189
Signed-off-by: Max Wang <max.wang@unisoc.com>
(cherry picked from commit b9c7089e27cc61054769a30b1cdcc3976e386be1)
2025-04-29 15:23:45 -07:00
Greg Kroah-Hartman
fc74821cbc Merge 5.10.234 into android12-5.10-lts
Changes in 5.10.234
	ceph: give up on paths longer than PATH_MAX
	jbd2: flush filesystem device before updating tail sequence
	dm array: fix releasing a faulty array block twice in dm_array_cursor_end
	dm array: fix unreleased btree blocks on closing a faulty array cursor
	dm array: fix cursor index when skipping across block boundaries
	exfat: fix the infinite loop in exfat_readdir()
	ASoC: mediatek: disable buffer pre-allocation
	netfilter: nft_dynset: honor stateful expressions in set definition
	ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe()
	net: 802: LLC+SNAP OID:PID lookup on start of skb data
	tcp/dccp: complete lockless accesses to sk->sk_max_ack_backlog
	tcp/dccp: allow a connection when sk_max_ack_backlog is zero
	net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute
	cxgb4: Avoid removal of uninserted tid
	tls: Fix tls_sw_sendmsg error handling
	netfilter: nf_tables: imbalance in flowtable binding
	netfilter: conntrack: clamp maximum hashtable size to INT_MAX
	afs: Fix the maximum cell name length
	dm thin: make get_first_thin use rcu-safe list first function
	dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY
	sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
	sctp: sysctl: auth_enable: avoid using current->nsproxy
	drm/amd/display: Add check for granularity in dml ceil/floor helpers
	riscv: Fix sleeping in invalid context in die()
	ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[]
	ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]
	drm/amd/display: increase MAX_SURFACES to the value supported by hw
	scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity
	md/raid5: fix atomicity violation in raid5_cache_count
	USB: serial: option: add MeiG Smart SRM815
	USB: serial: option: add Neoway N723-EA support
	staging: iio: ad9834: Correct phase range check
	staging: iio: ad9832: Correct phase range check
	usb-storage: Add max sectors quirk for Nokia 208
	USB: serial: cp210x: add Phoenix Contact UPS Device
	usb: dwc3: gadget: fix writing NYET threshold
	usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null
	USB: usblp: return error when setting unsupported protocol
	USB: core: Disable LPM only for non-suspended ports
	usb: fix reference leak in usb_new_device()
	usb: gadget: f_fs: Remove WARN_ON in functionfs_bind
	iio: pressure: zpa2326: fix information leak in triggered buffer
	iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer
	iio: light: vcnl4035: fix information leak in triggered buffer
	iio: imu: kmx61: fix information leak in triggered buffer
	iio: adc: ti-ads8688: fix information leak in triggered buffer
	iio: gyro: fxas21002c: Fix missing data update in trigger handler
	iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep()
	iio: adc: at91: call input_free_device() on allocated iio_dev
	iio: inkern: call iio_device_put() only on mapped devices
	arm64: dts: rockchip: add #power-domain-cells to power domain nodes
	arm64: dts: rockchip: add hevc power domain clock to rk3328
	loop: let set_capacity_revalidate_and_notify update the bdev size
	nvme: let set_capacity_revalidate_and_notify update the bdev size
	sd: update the bdev size in sd_revalidate_disk
	block: remove the update_bdev parameter to set_capacity_revalidate_and_notify
	phy: usb: Add "wake on" functionality for newer Synopsis XHCI controllers
	phy: usb: Toggle the PHY power during init
	ocfs2: correct return value of ocfs2_local_free_info()
	ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv
	drm: bridge: adv7511: Remove redundant null check before clk_disable_unprepare
	drm/mipi-dsi: Create devm device registration
	drm/mipi-dsi: Create devm device attachment
	drm/bridge: adv7533: Switch to devm MIPI-DSI helpers
	drm: bridge: adv7511: unregister cec i2c device after cec adapter
	drm: bridge: adv7511: use dev_err_probe in probe function
	drm: adv7511: Fix use-after-free in adv7533_attach_dsi()
	sctp: sysctl: rto_min/max: avoid using current->nsproxy
	phy: usb: Use slow clock for wake enabled suspend
	phy: usb: Fix clock imbalance for suspend/resume
	net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()
	bpf: Fix bpf_sk_select_reuseport() memory leak
	net: net_namespace: Optimize the code
	net: add exit_batch_rtnl() method
	gtp: use exit_batch_rtnl() method
	gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().
	gtp: Destroy device along with udp socket's netns dismantle.
	nfp: bpf: prevent integer overflow in nfp_bpf_event_output()
	net/mlx5: Add priorities for counters in RDMA namespaces
	net/mlx5: Refactor mlx5_get_flow_namespace
	net/mlx5: Fix RDMA TX steering prio
	drm/v3d: Ensure job pointer is set to NULL after job completion
	i2c: mux: demux-pinctrl: check initial mux selection, too
	i2c: rcar: fix NACK handling when being a target
	mac802154: check local interfaces before deleting sdata list
	hfs: Sanity check the root record
	fs: fix missing declaration of init_files
	kheaders: Ignore silly-rename files
	poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
	nvmet: propagate npwg topology
	x86/asm: Make serialize() always_inline
	net: ethernet: xgbe: re-add aneg to supported features in PHY quirks
	vsock/virtio: cancel close work in the destructor
	vsock: reset socket state when de-assigning the transport
	fs/proc: fix softlockup in __read_vmcore (part 2)
	gpiolib: cdev: Fix use after free in lineinfo_changed_notify
	irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly
	hrtimers: Handle CPU state correctly on hotplug
	Revert "PCI: Use preserve_config in place of pci_flags"
	iio: imu: inv_icm42600: fix spi burst write not supported
	iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on
	iio: adc: rockchip_saradc: fix information leak in triggered buffer
	drm/radeon: check bo_va->bo is non-NULL before using it
	vmalloc: fix accounting with i915
	RDMA/hns: Fix deadlock on SRQ async events.
	blk-cgroup: Fix UAF in blkcg_unpin_online()
	ipv6: avoid possible NULL deref in rt6_uncached_list_flush_dev()
	nfsd: add list_head nf_gc to struct nfsd_file
	fou: remove warn in gue_gro_receive on unsupported protocol
	vsock/virtio: discard packets if the transport changes
	vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]
	x86/xen: fix SLS mitigation in xen_hypercall_iret()
	scsi: sg: Fix slab-use-after-free read in sg_release()
	net: fix data-races around sk->sk_forward_alloc
	ASoC: wm8994: Add depends on MFD core
	ASoC: samsung: Add missing selects for MFD_WM8994
	seccomp: Stub for !CONFIG_SECCOMP
	scsi: iscsi: Fix redundant response for ISCSI_UEVENT_GET_HOST_STATS request
	irqchip/sunxi-nmi: Add missing SKIP_WAKE flag
	ASoC: samsung: Add missing depends on I2C
	gfs2: Truncate address space when flipping GFS2_DIF_JDATA flag
	net: sched: fix ets qdisc OOB Indexing
	m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
	signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
	vfio/platform: check the bounds of read/write syscalls
	Bluetooth: RFCOMM: Fix not validating setsockopt user input
	ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_find()
	wifi: iwlwifi: add a few rate index validity checks
	USB: serial: quatech2: fix null-ptr-deref in qt2_process_read_urb()
	Revert "usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null"
	Input: atkbd - map F23 key to support default copilot shortcut
	Input: xpad - add unofficial Xbox 360 wireless receiver clone
	Input: xpad - add support for wooting two he (arm)
	drm/v3d: Assign job pointer to NULL before signaling the fence
	xhci: use pm_ptr() instead of #ifdef for CONFIG_PM conditionals
	Partial revert of xhci: use pm_ptr() instead #ifdef for CONFIG_PM conditionals
	Linux 5.10.234

Change-Id: I9a9b2d6c6fd99416deb58efc6543cc381d7a8e5f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2025-02-06 11:48:45 +00:00
Koichiro Den
14984139f1 hrtimers: Handle CPU state correctly on hotplug
commit 2f8dea1692eef2b7ba6a256246ed82c365fdc686 upstream.

Consider a scenario where a CPU transitions from CPUHP_ONLINE to halfway
through a CPU hotunplug down to CPUHP_HRTIMERS_PREPARE, and then back to
CPUHP_ONLINE:

Since hrtimers_prepare_cpu() does not run, cpu_base.hres_active remains set
to 1 throughout. However, during a CPU unplug operation, the tick and the
clockevents are shut down at CPUHP_AP_TICK_DYING. On return to the online
state, for instance CFS incorrectly assumes that the hrtick is already
active, and the chance of the clockevent device to transition to oneshot
mode is also lost forever for the CPU, unless it goes back to a lower state
than CPUHP_HRTIMERS_PREPARE once.

This round-trip reveals another issue; cpu_base.online is not set to 1
after the transition, which appears as a WARN_ON_ONCE in enqueue_hrtimer().

Aside of that, the bulk of the per CPU state is not reset either, which
means there are dangling pointers in the worst case.

Address this by adding a corresponding startup() callback, which resets the
stale per CPU state and sets the online flag.

[ tglx: Make the new callback unconditionally available, remove the online
  	modification in the prepare() callback and clear the remaining
  	state in the starting callback instead of the prepare callback ]

Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241220134421.3809834-1-koichiro.den@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01 18:22:29 +01:00
Greg Kroah-Hartman
47e789159e Revert "hrtimer: Report offline hrtimer enqueue"
This reverts commit b1f576be92 which is
commit dad6a09f3148257ac1773cd90934d721d68ab595 upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: I3946038162ecfa5fafc8721ac4aaa8545ed540e2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-04-16 09:50:36 +00:00
Greg Kroah-Hartman
66e91da883 Merge 5.10.210 into android12-5.10-lts
Changes in 5.10.210
	usb: cdns3: Fixes for sparse warnings
	usb: cdns3: fix uvc failure work since sg support enabled
	usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
	usb: cdns3: fix iso transfer error when mult is not zero
	usb: cdns3: Fix uvc fail when DMA cross 4k boundery since sg enabled
	PCI: mediatek: Clear interrupt status before dispatching handler
	units: change from 'L' to 'UL'
	units: add the HZ macros
	serial: sc16is7xx: set safe default SPI clock frequency
	spi: introduce SPI_MODE_X_MASK macro
	serial: sc16is7xx: add check for unsupported SPI modes during probe
	iio: adc: ad7091r: Set alert bit in config register
	iio: adc: ad7091r: Allow users to configure device events
	iio: adc: ad7091r: Enable internal vref if external vref is not supplied
	dmaengine: fix NULL pointer in channel unregistration function
	iio:adc:ad7091r: Move exports into IIO_AD7091R namespace.
	ext4: allow for the last group to be marked as trimmed
	crypto: api - Disallow identical driver names
	PM: hibernate: Enforce ordering during image compression/decompression
	hwrng: core - Fix page fault dead lock on mmap-ed hwrng
	crypto: s390/aes - Fix buffer overread in CTR mode
	rpmsg: virtio: Free driver_override when rpmsg_remove()
	bus: mhi: host: Drop chan lock before queuing buffers
	parisc/firmware: Fix F-extend for PDC addresses
	async: Split async_schedule_node_domain()
	async: Introduce async_schedule_dev_nocall()
	arm64: dts: qcom: sdm845: fix USB wakeup interrupt types
	arm64: dts: qcom: sdm845: fix USB DP/DM HS PHY interrupts
	lsm: new security_file_ioctl_compat() hook
	scripts/get_abi: fix source path leak
	mmc: core: Use mrq.sbc in close-ended ffu
	mmc: mmc_spi: remove custom DMA mapped buffers
	rtc: Adjust failure return code for cmos_set_alarm()
	nouveau/vmm: don't set addr on the fail path to avoid warning
	ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path
	rename(): fix the locking of subdirectories
	block: Remove special-casing of compound pages
	stddef: Introduce DECLARE_FLEX_ARRAY() helper
	smb3: Replace smb2pdu 1-element arrays with flex-arrays
	mm: vmalloc: introduce array allocation functions
	KVM: use __vcalloc for very large allocations
	net/smc: fix illegal rmb_desc access in SMC-D connection dump
	tcp: make sure init the accept_queue's spinlocks once
	bnxt_en: Wait for FLR to complete during probe
	vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING
	llc: make llc_ui_sendmsg() more robust against bonding changes
	llc: Drop support for ETH_P_TR_802_2.
	net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv
	tracing: Ensure visibility when inserting an element into tracing_map
	afs: Hide silly-rename files from userspace
	tcp: Add memory barrier to tcp_push()
	netlink: fix potential sleeping issue in mqueue_flush_file
	ipv6: init the accept_queue's spinlocks in inet6_create
	net/mlx5: DR, Use the right GVMI number for drop action
	net/mlx5e: fix a double-free in arfs_create_groups
	netfilter: nf_tables: restrict anonymous set and map names to 16 bytes
	netfilter: nf_tables: validate NFPROTO_* family
	net: mvpp2: clear BM pool before initialization
	selftests: netdevsim: fix the udp_tunnel_nic test
	fjes: fix memleaks in fjes_hw_setup
	net: fec: fix the unhandled context fault from smmu
	btrfs: ref-verify: free ref cache before clearing mount opt
	btrfs: tree-checker: fix inline ref size in error messages
	btrfs: don't warn if discard range is not aligned to sector
	btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args
	btrfs: don't abort filesystem when attempting to snapshot deleted subvolume
	rbd: don't move requests to the running list on errors
	exec: Fix error handling in begin_new_exec()
	wifi: iwlwifi: fix a memory corruption
	netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
	netfilter: nf_tables: reject QUEUE/DROP verdict parameters
	gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-04
	drm: Don't unref the same fb many times by mistake due to deadlock handling
	drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking
	drm/tidss: Fix atomic_flush check
	drm/bridge: nxp-ptn3460: simplify some error checking
	PM: sleep: Use dev_printk() when possible
	PM: sleep: Avoid calling put_device() under dpm_list_mtx
	PM: core: Remove unnecessary (void *) conversions
	PM: sleep: Fix possible deadlocks in core system-wide PM code
	fs/pipe: move check to pipe_has_watch_queue()
	pipe: wakeup wr_wait after setting max_usage
	ARM: dts: samsung: exynos4210-i9100: Unconditionally enable LDO12
	arm64: dts: qcom: sc7180: Use pdc interrupts for USB instead of GIC interrupts
	arm64: dts: qcom: sc7180: fix USB wakeup interrupt types
	media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run
	mm: use __pfn_to_section() instead of open coding it
	mm/sparsemem: fix race in accessing memory_section->usage
	btrfs: remove err variable from btrfs_delete_subvolume
	btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted
	drm: panel-simple: add missing bus flags for Tianma tm070jvhg[30/33]
	drm/exynos: fix accidental on-stack copy of exynos_drm_plane
	drm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume
	gpio: eic-sprd: Clear interrupt after set the interrupt type
	spi: bcm-qspi: fix SFDP BFPT read by usig mspi read
	mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan
	tick/sched: Preserve number of idle sleeps across CPU hotplug events
	x86/entry/ia32: Ensure s32 is sign extended to s64
	powerpc/mm: Fix null-pointer dereference in pgtable_cache_add
	drivers/perf: pmuv3: don't expose SW_INCR event in sysfs
	powerpc: Fix build error due to is_valid_bugaddr()
	powerpc/mm: Fix build failures due to arch_reserved_kernel_pages()
	x86/boot: Ignore NMIs during very early boot
	powerpc: pmd_move_must_withdraw() is only needed for CONFIG_TRANSPARENT_HUGEPAGE
	powerpc/lib: Validate size for vector operations
	x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
	perf/core: Fix narrow startup race when creating the perf nr_addr_filters sysfs file
	debugobjects: Stop accessing objects after releasing hash bucket lock
	regulator: core: Only increment use_count when enable_count changes
	audit: Send netlink ACK before setting connection in auditd_set
	ACPI: video: Add quirk for the Colorful X15 AT 23 Laptop
	PNP: ACPI: fix fortify warning
	ACPI: extlog: fix NULL pointer dereference check
	PM / devfreq: Synchronize devfreq_monitor_[start/stop]
	ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events
	FS:JFS:UBSAN:array-index-out-of-bounds in dbAdjTree
	UBSAN: array-index-out-of-bounds in dtSplitRoot
	jfs: fix slab-out-of-bounds Read in dtSearch
	jfs: fix array-index-out-of-bounds in dbAdjTree
	jfs: fix uaf in jfs_evict_inode
	pstore/ram: Fix crash when setting number of cpus to an odd number
	crypto: stm32/crc32 - fix parsing list of devices
	afs: fix the usage of read_seqbegin_or_lock() in afs_lookup_volume_rcu()
	afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*()
	rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock()
	jfs: fix array-index-out-of-bounds in diNewExt
	s390/ptrace: handle setting of fpc register correctly
	KVM: s390: fix setting of fpc register
	SUNRPC: Fix a suspicious RCU usage warning
	ecryptfs: Reject casefold directory inodes
	ext4: fix inconsistent between segment fstrim and full fstrim
	ext4: unify the type of flexbg_size to unsigned int
	ext4: remove unnecessary check from alloc_flex_gd()
	ext4: avoid online resizing failures due to oversized flex bg
	wifi: rt2x00: restart beacon queue when hardware reset
	selftests/bpf: satisfy compiler by having explicit return in btf test
	selftests/bpf: Fix pyperf180 compilation failure with clang18
	scsi: lpfc: Fix possible file string name overflow when updating firmware
	PCI: Add no PM reset quirk for NVIDIA Spectrum devices
	bonding: return -ENOMEM instead of BUG in alb_upper_dev_walk
	scsi: arcmsr: Support new PCI device IDs 1883 and 1886
	ARM: dts: imx7d: Fix coresight funnel ports
	ARM: dts: imx7s: Fix lcdif compatible
	ARM: dts: imx7s: Fix nand-controller #size-cells
	wifi: ath9k: Fix potential array-index-out-of-bounds read in ath9k_htc_txstatus()
	bpf: Add map and need_defer parameters to .map_fd_put_ptr()
	scsi: libfc: Don't schedule abort twice
	scsi: libfc: Fix up timeout error in fc_fcp_rec_error()
	bpf: Set uattr->batch.count as zero before batched update or deletion
	ARM: dts: rockchip: fix rk3036 hdmi ports node
	ARM: dts: imx25/27-eukrea: Fix RTC node name
	ARM: dts: imx: Use flash@0,0 pattern
	ARM: dts: imx27: Fix sram node
	ARM: dts: imx1: Fix sram node
	ionic: pass opcode to devcmd_wait
	block/rnbd-srv: Check for unlikely string overflow
	ARM: dts: imx25: Fix the iim compatible string
	ARM: dts: imx25/27: Pass timing0
	ARM: dts: imx27-apf27dev: Fix LED name
	ARM: dts: imx23-sansa: Use preferred i2c-gpios properties
	ARM: dts: imx23/28: Fix the DMA controller node name
	net: dsa: mv88e6xxx: Fix mv88e6352_serdes_get_stats error path
	block: prevent an integer overflow in bvec_try_merge_hw_page
	md: Whenassemble the array, consult the superblock of the freshest device
	arm64: dts: qcom: msm8996: Fix 'in-ports' is a required property
	arm64: dts: qcom: msm8998: Fix 'out-ports' is a required property
	wifi: rtl8xxxu: Add additional USB IDs for RTL8192EU devices
	wifi: rtlwifi: rtl8723{be,ae}: using calculate_bit_shift()
	wifi: cfg80211: free beacon_ies when overridden from hidden BSS
	Bluetooth: qca: Set both WIDEBAND_SPEECH and LE_STATES quirks for QCA2066
	Bluetooth: L2CAP: Fix possible multiple reject send
	i40e: Fix VF disable behavior to block all traffic
	f2fs: fix to check return value of f2fs_reserve_new_block()
	ALSA: hda: Refer to correct stream index at loops
	ASoC: doc: Fix undefined SND_SOC_DAPM_NOPM argument
	fast_dput(): handle underflows gracefully
	RDMA/IPoIB: Fix error code return in ipoib_mcast_join
	drm/amd/display: Fix tiled display misalignment
	f2fs: fix write pointers on zoned device after roll forward
	drm/drm_file: fix use of uninitialized variable
	drm/framebuffer: Fix use of uninitialized variable
	drm/mipi-dsi: Fix detach call without attach
	media: stk1160: Fixed high volume of stk1160_dbg messages
	media: rockchip: rga: fix swizzling for RGB formats
	PCI: add INTEL_HDA_ARL to pci_ids.h
	ALSA: hda: Intel: add HDA_ARL PCI ID support
	ALSA: hda: intel-dspcfg: add filters for ARL-S and ARL
	drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time
	IB/ipoib: Fix mcast list locking
	media: ddbridge: fix an error code problem in ddb_probe
	drm/msm/dpu: Ratelimit framedone timeout msgs
	clk: hi3620: Fix memory leak in hi3620_mmc_clk_init()
	clk: mmp: pxa168: Fix memory leak in pxa168_clk_init()
	watchdog: it87_wdt: Keep WDTCTRL bit 3 unmodified for IT8784/IT8786
	drm/amdgpu: Let KFD sync with VM fences
	drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'
	leds: trigger: panic: Don't register panic notifier if creating the trigger failed
	um: Fix naming clash between UML and scheduler
	um: Don't use vfprintf() for os_info()
	um: net: Fix return type of uml_net_start_xmit()
	i3c: master: cdns: Update maximum prescaler value for i2c clock
	xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import
	mfd: ti_am335x_tscadc: Fix TI SoC dependencies
	PCI: Only override AMD USB controller if required
	PCI: switchtec: Fix stdev_release() crash after surprise hot remove
	usb: hub: Replace hardcoded quirk value with BIT() macro
	tty: allow TIOCSLCKTRMIOS with CAP_CHECKPOINT_RESTORE
	fs/kernfs/dir: obey S_ISGID
	PCI/AER: Decode Requester ID when no error info found
	libsubcmd: Fix memory leak in uniq()
	virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings
	blk-mq: fix IO hang from sbitmap wakeup race
	ceph: fix deadlock or deadcode of misusing dget()
	drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'
	drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'
	perf: Fix the nr_addr_filters fix
	wifi: cfg80211: fix RCU dereference in __cfg80211_bss_update
	drm: using mul_u32_u32() requires linux/math64.h
	scsi: isci: Fix an error code problem in isci_io_request_build()
	scsi: core: Introduce enum scsi_disposition
	scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler
	ip6_tunnel: use dev_sw_netstats_rx_add()
	ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()
	net-zerocopy: Refactor frag-is-remappable test.
	tcp: add sanity checks to rx zerocopy
	ixgbe: Remove non-inclusive language
	ixgbe: Refactor returning internal error codes
	ixgbe: Refactor overtemp event handling
	ixgbe: Fix an error handling path in ixgbe_read_iosf_sb_reg_x550()
	ipv6: Ensure natural alignment of const ipv6 loopback and router addresses
	llc: call sock_orphan() at release time
	netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger
	netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations
	net: ipv4: fix a memleak in ip_setup_cork
	af_unix: fix lockdep positive in sk_diag_dump_icons()
	net: sysfs: Fix /sys/class/net/<iface> path
	HID: apple: Add support for the 2021 Magic Keyboard
	HID: apple: Add 2021 magic keyboard FN key mapping
	bonding: remove print in bond_verify_device_path
	uapi: stddef.h: Fix __DECLARE_FLEX_ARRAY for C++
	PM: sleep: Fix error handling in dpm_prepare()
	dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools
	dmaengine: ti: k3-udma: Report short packet errors
	dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA
	dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA
	phy: renesas: rcar-gen3-usb2: Fix returning wrong error code
	dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV
	phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP
	drm/msm/dp: return correct Colorimetry for DP_TEST_DYNAMIC_RANGE_CEA case
	net: stmmac: xgmac: fix handling of DPP safety error for DMA channels
	selftests: net: avoid just another constant wait
	tunnels: fix out of bounds access when building IPv6 PMTU error
	atm: idt77252: fix a memleak in open_card_ubr0
	hwmon: (aspeed-pwm-tacho) mutex for tach reading
	hwmon: (coretemp) Fix out-of-bounds memory access
	hwmon: (coretemp) Fix bogus core_id to attr name mapping
	inet: read sk->sk_family once in inet_recv_error()
	rxrpc: Fix response to PING RESPONSE ACKs to a dead call
	tipc: Check the bearer type before calling tipc_udp_nl_bearer_add()
	ppp_async: limit MRU to 64K
	netfilter: nft_compat: reject unused compat flag
	netfilter: nft_compat: restrict match/target protocol to u16
	netfilter: nft_ct: reject direction for ct id
	netfilter: nft_set_pipapo: store index in scratch maps
	netfilter: nft_set_pipapo: add helper to release pcpu scratch area
	netfilter: nft_set_pipapo: remove scratch_aligned pointer
	scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
	blk-iocost: Fix an UBSAN shift-out-of-bounds warning
	net/af_iucv: clean up a try_then_request_module()
	USB: serial: qcserial: add new usb-id for Dell Wireless DW5826e
	USB: serial: option: add Fibocom FM101-GL variant
	USB: serial: cp210x: add ID for IMST iM871A-USB
	usb: host: xhci-plat: Add support for XHCI_SG_TRB_CACHE_SIZE_QUIRK
	hrtimer: Report offline hrtimer enqueue
	Input: i8042 - fix strange behavior of touchpad on Clevo NS70PU
	Input: atkbd - skip ATKBD_CMD_SETLEDS when skipping ATKBD_CMD_GETID
	vhost: use kzalloc() instead of kmalloc() followed by memset()
	clocksource: Skip watchdog check for large watchdog intervals
	net: stmmac: xgmac: use #define for string constants
	net: stmmac: xgmac: fix a typo of register name in DPP safety handling
	netfilter: nft_set_rbtree: skip end interval element from gc
	btrfs: forbid creating subvol qgroups
	btrfs: do not ASSERT() if the newly created subvolume already got read
	btrfs: forbid deleting live subvol qgroup
	btrfs: send: return EOPNOTSUPP on unknown flags
	of: unittest: Fix compile in the non-dynamic case
	net: openvswitch: limit the number of recursions from action sets
	spi: ppc4xx: Drop write-only variable
	ASoC: rt5645: Fix deadlock in rt5645_jack_detect_work()
	net: sysfs: Fix /sys/class/net/<iface> path for statistics
	MIPS: Add 'memory' clobber to csum_ipv6_magic() inline assembler
	i40e: Fix waiting for queues of all VSIs to be disabled
	tracing/trigger: Fix to return error if failed to alloc snapshot
	mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again
	ALSA: hda/realtek: Fix the external mic not being recognised for Acer Swift 1 SF114-32
	ALSA: hda/realtek: Enable Mute LED on HP Laptop 14-fq0xxx
	HID: wacom: generic: Avoid reporting a serial of '0' to userspace
	HID: wacom: Do not register input devices until after hid_hw_start
	usb: ucsi_acpi: Fix command completion handling
	USB: hub: check for alternate port before enabling A_ALT_HNP_SUPPORT
	usb: f_mass_storage: forbid async queue when shutdown happen
	media: ir_toy: fix a memleak in irtoy_tx
	powerpc/kasan: Fix addr error caused by page alignment
	i2c: i801: Remove i801_set_block_buffer_mode
	i2c: i801: Fix block process call transactions
	modpost: trim leading spaces when processing source files list
	scsi: Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock"
	lsm: fix the logic in security_inode_getsecctx()
	firewire: core: correct documentation of fw_csr_string() kernel API
	kbuild: Fix changing ELF file type for output of gen_btf for big endian
	nfc: nci: free rx_data_reassembly skb on NCI device cleanup
	net: hsr: remove WARN_ONCE() in send_hsr_supervision_frame()
	xen-netback: properly sync TX responses
	ALSA: hda/realtek: Enable headset mic on Vaio VJFE-ADL
	binder: signal epoll threads of self-work
	misc: fastrpc: Mark all sessions as invalid in cb_remove
	ext4: fix double-free of blocks due to wrong extents moved_len
	tracing: Fix wasted memory in saved_cmdlines logic
	staging: iio: ad5933: fix type mismatch regression
	iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC
	iio: accel: bma400: Fix a compilation problem
	media: rc: bpf attach/detach requires write permission
	hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove
	ring-buffer: Clean ring_buffer_poll_wait() error return
	serial: max310x: set default value when reading clock ready bit
	serial: max310x: improve crystal stable clock detection
	x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6
	x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
	mmc: slot-gpio: Allow non-sleeping GPIO ro
	ALSA: hda/conexant: Add quirk for SWS JS201D
	nilfs2: fix data corruption in dsync block recovery for small block sizes
	nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
	crypto: ccp - Fix null pointer dereference in __sev_platform_shutdown_locked
	nfp: use correct macro for LengthSelect in BAR config
	nfp: flower: prevent re-adding mac index for bonded port
	wifi: mac80211: reload info pointer in ieee80211_tx_dequeue()
	irqchip/irq-brcmstb-l2: Add write memory barrier before exit
	irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
	s390/qeth: Fix potential loss of L3-IP@ in case of network issues
	ceph: prevent use-after-free in encode_cap_msg()
	of: property: fix typo in io-channels
	can: j1939: Fix UAF in j1939_sk_match_filter during setsockopt(SO_J1939_FILTER)
	pmdomain: core: Move the unused cleanup to a _sync initcall
	tracing: Inform kmemleak of saved_cmdlines allocation
	Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"
	bus: moxtet: Add spi device table
	PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support
	mips: Fix max_mapnr being uninitialized on early stages
	crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init
	serial: Add rs485_supported to uart_port
	serial: 8250_exar: Fill in rs485_supported
	serial: 8250_exar: Set missing rs485_supported flag
	scripts/decode_stacktrace.sh: silence stderr messages from addr2line/nm
	scripts/decode_stacktrace.sh: support old bash version
	scripts: decode_stacktrace: demangle Rust symbols
	scripts/decode_stacktrace.sh: optionally use LLVM utilities
	netfilter: ipset: fix performance regression in swap operation
	netfilter: ipset: Missing gc cancellations fixed
	hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range()
	Revert "arm64: Stash shadow stack pointer in the task struct on interrupt"
	net: prevent mss overflow in skb_segment()
	sched/membarrier: reduce the ability to hammer on sys_membarrier
	nilfs2: fix potential bug in end_buffer_async_write
	nilfs2: replace WARN_ONs for invalid DAT metadata block requests
	dm: limit the number of targets and parameter size area
	PM: runtime: add devm_pm_runtime_enable helper
	PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()
	drm/msm/dsi: Enable runtime PM
	netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()
	net: bcmgenet: Fix EEE implementation
	PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()
	Linux 5.10.210

Change-Id: I5e7327f58dd6abd26ac2b1e328a81c1010d1147c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-04-10 07:10:03 +00:00
Frederic Weisbecker
b1f576be92 hrtimer: Report offline hrtimer enqueue
commit dad6a09f3148257ac1773cd90934d721d68ab595 upstream.

The hrtimers migration on CPU-down hotplug process has been moved
earlier, before the CPU actually goes to die. This leaves a small window
of opportunity to queue an hrtimer in a blind spot, leaving it ignored.

For example a practical case has been reported with RCU waking up a
SCHED_FIFO task right before the CPUHP_AP_IDLE_DEAD stage, queuing that
way a sched/rt timer to the local offline CPU.

Make sure such situations never go unnoticed and warn when that happens.

Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240129235646.3171983-4-boqun.feng@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-23 08:42:22 +01:00
Greg Kroah-Hartman
e710feda7e Revert "hrtimers: Push pending hrtimers away from outgoing CPU earlier"
This reverts commit 7f4c89400d which is
commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 upstream.

It breaks the Android kernel abi and can be brought back in the future
in an abi-safe way if it is really needed.

Bug: 161946584
Change-Id: Ifdc5474de6e7488eb24959df1697b61b2e3e7880
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-12-18 19:17:40 +00:00
Greg Kroah-Hartman
001d2105f6 Merge 5.10.204 into android12-5.10-lts
Changes in 5.10.204
	hrtimers: Push pending hrtimers away from outgoing CPU earlier
	i2c: designware: Fix corrupted memory seen in the ISR
	netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
	tg3: Move the [rt]x_dropped counters to tg3_napi
	tg3: Increment tx_dropped in tg3_tso_bug()
	kconfig: fix memory leak from range properties
	drm/amdgpu: correct chunk_ptr to a pointer to chunk.
	platform/x86: asus-wmi: Add support for SW_TABLET_MODE on UX360
	platform/x86: asus-nb-wmi: Allow configuring SW_TABLET_MODE method with a module option
	platform/x86: asus-nb-wmi: Add tablet_mode_sw=lid-flip quirk for the TP200s
	asus-wmi: Add dgpu disable method
	platform/x86: asus-wmi: Adjust tablet/lidflip handling to use enum
	platform/x86: asus-wmi: Add support for ROG X13 tablet mode
	platform/x86: asus-wmi: Simplify tablet-mode-switch probing
	platform/x86: asus-wmi: Simplify tablet-mode-switch handling
	platform/x86: asus-wmi: Move i8042 filter install to shared asus-wmi code
	of: base: Fix some formatting issues and provide missing descriptions
	of: Fix kerneldoc output formatting
	of: Add missing 'Return' section in kerneldoc comments
	of: dynamic: Fix of_reconfig_get_state_change() return value documentation
	ipv6: fix potential NULL deref in fib6_add()
	octeontx2-pf: Add missing mutex lock in otx2_get_pauseparam
	hv_netvsc: rndis_filter needs to select NLS
	mlxbf-bootctl: correctly identify secure boot with development keys
	net: arcnet: com20020 fix error handling
	arcnet: restoring support for multiple Sohard Arcnet cards
	i40e: Fix unexpected MFS warning message
	net: bnxt: fix a potential use-after-free in bnxt_init_tc
	ionic: fix snprintf format length warning
	ionic: Fix dim work handling in split interrupt mode
	ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()
	net: hns: fix fake link up on xge port
	netfilter: xt_owner: Fix for unsafe access of sk->sk_socket
	tcp: do not accept ACK of bytes we never sent
	bpf: sockmap, updating the sg structure should also update curr
	tee: optee: Fix supplicant based device enumeration
	arm64: dts: rockchip: Expand reg size of vdec node for RK3399
	RDMA/rtrs-clt: Remove the warnings for req in_use check
	RDMA/bnxt_re: Correct module description string
	hwmon: (acpi_power_meter) Fix 4.29 MW bug
	ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate
	tracing: Fix a warning when allocating buffered events fails
	scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
	ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init
	ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt
	riscv: fix misaligned access handling of C.SWSP and C.SDSP
	ALSA: pcm: fix out-of-bounds in snd_pcm_state_names
	ALSA: hda/realtek: Enable headset on Lenovo M90 Gen5
	nilfs2: fix missing error check for sb_set_blocksize call
	nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
	checkstack: fix printed address
	tracing: Always update snapshot buffer size
	tracing: Disable snapshot buffer when stopping instance tracers
	tracing: Fix incomplete locking when disabling buffered events
	tracing: Fix a possible race when disabling buffered events
	packet: Move reference count in packet_sock to atomic_long_t
	arm64: dts: mediatek: mt7622: fix memory node warning check
	arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names
	arm64: dts: mediatek: mt8183: Fix unit address for scp reserved memory
	misc: mei: client.c: return negative error code in mei_cl_write
	misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write
	ring-buffer: Force absolute timestamp on discard of event
	tracing: Set actual size after ring buffer resize
	tracing: Stop current tracer when resizing buffer
	perf/core: Add a new read format to get a number of lost samples
	perf: Fix perf_event_validate_size()
	gpiolib: sysfs: Fix error handling on failed export
	drm/amdgpu: correct the amdgpu runtime dereference usage count
	usb: gadget: f_hid: fix report descriptor allocation
	parport: Add support for Brainboxes IX/UC/PX parallel cards
	Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
	usb: typec: class: fix typec_altmode_put_partner to put plugs
	ARM: PL011: Fix DMA support
	serial: sc16is7xx: address RX timeout interrupt errata
	serial: 8250: 8250_omap: Clear UART_HAS_RHR_IT_DIS bit
	serial: 8250: 8250_omap: Do not start RX DMA on THRI interrupt
	serial: 8250_omap: Add earlycon support for the AM654 UART controller
	x86/CPU/AMD: Check vendor in the AMD microcode callback
	KVM: s390/mm: Properly reset no-dat
	MIPS: Loongson64: Reserve vgabios memory on boot
	MIPS: Loongson64: Enable DMA noncoherent support
	io_uring/af_unix: disable sending io_uring over sockets
	netlink: don't call ->netlink_bind with table lock held
	genetlink: add CAP_NET_ADMIN test for multicast bind
	psample: Require 'CAP_NET_ADMIN' when joining "packets" group
	drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
	netfilter: nft_set_pipapo: skip inactive elements during set walk
	platform/x86: asus-wmi: Fix kbd_dock_devid tablet-switch reporting
	tools headers UAPI: Sync linux/perf_event.h with the kernel sources
	platform/x86: asus-wmi: Document the dgpu_disable sysfs attribute
	mmc: block: Be sure to wait while busy in CQE error recovery
	Revert "btrfs: add dmesg output for first mount and last unmount of a filesystem"
	cifs: Fix non-availability of dedup breaking generic/304
	smb: client: fix potential NULL deref in parse_dfs_referrals()
	devcoredump : Serialize devcd_del work
	devcoredump: Send uevent once devcd is ready
	r8169: fix rtl8125b PAUSE frames blasting when suspended
	Linux 5.10.204

Change-Id: Ic65cbf2bdbf57c9cea815a17fcec35c0b72168a2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-12-14 15:59:58 +00:00
Thomas Gleixner
7f4c89400d hrtimers: Push pending hrtimers away from outgoing CPU earlier
[ Upstream commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 ]

2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug")
solved the straight forward CPU hotplug deadlock vs. the scheduler
bandwidth timer. Yu discovered a more involved variant where a task which
has a bandwidth timer started on the outgoing CPU holds a lock and then
gets throttled. If the lock required by one of the CPU hotplug callbacks
the hotplug operation deadlocks because the unthrottling timer event is not
handled on the dying CPU and can only be recovered once the control CPU
reaches the hotplug state which pulls the pending hrtimers from the dead
CPU.

Solve this by pushing the hrtimers away from the dying CPU in the dying
callbacks. Nothing can queue a hrtimer on the dying CPU at that point because
all other CPUs spin in stop_machine() with interrupts disabled and once the
operation is finished the CPU is marked offline.

Reported-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Liu Tie <liutie4@huawei.com>
Link: https://lore.kernel.org/r/87a5rphara.ffs@tglx
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13 18:26:56 +01:00
Greg Kroah-Hartman
2300418cc6 Merge 5.10.65 into android12-5.10-lts
Changes in 5.10.65
	locking/mutex: Fix HANDOFF condition
	regmap: fix the offset of register error log
	regulator: tps65910: Silence deferred probe error
	crypto: mxs-dcp - Check for DMA mapping errors
	sched/deadline: Fix reset_on_fork reporting of DL tasks
	power: supply: axp288_fuel_gauge: Report register-address on readb / writeb errors
	crypto: omap-sham - clear dma flags only after omap_sham_update_dma_stop()
	sched/deadline: Fix missing clock update in migrate_task_rq_dl()
	rcu/tree: Handle VM stoppage in stall detection
	EDAC/mce_amd: Do not load edac_mce_amd module on guests
	posix-cpu-timers: Force next expiration recalc after itimer reset
	hrtimer: Avoid double reprogramming in __hrtimer_start_range_ns()
	hrtimer: Ensure timerfd notification for HIGHRES=n
	udf: Check LVID earlier
	udf: Fix iocharset=utf8 mount option
	isofs: joliet: Fix iocharset=utf8 mount option
	bcache: add proper error unwinding in bcache_device_init
	blk-throtl: optimize IOPS throttle for large IO scenarios
	nvme-tcp: don't update queue count when failing to set io queues
	nvme-rdma: don't update queue count when failing to set io queues
	nvmet: pass back cntlid on successful completion
	power: supply: smb347-charger: Add missing pin control activation
	power: supply: max17042_battery: fix typo in MAx17042_TOFF
	s390/cio: add dev_busid sysfs entry for each subchannel
	s390/zcrypt: fix wrong offset index for APKA master key valid state
	libata: fix ata_host_start()
	crypto: omap - Fix inconsistent locking of device lists
	crypto: qat - do not ignore errors from enable_vf2pf_comms()
	crypto: qat - handle both source of interrupt in VF ISR
	crypto: qat - fix reuse of completion variable
	crypto: qat - fix naming for init/shutdown VF to PF notifications
	crypto: qat - do not export adf_iov_putmsg()
	fcntl: fix potential deadlock for &fasync_struct.fa_lock
	udf_get_extendedattr() had no boundary checks.
	s390/kasan: fix large PMD pages address alignment check
	s390/pci: fix misleading rc in clp_set_pci_fn()
	s390/debug: keep debug data on resize
	s390/debug: fix debug area life cycle
	s390/ap: fix state machine hang after failure to enable irq
	power: supply: cw2015: use dev_err_probe to allow deferred probe
	m68k: emu: Fix invalid free in nfeth_cleanup()
	sched/numa: Fix is_core_idle()
	sched: Fix UCLAMP_FLAG_IDLE setting
	rcu: Fix to include first blocked task in stall warning
	rcu: Add lockdep_assert_irqs_disabled() to rcu_sched_clock_irq() and callees
	rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock
	m68k: Fix invalid RMW_INSNS on CPUs that lack CAS
	block: return ELEVATOR_DISCARD_MERGE if possible
	spi: spi-fsl-dspi: Fix issue with uninitialized dma_slave_config
	spi: spi-pic32: Fix issue with uninitialized dma_slave_config
	genirq/timings: Fix error return code in irq_timings_test_irqs()
	irqchip/loongson-pch-pic: Improve edge triggered interrupt support
	lib/mpi: use kcalloc in mpi_resize
	clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel
	block: nbd: add sanity check for first_minor
	spi: coldfire-qspi: Use clk_disable_unprepare in the remove function
	irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
	crypto: qat - use proper type for vf_mask
	certs: Trigger creation of RSA module signing key if it's not an RSA key
	tpm: ibmvtpm: Avoid error message when process gets signal while waiting
	x86/mce: Defer processing of early errors
	spi: davinci: invoke chipselect callback
	blk-crypto: fix check for too-large dun_bytes
	regulator: vctrl: Use locked regulator_get_voltage in probe path
	regulator: vctrl: Avoid lockdep warning in enable/disable ops
	spi: sprd: Fix the wrong WDG_LOAD_VAL
	spi: spi-zynq-qspi: use wait_for_completion_timeout to make zynq_qspi_exec_mem_op not interruptible
	EDAC/i10nm: Fix NVDIMM detection
	drm/panfrost: Fix missing clk_disable_unprepare() on error in panfrost_clk_init()
	drm/gma500: Fix end of loop tests for list_for_each_entry
	ASoC: mediatek: mt8183: Fix Unbalanced pm_runtime_enable in mt8183_afe_pcm_dev_probe
	media: TDA1997x: enable EDID support
	leds: is31fl32xx: Fix missing error code in is31fl32xx_parse_dt()
	soc: rockchip: ROCKCHIP_GRF should not default to y, unconditionally
	media: cxd2880-spi: Fix an error handling path
	drm/of: free the right object
	bpf: Fix a typo of reuseport map in bpf.h.
	bpf: Fix potential memleak and UAF in the verifier.
	drm/of: free the iterator object on failure
	gve: fix the wrong AdminQ buffer overflow check
	libbpf: Fix the possible memory leak on error
	ARM: dts: aspeed-g6: Fix HVI3C function-group in pinctrl dtsi
	arm64: dts: renesas: r8a77995: draak: Remove bogus adv7511w properties
	i40e: improve locking of mac_filter_hash
	soc: qcom: rpmhpd: Use corner in power_off
	libbpf: Fix removal of inner map in bpf_object__create_map
	gfs2: Fix memory leak of object lsi on error return path
	firmware: fix theoretical UAF race with firmware cache and resume
	driver core: Fix error return code in really_probe()
	ionic: cleanly release devlink instance
	media: dvb-usb: fix uninit-value in dvb_usb_adapter_dvb_init
	media: dvb-usb: fix uninit-value in vp702x_read_mac_addr
	media: dvb-usb: Fix error handling in dvb_usb_i2c_init
	media: go7007: fix memory leak in go7007_usb_probe
	media: go7007: remove redundant initialization
	media: rockchip/rga: use pm_runtime_resume_and_get()
	media: rockchip/rga: fix error handling in probe
	media: coda: fix frame_mem_ctrl for YUV420 and YVU420 formats
	media: atomisp: fix the uninitialized use and rename "retvalue"
	Bluetooth: sco: prevent information leak in sco_conn_defer_accept()
	6lowpan: iphc: Fix an off-by-one check of array index
	drm/amdgpu/acp: Make PM domain really work
	tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos
	ARM: dts: meson8: Use a higher default GPU clock frequency
	ARM: dts: meson8b: odroidc1: Fix the pwm regulator supply properties
	ARM: dts: meson8b: mxq: Fix the pwm regulator supply properties
	ARM: dts: meson8b: ec100: Fix the pwm regulator supply properties
	net/mlx5e: Prohibit inner indir TIRs in IPoIB
	net/mlx5e: Block LRO if firmware asks for tunneled LRO
	cgroup/cpuset: Fix a partition bug with hotplug
	drm: mxsfb: Enable recovery on underflow
	drm: mxsfb: Increase number of outstanding requests on V4 and newer HW
	drm: mxsfb: Clear FIFO_CLEAR bit
	net: cipso: fix warnings in netlbl_cipsov4_add_std
	Bluetooth: mgmt: Fix wrong opcode in the response for add_adv cmd
	arm64: dts: renesas: rzg2: Convert EtherAVB to explicit delay handling
	arm64: dts: renesas: hihope-rzg2-ex: Add EtherAVB internal rx delay
	devlink: Break parameter notification sequence to be before/after unload/load driver
	net/mlx5: Fix missing return value in mlx5_devlink_eswitch_inline_mode_set()
	i2c: highlander: add IRQ check
	leds: lt3593: Put fwnode in any case during ->probe()
	leds: trigger: audio: Add an activate callback to ensure the initial brightness is set
	media: em28xx-input: fix refcount bug in em28xx_usb_disconnect
	media: venus: venc: Fix potential null pointer dereference on pointer fmt
	PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently
	PCI: PM: Enable PME if it can be signaled from D3cold
	bpf, samples: Add missing mprog-disable to xdp_redirect_cpu's optstring
	soc: qcom: smsm: Fix missed interrupts if state changes while masked
	debugfs: Return error during {full/open}_proxy_open() on rmmod
	Bluetooth: increase BTNAMSIZ to 21 chars to fix potential buffer overflow
	PM: EM: Increase energy calculation precision
	selftests/bpf: Fix bpf-iter-tcp4 test to print correctly the dest IP
	drm/msm/mdp4: refactor HW revision detection into read_mdp_hw_revision
	drm/msm/mdp4: move HW revision detection to earlier phase
	drm/msm/dpu: make dpu_hw_ctl_clear_all_blendstages clear necessary LMs
	arm64: dts: exynos: correct GIC CPU interfaces address range on Exynos7
	counter: 104-quad-8: Return error when invalid mode during ceiling_write
	cgroup/cpuset: Miscellaneous code cleanup
	cgroup/cpuset: Fix violation of cpuset locking rule
	ASoC: Intel: Fix platform ID matching
	Bluetooth: fix repeated calls to sco_sock_kill
	drm/msm/dsi: Fix some reference counted resource leaks
	net/mlx5: Register to devlink ingress VLAN filter trap
	net/mlx5: Fix unpublish devlink parameters
	ASoC: rt5682: Implement remove callback
	ASoC: rt5682: Properly turn off regulators if wrong device ID
	usb: dwc3: meson-g12a: add IRQ check
	usb: dwc3: qcom: add IRQ check
	usb: gadget: udc: at91: add IRQ check
	usb: gadget: udc: s3c2410: add IRQ check
	usb: phy: fsl-usb: add IRQ check
	usb: phy: twl6030: add IRQ checks
	usb: gadget: udc: renesas_usb3: Fix soc_device_match() abuse
	selftests/bpf: Fix test_core_autosize on big-endian machines
	devlink: Clear whole devlink_flash_notify struct
	samples: pktgen: add missing IPv6 option to pktgen scripts
	Bluetooth: Move shutdown callback before flushing tx and rx queue
	PM: cpu: Make notifier chain use a raw_spinlock_t
	usb: host: ohci-tmio: add IRQ check
	usb: phy: tahvo: add IRQ check
	libbpf: Re-build libbpf.so when libbpf.map changes
	mac80211: Fix insufficient headroom issue for AMSDU
	locking/lockdep: Mark local_lock_t
	locking/local_lock: Add missing owner initialization
	lockd: Fix invalid lockowner cast after vfs_test_lock
	nfsd4: Fix forced-expiry locking
	arm64: dts: marvell: armada-37xx: Extend PCIe MEM space
	clk: staging: correct reference to config IOMEM to config HAS_IOMEM
	i2c: synquacer: fix deferred probing
	firmware: raspberrypi: Keep count of all consumers
	firmware: raspberrypi: Fix a leak in 'rpi_firmware_get()'
	usb: gadget: mv_u3d: request_irq() after initializing UDC
	mm/swap: consider max pages in iomap_swapfile_add_extent
	lkdtm: replace SCSI_DISPATCH_CMD with SCSI_QUEUE_RQ
	Bluetooth: add timeout sanity check to hci_inquiry
	i2c: iop3xx: fix deferred probing
	i2c: s3c2410: fix IRQ check
	i2c: fix platform_get_irq.cocci warnings
	i2c: hix5hd2: fix IRQ check
	gfs2: init system threads before freeze lock
	rsi: fix error code in rsi_load_9116_firmware()
	rsi: fix an error code in rsi_probe()
	ASoC: Intel: kbl_da7219_max98927: Fix format selection for max98373
	ASoC: Intel: Skylake: Leave data as is when invoking TLV IPCs
	ASoC: Intel: Skylake: Fix module resource and format selection
	mmc: sdhci: Fix issue with uninitialized dma_slave_config
	mmc: dw_mmc: Fix issue with uninitialized dma_slave_config
	mmc: moxart: Fix issue with uninitialized dma_slave_config
	bpf: Fix possible out of bound write in narrow load handling
	CIFS: Fix a potencially linear read overflow
	i2c: mt65xx: fix IRQ check
	i2c: xlp9xx: fix main IRQ check
	usb: ehci-orion: Handle errors of clk_prepare_enable() in probe
	usb: bdc: Fix an error handling path in 'bdc_probe()' when no suitable DMA config is available
	usb: bdc: Fix a resource leak in the error handling path of 'bdc_probe()'
	tty: serial: fsl_lpuart: fix the wrong mapbase value
	ASoC: wcd9335: Fix a double irq free in the remove function
	ASoC: wcd9335: Fix a memory leak in the error handling path of the probe function
	ASoC: wcd9335: Disable irq on slave ports in the remove function
	iwlwifi: follow the new inclusive terminology
	iwlwifi: skip first element in the WTAS ACPI table
	ice: Only lock to update netdev dev_addr
	ath6kl: wmi: fix an error code in ath6kl_wmi_sync_point()
	atlantic: Fix driver resume flow.
	bcma: Fix memory leak for internally-handled cores
	brcmfmac: pcie: fix oops on failure to resume and reprobe
	ipv6: make exception cache less predictible
	ipv4: make exception cache less predictible
	net: sched: Fix qdisc_rate_table refcount leak when get tcf_block failed
	net: qualcomm: fix QCA7000 checksum handling
	octeontx2-af: Fix loop in free and unmap counter
	octeontx2-af: Fix static code analyzer reported issues
	octeontx2-af: Set proper errorcode for IPv4 checksum errors
	ipv4: fix endianness issue in inet_rtm_getroute_build_skb()
	ASoC: rt5682: Remove unused variable in rt5682_i2c_remove()
	iwlwifi Add support for ax201 in Samsung Galaxy Book Flex2 Alpha
	f2fs: guarantee to write dirty data when enabling checkpoint back
	time: Handle negative seconds correctly in timespec64_to_ns()
	io_uring: IORING_OP_WRITE needs hash_reg_file set
	bio: fix page leak bio_add_hw_page failure
	tty: Fix data race between tiocsti() and flush_to_ldisc()
	perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
	x86/resctrl: Fix a maybe-uninitialized build warning treated as error
	Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
	KVM: s390: index kvm->arch.idle_mask by vcpu_idx
	KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
	KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
	KVM: nVMX: Unconditionally clear nested.pi_pending on nested VM-Enter
	ARM: dts: at91: add pinctrl-{names, 0} for all gpios
	fuse: truncate pagecache on atomic_o_trunc
	fuse: flush extending writes
	IMA: remove -Wmissing-prototypes warning
	IMA: remove the dependency on CRYPTO_MD5
	fbmem: don't allow too huge resolutions
	backlight: pwm_bl: Improve bootloader/kernel device handover
	clk: kirkwood: Fix a clocking boot regression
	Linux 5.10.65

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie0b9306ba6ee4193de3200df7cdacaeba152b83e
2021-09-15 14:16:47 +02:00
Thomas Gleixner
3d12ccecfa hrtimer: Ensure timerfd notification for HIGHRES=n
[ Upstream commit 8c3b5e6ec0fee18bc2ce38d1dfe913413205f908 ]

If high resolution timers are disabled the timerfd notification about a
clock was set event is not happening for all cases which use
clock_was_set_delayed() because that's a NOP for HIGHRES=n, which is wrong.

Make clock_was_set_delayed() unconditially available to fix that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.196661266@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15 09:50:25 +02:00
Greg Kroah-Hartman
3cb5c28964 ANDROID: GKI: hrtimer.h: add Android ABI padding to a structure
Try to mitigate potential future driver core api changes by adding a
padding to struct hrtimer.

Based on a change made to the RHEL/CENTOS 8 kernel.

Bug: 151154716
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5432e05386265281d993199599c6f9dcd17a9daf
2021-03-18 16:10:11 +01:00
Peter Zijlstra
0cd39f4600 locking/seqlock, headers: Untangle the spaghetti monster
By using lockdep_assert_*() from seqlock.h, the spaghetti monster
attacked.

Attack back by reducing seqlock.h dependencies from two key high level headers:

 - <linux/seqlock.h>:               -Remove <linux/ww_mutex.h>
 - <linux/time.h>:                  -Remove <linux/seqlock.h>
 - <linux/sched.h>:                 +Add    <linux/seqlock.h>

The price was to add it to sched.h ...

Core header fallout, we add direct header dependencies instead of gaining them
parasitically from higher level headers:

 - <linux/dynamic_queue_limits.h>:  +Add <asm/bug.h>
 - <linux/hrtimer.h>:               +Add <linux/seqlock.h>
 - <linux/ktime.h>:                 +Add <asm/bug.h>
 - <linux/lockdep.h>:               +Add <linux/smp.h>
 - <linux/sched.h>:                 +Add <linux/seqlock.h>
 - <linux/videodev2.h>:             +Add <linux/kernel.h>

Arch headers fallout:

 - PARISC: <asm/timex.h>:           +Add <asm/special_insns.h>
 - SH:     <asm/io.h>:              +Add <asm/page.h>
 - SPARC:  <asm/timer_64.h>:        +Add <uapi/asm/asi.h>
 - SPARC:  <asm/vvar.h>:            +Add <asm/processor.h>, <asm/barrier.h>
                                    -Remove <linux/seqlock.h>
 - X86:    <asm/fixmap.h>:          +Add <asm/pgtable_types.h>
                                    -Remove <asm/acpi.h>

There's also a bunch of parasitic header dependency fallout in .c files, not listed
separately.

[ mingo: Extended the changelog, split up & fixed the original patch. ]

Co-developed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200804133438.GK2674@hirez.programming.kicks-ass.net
2020-08-06 16:13:13 +02:00
Ahmed S. Darwish
af5a06b582 hrtimer: Use sequence counter with associated raw spinlock
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_raw_spinlock_t data type, which allows to associate
a raw spinlock with the sequence counter. This enables lockdep to verify
that the raw spinlock used for writer serialization is held when the
write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-25-a.darwish@linutronix.de
2020-07-29 16:14:29 +02:00
Andrei Vagin
ea2d1f7fce hrtimers: Prepare hrtimer_nanosleep() for time namespaces
clock_nanosleep() accepts absolute values of expiration time when
TIMER_ABSTIME flag is set. This absolute value is inside the task's
time namespace, and has to be converted to the host's time.

There is timens_ktime_to_host() helper for converting time, but
it accepts ktime argument.

As a preparation, make hrtimer_nanosleep() accept a clock value in ktime
instead of timespec64.

Co-developed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191112012724.250792-17-dima@arista.com
2020-01-14 12:20:55 +01:00
Eric Dumazet
56144737e6 hrtimer: Annotate lockless access to timer->state
syzbot reported various data-race caused by hrtimer_is_queued() reading
timer->state. A READ_ONCE() is required there to silence the warning.

Also add the corresponding WRITE_ONCE() when timer->state is set.

In remove_hrtimer() the hrtimer_is_queued() helper is open coded to avoid
loading timer->state twice.

KCSAN reported these cases:

BUG: KCSAN: data-race in __remove_hrtimer / tcp_pacing_check

write to 0xffff8880b2a7d388 of 1 bytes by interrupt on cpu 0:
 __remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991
 __run_hrtimer kernel/time/hrtimer.c:1496 [inline]
 __hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576
 hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593
 __do_softirq+0x115/0x33f kernel/softirq.c:292
 run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
 smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

read to 0xffff8880b2a7d388 of 1 bytes by task 24652 on cpu 1:
 tcp_pacing_check net/ipv4/tcp_output.c:2235 [inline]
 tcp_pacing_check+0xba/0x130 net/ipv4/tcp_output.c:2225
 tcp_xmit_retransmit_queue+0x32c/0x5a0 net/ipv4/tcp_output.c:3044
 tcp_xmit_recovery+0x7c/0x120 net/ipv4/tcp_input.c:3558
 tcp_ack+0x17b6/0x3170 net/ipv4/tcp_input.c:3717
 tcp_rcv_established+0x37e/0xf50 net/ipv4/tcp_input.c:5696
 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
 sk_backlog_rcv include/net/sock.h:945 [inline]
 __release_sock+0x135/0x1e0 net/core/sock.c:2435
 release_sock+0x61/0x160 net/core/sock.c:2951
 sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145
 tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393
 tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434
 inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0x9f/0xc0 net/socket.c:657

BUG: KCSAN: data-race in __remove_hrtimer / __tcp_ack_snd_check

write to 0xffff8880a3a65588 of 1 bytes by interrupt on cpu 0:
 __remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991
 __run_hrtimer kernel/time/hrtimer.c:1496 [inline]
 __hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576
 hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593
 __do_softirq+0x115/0x33f kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:373 [inline]
 irq_exit+0xbb/0xe0 kernel/softirq.c:413
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830

read to 0xffff8880a3a65588 of 1 bytes by task 22891 on cpu 1:
 __tcp_ack_snd_check+0x415/0x4f0 net/ipv4/tcp_input.c:5265
 tcp_ack_snd_check net/ipv4/tcp_input.c:5287 [inline]
 tcp_rcv_established+0x750/0xf50 net/ipv4/tcp_input.c:5708
 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
 sk_backlog_rcv include/net/sock.h:945 [inline]
 __release_sock+0x135/0x1e0 net/core/sock.c:2435
 release_sock+0x61/0x160 net/core/sock.c:2951
 sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145
 tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393
 tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434
 inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0x9f/0xc0 net/socket.c:657
 __sys_sendto+0x21f/0x320 net/socket.c:1952
 __do_sys_sendto net/socket.c:1964 [inline]
 __se_sys_sendto net/socket.c:1960 [inline]
 __x64_sys_sendto+0x89/0xb0 net/socket.c:1960
 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24652 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

[ tglx: Added comments ]

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191106174804.74723-1-edumazet@google.com
2019-11-06 23:18:31 +01:00
Sebastian Andrzej Siewior
a67e408241 hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARD
Add kernel doc annotation for HRTIMER_MODE_HARD.

Fixes: ae6683d815 ("hrtimer: Introduce HARD expiry mode")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190823113845.12125-2-bigeasy@linutronix.de
2019-08-28 13:01:25 +02:00
Anna-Maria Gleixner
f61eff83ce hrtimer: Prepare support for PREEMPT_RT
When PREEMPT_RT is enabled, the soft interrupt thread can be preempted.  If
the soft interrupt thread is preempted in the middle of a timer callback,
then calling hrtimer_cancel() can lead to two issues:

  - If the caller is on a remote CPU then it has to spin wait for the timer
    handler to complete. This can result in unbound priority inversion.

  - If the caller originates from the task which preempted the timer
    handler on the same CPU, then spin waiting for the timer handler to
    complete is never going to end.

To avoid these issues, add a new lock to the timer base which is held
around the execution of the timer callbacks. If hrtimer_cancel() detects
that the timer callback is currently running, it blocks on the expiry
lock. When the callback is finished, the expiry lock is dropped by the
softirq thread which wakes up the waiter and the system makes progress.

This addresses both the priority inversion and the life lock issues.

The same issue can happen in virtual machines when the vCPU which runs a
timer callback is scheduled out. If a second vCPU of the same guest calls
hrtimer_cancel() it will spin wait for the other vCPU to be scheduled back
in. The expiry lock mechanism would avoid that. It'd be trivial to enable
this when paravirt spinlocks are enabled in a guest, but it's not clear
whether this is an actual problem in the wild, so for now it's an RT only
mechanism.

[ tglx: Refactored it for mainline ]

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185753.737767218@linutronix.de
2019-08-01 20:51:22 +02:00
Thomas Gleixner
0ab6a3ddba hrtimer: Make enqueue mode check work on RT
hrtimer_start_range_ns() has a WARN_ONCE() which verifies that a timer
which is marker for softirq expiry is not queued in the hard interrupt base
and vice versa.

When PREEMPT_RT is enabled, timers which are not explicitely marked to
expire in hard interrupt context are deferrred to the soft interrupt. So
the regular check would trigger.

Change the check, so when PREEMPT_RT is enabled, it is verified that the
timers marked for hard interrupt expiry are not tried to be queued for soft
interrupt expiry or any of the unmarked and softirq marked is tried to be
expired in hard interrupt context.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-08-01 20:51:19 +02:00
Sebastian Andrzej Siewior
ae6683d815 hrtimer: Introduce HARD expiry mode
On PREEMPT_RT not all hrtimers can be expired in hard interrupt context
even if that is perfectly fine on a PREEMPT_RT=n kernel, e.g. because they
take regular spinlocks. Also for latency reasons PREEMPT_RT tries to defer
most hrtimers' expiry into soft interrupt context.

But there are hrtimers which must be expired in hard interrupt context even
when PREEMPT_RT is enabled:

  - hrtimers which must expiry in hard interrupt context, e.g. scheduler,
    perf, watchdog related hrtimers

  - latency critical hrtimers, e.g. nanosleep, ..., kvm lapic timer

Add a new mode flag HRTIMER_MODE_HARD which allows to mark these timers so
PREEMPT_RT will not move them into softirq expiry mode.

[ tglx: Split out of a larger combo patch. Added changelog ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185752.981398465@linutronix.de
2019-08-01 17:43:16 +02:00
Thomas Gleixner
01656464fc hrtimer: Provide hrtimer_sleeper_start_expires()
hrtimer_sleepers will gain a scheduling class dependent treatment on
PREEMPT_RT. Create a wrapper around hrtimer_start_expires() to make that
possible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-08-01 17:43:15 +02:00
Sebastian Andrzej Siewior
dbc1625fc9 hrtimer: Consolidate hrtimer_init() + hrtimer_init_sleeper() calls
hrtimer_init_sleeper() calls require prior initialisation of the hrtimer
object which is embedded into the hrtimer_sleeper.

Combine the initialization and spare a function call. Fixup all call sites.

This is also a preparatory change for PREEMPT_RT to do hrtimer sleeper
specific initializations of the embedded hrtimer without modifying any of
the call sites.

No functional change.

[ anna-maria: Minor cleanups ]
[ tglx: Adopted to the removal of the task argument of
  	hrtimer_init_sleeper() and trivial polishing.
	Folded a fix from Stephen Rothwell for the vsoc code ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185752.887468908@linutronix.de
2019-08-01 17:43:15 +02:00
Thomas Gleixner
b744948725 hrtimer: Remove task argument from hrtimer_init_sleeper()
All callers hand in 'current' and that's the only task pointer which
actually makes sense. Remove the task argument and set current in the
function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185752.791885290@linutronix.de
2019-07-30 23:57:51 +02:00
Vincenzo Frascino
32e29396f0 hrtimer: Split out hrtimer defines into separate header
To avoid include dependency hell split out the hrtimer defines which are
required in the upcoming VDSO library into a separate header file.

[ tglx: Split out from the VDSO library patch and included ktime.h as
        the new header depends on it. ]

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shijith Thotton <sthotton@marvell.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Huw Davies <huw@codeweavers.com>
Link: https://lkml.kernel.org/r/20190621095252.32307-3-vincenzo.frascino@arm.com
2019-06-22 21:21:04 +02:00
Thomas Gleixner
f49c174b5f hrtimers/tick/clockevents: Remove sloppy license references
"For licencing details see kernel-base/COPYING" and similar license
references have no value over the SPDX identifier. Remove them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: David Riley <davidriley@chromium.org>
Cc: Colin Cross <ccross@android.com>
Cc: Mark Brown <broonie@kernel.org>
Link: https://lkml.kernel.org/r/20181031182252.963632760@linutronix.de
2018-11-23 11:51:21 +01:00
Thomas Gleixner
35728b8209 time: Add SPDX license identifiers
Update the time(r) core files files with the correct SPDX license
identifier based on the license text in the file itself. The SPDX
identifier is a legally binding shorthand, which can be used instead of the
full boiler plate text.

This work is based on a script and data from Philippe Ombredanne, Kate
Stewart and myself. The data has been created with two independent license
scanners and manual inspection.

The following files do not contain any direct license information and have
been omitted from the big initial SPDX changes:

  timeconst.bc: The .bc files were not touched
  time.c, timer.c, timekeeping.c: Licence was deduced from EXPORT_SYMBOL_GPL

As those files do not contain direct license references they fall under the
project license, i.e. GPL V2 only.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: David Riley <davidriley@chromium.org>
Cc: Colin Cross <ccross@android.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/20181031182252.879109557@linutronix.de
2018-11-23 11:51:20 +01:00
Thomas Gleixner
58c5fc2b96 time: Remove useless filenames in top level comments
Remove the pointless filenames in the top level comments. They have no
value at all and just occupy space. While at it tidy up some of the
comments and remove a stale one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Riley <davidriley@chromium.org>
Cc: Colin Cross <ccross@android.com>
Cc: Mark Brown <broonie@kernel.org>
Link: https://lkml.kernel.org/r/20181031182252.794898238@linutronix.de
2018-11-23 11:51:20 +01:00
Thomas Gleixner
a3ed0e4393 Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME
Revert commits

92af4dcb4e ("tracing: Unify the "boot" and "mono" tracing clocks")
127bfa5f43 ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
7250a4047a ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
d6c7270e91 ("timekeeping: Remove boot time specific code")
f2d6fdbfd2 ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
d6ed449afd ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
72199320d4 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")

As stated in the pull request for the unification of CLOCK_MONOTONIC and
CLOCK_BOOTTIME, it was clear that we might have to revert the change.

As reported by several folks systemd and other applications rely on the
documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
changes. After resume daemons time out and other timeout related issues are
observed. Rafael compiled this list:

* systemd kills daemons on resume, after >WatchdogSec seconds
  of suspending (Genki Sky).  [Verified that that's because systemd uses
  CLOCK_MONOTONIC and expects it to not include the suspend time.]

* systemd-journald misbehaves after resume:
  systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
corrupted or uncleanly shut down, renaming and replacing.
  (Mike Galbraith).

* NetworkManager reports "networking disabled" and networking is broken
  after resume 50% of the time (Pavel).  [May be because of systemd.]

* MATE desktop dims the display and starts the screensaver right after
  system resume (Pavel).

* Full system hang during resume (me).  [May be due to systemd or NM or both.]

That happens on debian and open suse systems.

It's sad, that these problems were neither catched in -next nor by those
folks who expressed interest in this change.

Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Reported-by: Genki Sky <sky@genki.is>,
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
2018-04-26 14:53:32 +02:00
Rafael J. Wysocki
51798deaff Merge branches 'pm-cpuidle' and 'pm-qos'
* pm-cpuidle:
  tick-sched: avoid a maybe-uninitialized warning
  cpuidle: Add definition of residency to sysfs documentation
  time: hrtimer: Use timerqueue_iterate_next() to get to the next timer
  nohz: Avoid duplication of code related to got_idle_tick
  nohz: Gather tick_sched booleans under a common flag field
  cpuidle: menu: Avoid selecting shallow states with stopped tick
  cpuidle: menu: Refine idle state selection for running tick
  sched: idle: Select idle state before stopping the tick
  time: hrtimer: Introduce hrtimer_next_event_without()
  time: tick-sched: Split tick_nohz_stop_sched_tick()
  cpuidle: Return nohz hint from cpuidle_select()
  jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC
  sched: idle: Do not stop the tick before cpuidle_idle_call()
  sched: idle: Do not stop the tick upfront in the idle loop
  time: tick-sched: Reorganize idle tick management code

* pm-qos:
  PM / QoS: mark expected switch fall-throughs
2018-04-11 13:22:46 +02:00
Rafael J. Wysocki
a59855cd8c time: hrtimer: Introduce hrtimer_next_event_without()
The next set of changes will need to compute the time to the next
hrtimer event over all hrtimers except for the scheduler tick one.

To that end introduce a new helper function,
hrtimer_next_event_without(), for computing the time until the next
hrtimer event over all timers except for one and modify the underlying
code in __hrtimer_next_event_base() to prepare it for being called by
that new function.

No intentional changes in functionality.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2018-04-07 18:49:54 +02:00
Thomas Gleixner
127bfa5f43 hrtimer: Unify MONOTONIC and BOOTTIME clock behavior
Now that th MONOTONIC and BOOTTIME clocks are indentical remove all the special
casing.

The user space visible interfaces still support both clocks, but their behavior
is identical.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20180301165150.410218515@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-13 07:34:23 +01:00
Anna-Maria Gleixner
5da7016046 hrtimer: Implement support for softirq based hrtimers
hrtimer callbacks are always invoked in hard interrupt context. Several
users in tree require soft interrupt context for their callbacks and
achieve this by combining a hrtimer with a tasklet. The hrtimer schedules
the tasklet in hard interrupt context and the tasklet callback gets invoked
in softirq context later.

That's suboptimal and aside of that the real-time patch moves most of the
hrtimers into softirq context. So adding native support for hrtimers
expiring in softirq context is a valuable extension for both mainline and
the RT patch set.

Each valid hrtimer clock id has two associated hrtimer clock bases: one for
timers expiring in hardirq context and one for timers expiring in softirq
context.

Implement the functionality to associate a hrtimer with the hard or softirq
related clock bases and update the relevant functions to take them into
account when the next expiry time needs to be evaluated.

Add a check into the hard interrupt context handler functions to check
whether the first expiring softirq based timer has expired. If it's expired
the softirq is raised and the accounting of softirq based timers to
evaluate the next expiry time for programming the timer hardware is skipped
until the softirq processing has finished. At the end of the softirq
processing the regular processing is resumed.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-29-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 09:51:22 +01:00
Anna-Maria Gleixner
98ecadd430 hrtimer: Add clock bases and hrtimer mode for softirq context
Currently hrtimer callback functions are always executed in hard interrupt
context. Users of hrtimers, which need their timer function to be executed
in soft interrupt context, make use of tasklets to get the proper context.

Add additional hrtimer clock bases for timers which must expire in softirq
context, so the detour via the tasklet can be avoided. This is also
required for RT, where the majority of hrtimer is moved into softirq
hrtimer context.

The selection of the expiry mode happens via a mode bit. Introduce
HRTIMER_MODE_SOFT and the matching combinations with the ABS/REL/PINNED
bits and update the decoding of hrtimer_mode in tracepoints.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-27-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 03:00:50 +01:00
Anna-Maria Gleixner
11a9fe069e hrtimer: Make hrtimer_reprogramm() unconditional
hrtimer_reprogram() needs to be available unconditionally for softirq based
hrtimers. Move the function and all required struct members out of the
CONFIG_HIGH_RES_TIMERS #ifdef.

There is no functional change because hrtimer_reprogram() is only invoked
when hrtimer_cpu_base.hres_active is true. Making it unconditional
increases the text size for the CONFIG_HIGH_RES_TIMERS=n case, but avoids
replication of that code for the upcoming softirq based hrtimers support.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-18-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:47 +01:00
Anna-Maria Gleixner
eb27926ba0 hrtimer: Make hrtimer_cpu_base.next_timer handling unconditional
hrtimer_cpu_base.next_timer stores the pointer to the next expiring timer
in a CPU base.

This pointer cannot be dereferenced and is solely used to check whether a
hrtimer which is removed is the hrtimer which is the first to expire in the
CPU base. If this is the case, then the timer hardware needs to be
reprogrammed to avoid an extra interrupt for nothing.

Again, this is conditional functionality, but there is no compelling reason
to make this conditional. As a preparation, hrtimer_cpu_base.next_timer
needs to be available unconditonally.

Aside of that the upcoming support for softirq based hrtimers requires access
to this pointer unconditionally as well, so our motivation is not entirely
simplicity based.

Make the update of hrtimer_cpu_base.next_timer unconditional and remove the
#ifdef cruft. The impact on CONFIG_HIGH_RES_TIMERS=n && CONFIG_NOHZ=n is
marginal as it's just a store on an already dirtied cacheline.

No functional change.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-17-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:47 +01:00
Anna-Maria Gleixner
07a9a7eae8 hrtimer: Make the remote enqueue check unconditional
hrtimer_cpu_base.expires_next is used to cache the next event armed in the
timer hardware. The value is used to check whether an hrtimer can be
enqueued remotely. If the new hrtimer is expiring before expires_next, then
remote enqueue is not possible as the remote hrtimer hardware cannot be
accessed for reprogramming to an earlier expiry time.

The remote enqueue check is currently conditional on
CONFIG_HIGH_RES_TIMERS=y and hrtimer_cpu_base.hres_active. There is no
compelling reason to make this conditional.

Move hrtimer_cpu_base.expires_next out of the CONFIG_HIGH_RES_TIMERS=y
guarded area and remove the conditionals in hrtimer_check_target().

The check is currently a NOOP for the CONFIG_HIGH_RES_TIMERS=n and the
!hrtimer_cpu_base.hres_active case because in these cases nothing updates
hrtimer_cpu_base.expires_next yet. This will be changed with later patches
which further reduce the #ifdef zoo in this code.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-16-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:47 +01:00
Anna-Maria Gleixner
28bfd18bf3 hrtimer: Make the hrtimer_cpu_base::hres_active field unconditional, to simplify the code
The hrtimer_cpu_base::hres_active_member field depends on CONFIG_HIGH_RES_TIMERS=y
currently, and all related functions to this member are conditional as well.

To simplify the code make it unconditional and set it to zero during initialization.

(This will also help with the upcoming softirq based hrtimers code.)

The conditional code sections can be avoided by adding IS_ENABLED(HIGHRES)
conditionals into common functions, which ensures dead code elimination.

There is no functional change.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-14-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:47 +01:00
Anna-Maria Gleixner
da21c5a58a hrtimer: Make room in 'struct hrtimer_cpu_base'
The upcoming softirq based hrtimers support requires an additional field in
the hrtimer_cpu_base struct, which would grow the struct size beyond a
cache line.

The hrtimer_cpu_base::nr_retries and ::nr_hangs members are solely
used for diagnostic output and have no requirement to be 'unsigned int'.

Make them 'unsigned short' to create room for the new struct member.

No functional change.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-13-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:46 +01:00
Anna-Maria Gleixner
3f0b9e8eec hrtimer: Store running timer in hrtimer_clock_base
The pointer to the currently running timer is stored in hrtimer_cpu_base
before the base lock is dropped and the callback is invoked.

This results in two levels of indirections and the upcoming support for
softirq based hrtimer requires splitting the "running" storage into soft
and hard IRQ context expiry.

Storing both in the cpu base would require conditionals in all code paths
accessing that information.

It's possible to have a per clock base sequence count and running pointer
without changing the semantics of the related mechanisms because the timer
base pointer cannot be changed while a timer is running the callback.

Unfortunately this makes cpu_clock base larger than 32 bytes on 32-bit
kernels. Instead of having huge gaps due to alignment, remove the alignment
and let the compiler pack CPU base for 32-bit kernels. The resulting cache access
patterns are fortunately not really different from the current
behaviour. On 64-bit kernels the 64-byte alignment stays and the behaviour is
unchanged. This was determined by analyzing the resulting layout and
looking at the number of cache lines involved for the frequently used
clocks.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-12-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:46 +01:00
Anna-Maria Gleixner
19b51cb5ff hrtimer: Clean up 'enum hrtimer_mode'
It's not obvious that the HRTIMER_MODE variants are bit combinations,
because all modes are hard coded constants currently.

Change it so the bit meanings are clear; and use the symbols for creating
modes which combine bits.

While at it get rid of the ugly tail comments as well.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-8-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:45 +01:00
Anna-Maria Gleixner
6de6250c75 hrtimer: Fix hrtimer_start[_range_ns]() function descriptions
The hrtimer_start[_range_ns]() functions start a timer reliably on this CPU only
when HRTIMER_MODE_PINNED is set.

Furthermore the HRTIMER_MODE_PINNED mode is not considered when a hrtimer is initialized.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-6-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:45 +01:00
Anna-Maria Gleixner
907777136f hrtimer: Clean up the 'int clock' parameter of schedule_hrtimeout_range_clock()
schedule_hrtimeout_range_clock() uses an 'int clock' parameter for the
clock ID, instead of the customary predefined "clockid_t" type.

In hrtimer coding style the canonical variable name for the clock ID is
'clock_id', therefore change the name of the parameter here as well
to make it all consistent.

While at it, clean up the description for the 'clock_id' and 'mode'
function parameters. The clock modes and the clock IDs are not
restricted as the comment suggests.

Fix the mode description as well for the callers of schedule_hrtimeout_range_clock().

No functional changes intended.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-5-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:44 +01:00
Anna-Maria Gleixner
1fbc78b3c9 hrtimer: Fix kerneldoc syntax for 'struct hrtimer_cpu_base'
The '/**' sequence marks the start of a structure description. Add the
missing second asterisk. While at it adapt the ordering of the struct
members to the struct definition and document the purpose of
expires_next more precisely.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-4-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:44 +01:00
Thomas Gleixner
ae67badaa1 hrtimer: Optimize the hrtimer code by using static keys for migration_enable/nohz_active
The hrtimer_cpu_base::migration_enable and ::nohz_active fields
were originally introduced to avoid accessing global variables
for these decisions.

Still that results in a (cache hot) load and conditional branch,
which can be avoided by using static keys.

Implement it with static keys and optimize for the most critical
case of high performance networking which tends to disable the
timer migration functionality.

No change in functionality.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1801142327490.2371@nanos
Link: https://lkml.kernel.org/r/20171221104205.7269-2-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16 02:35:44 +01:00
Deepa Dinamani
c0edd7c9ac nanosleep: Use get_timespec64() and put_timespec64()
Usage of these apis and their compat versions makes
the syscalls: clock_nanosleep and nanosleep and
their compat implementations simpler.

This is a preparatory patch to isolate data conversions to
struct timespec64 at userspace boundaries. This helps contain
the changes needed to transition to new y2038 safe types.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-30 04:14:14 -04:00
Thomas Gleixner
938e7cf2d5 posix-timers: Make nanosleep timespec argument const
No nanosleep implementation modifies the rqtp argument. Mark is const.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
2017-06-14 00:00:47 +02:00
Al Viro
fb923c4a3c posix-timers: Kill ->nsleep_restart()
No more users.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170607084241.28657-8-viro@ZenIV.linux.org.uk
2017-06-14 00:00:42 +02:00
Al Viro
ce41aaf47a hrtimers/posix-timers: Merge nanosleep timespec copyout logics into a new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170607084241.28657-7-viro@ZenIV.linux.org.uk
2017-06-14 00:00:42 +02:00
Al Viro
192a82f900 hrtimer_nanosleep(): Pass rmtp in restart_block
Store the pointer to the timespec which gets updated with the remaining
time in the restart block and remove the function argument.

[ tglx: Added changelog ]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170607084241.28657-3-viro@ZenIV.linux.org.uk
2017-06-14 00:00:40 +02:00
Deepa Dinamani
ad19638463 time: Change k_clock nsleep() to use timespec64
struct timespec is not y2038 safe on 32 bit machines.  Replace uses of
struct timespec with struct timespec64 in the kernel.

The syscall interfaces themselves will be changed in a separate series.

Note that the restart_block parameter for nanosleep has also been left
unchanged and will be part of syscall series noted above.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: y2038@lists.linaro.org
Cc: john.stultz@linaro.org
Cc: arnd@arndb.de
Link: http://lkml.kernel.org/r/1490555058-4603-8-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14 21:49:56 +02:00