* refs/heads/tmp-b324a70:
Linux 4.9.86
MIPS: Implement __multi3 for GCC7 MIPS64r6 builds
KVM: arm/arm64: Fix check for hugepage size when allocating at Stage 2
net: gianfar_ptp: move set_fipers() to spinlock protecting area
sctp: make use of pre-calculated len
xen/gntdev: Fix partial gntdev_mmap() cleanup
xen/gntdev: Fix off-by-one error when unmapping with holes
SolutionEngine771x: fix Ether platform data
mdio-sun4i: Fix a memory leak
xen-netfront: enable device after manual module load
bnxt_en: Fix the 'Invalid VF' id check in bnxt_vf_ndo_prep routine.
can: flex_can: Correct the checking for frame length in flexcan_start_xmit()
mac80211: mesh: drop frames appearing to be from us
nl80211: Check for the required netlink attribute presence
i40e/i40evf: Account for frags split over multiple descriptors in check linearize
uapi libc compat: add fallback for unsupported libcs
drm/ttm: check the return value of kzalloc
NET: usb: qmi_wwan: add support for YUGA CLM920-NC5 PID 0x9625
e1000: fix disabling already-disabled warning
macvlan: Fix one possible double free
xfs: quota: check result of register_shrinker()
xfs: quota: fix missed destroy of qi_tree_lock
IB/ipoib: Fix race condition in neigh creation
IB/mlx4: Fix mlx4_ib_alloc_mr error flow
s390/dasd: fix wrongly assigned configuration data
genirq: Guard handle_bad_irq log messages
IB/mlx5: Fix mlx5_ib_alloc_mr error flow
led: core: Fix brightness setting when setting delay_off=0
bnx2x: Improve reliability in case of nested PCI errors
tg3: Enable PHY reset in MTU change path for 5720
tg3: Add workaround to restrict 5762 MRRS to 2048
tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
tipc: error path leak fixes in tipc_enable_bearer()
lib/mpi: Fix umul_ppmm() for MIPS64r6
ARM: dts: ls1021a: fix incorrect clock references
scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error
net: stmmac: Fix TX timestamp calculation
ip6_tunnel: get the min mtu properly in ip6_tnl_xmit
net: arc_emac: fix arc_emac_rx() error paths
net: mediatek: setup proper state for disabled GMAC on the default
ASoC: nau8825: fix issue that pop noise when start capture
spi: atmel: fixed spin_lock usage inside atmel_spi_remove
mac80211_hwsim: Fix a possible sleep-in-atomic bug in hwsim_get_radio_nl
drm/nouveau/pci: do a msi rearm on init
net: phy: xgene: disable clk on error paths
sget(): handle failures of register_shrinker()
x86/asm: Allow again using asm.h when building for the 'bpf' clang target
ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch
ipv6: icmp6: Allow icmp messages to be looped back
mtd: nand: brcmnand: Zero bitflip is not an error
mtd: nand: gpmi: Fix failure when a erased page has a bitflip at BBM
net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support
nvme: check hw sectors before setting chunk sectors
dmaengine: fsl-edma: disable clks on all error paths
f2fs: fix a bug caused by NULL extent tree
i2c: designware: must wait for enable
hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers)
ANDROID: kbuild: change LTO into a choice
ANDROID: arm64: crypto: fix AES CE when built as a module
ANDROID: staging: lustre: fix filler function type
ANDROID: fs: logfs: fix filler function type
ANDROID: fs: gfs2: fix filler function type
ANDROID: fs: exofs: fix filler function type
ANDROID: fs: afs: fix filler function type
ANDROID: keychord: Check for write data size
media-device: fix ioctl function types
drivers/perf: arm_pmu: fix function type mismatch
dummycon: fix function types
fs: nfs: fix filler function type
mm: fix filler function type mismatch
mm: fix drain_local_pages function type
BACKPORT: vfs: pass type instead of fn to do_{loop,iter}_readv_writev()
arch/arm64/crypto: fix CFI in AES CE
arch/arm64/crypto: fix CFI in SHA CE
arm64: disable CFI for cpu_replace_ttbr1
v4l2-ioctl: fix function types for IOCTL_INFO_STD
UPSTREAM: module: Do not paper over type mismatches in module_param_call()
BACKPORT: treewide: Fix function prototypes for module_param_call()
UPSTREAM: module: Prepare to convert all module_param_call() prototypes
bpf: fix function type for __bpf_prog_run
kallsyms: strip the .cfi postfix from symbols with CONFIG_CFI_CLANG
add support for clang Control Flow Integrity (CFI)
HACK: init: ensure initcall ordering with LTO
xen/efi: don't use -fshort-wchar
drivers/misc: disable LTO for lkdtm_rodata.o
arm64: vdso: disable LTO
FROMLIST: BACKPORT: arm64: select ARCH_SUPPORTS_LTO_CLANG
FROMLIST: BACKPORT: arm64: disable RANDOMIZE_MODULE_REGION_FULL with LTO_CLANG
FROMLIST: arch/arm64/crypto: disable LTO for aes-ce-cipher.c
arm64: disable ARM64_ERRATUM_843419 for clang LTO
arm64: pass code model to LLVMgold
FROMLIST: BACKPORT: arm64: make mrs_s and msr_s macros work with LTO
FROMLIST: arm64: kvm: use -fno-jump-tables with clang
FROMLIST: efi/libstub: disable LTO
FROMLIST: scripts/mod: disable LTO for empty.c
FROMLIST: BACKPORT: kbuild: fix dynamic ftrace with clang LTO
FROMLIST: BACKPORT: kbuild: add support for clang LTO
FROMLIST: BACKPORT: arm64: add a workaround for GNU gold with ARM64_MODULE_PLTS
FROMLIST: arm64: explicitly pass --no-fix-cortex-a53-843419 to GNU gold
FROMLIST: kbuild: add __ld-ifversion and linker-specific macros
FROMLIST: kbuild: add ld-name macro
FROMLIST: BACKPORT: arm64: keep .altinstructions and .altinstr_replacement
arm64: fix LD_DEAD_CODE_DATA_ELIMINATION
FROMLIST: kbuild: fix LD_DEAD_CODE_DATA_ELIMINATION
FROMLIST: BACKPORT: kbuild: add __cc-ifversion and compiler-specific variants
FROMLIST: kbuild: add clang-version.sh
Revert "binder: add missing binder_unlock()"
Linux 4.9.85
x86/entry/64: Clear extra registers beyond syscall arguments, to reduce speculation attack surface
mm: fail get_vaddr_frames() for filesystem-dax mappings
mm: Fix devm_memremap_pages() collision handling
libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment
IB/core: disable memory registration of filesystem-dax vmas
v4l2: disable filesystem-dax mapping support
mm: introduce get_user_pages_longterm
device-dax: implement ->split() to catch invalid munmap attempts
libnvdimm: fix integer overflow static analysis warning
fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
mm: avoid spurious 'bad pmd' warning messages
X.509: fix NULL dereference when restricting key with unsupported_sig
binder: add missing binder_unlock()
drm/amdgpu: add new device to use atpx quirk
drm/amdgpu: Avoid leaking PM domain on driver unbind (v2)
drm/amdgpu: add atpx quirk handling (v2)
drm/amdgpu: Add dpm quirk for Jet PRO (v2)
usb: renesas_usbhs: missed the "running" flag in usb_dmac with rx path
usb: gadget: f_fs: Process all descriptors during bind
Revert "usb: musb: host: don't start next rx urb if current one failed"
usb: ldusb: add PIDs for new CASSY devices supported by this driver
usb: dwc3: gadget: Set maxpacket size for ep0 IN
drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA
Add delay-init quirk for Corsair K70 RGB keyboards
arm64: Disable unhandled signal log messages by default
usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks()
ohci-hcd: Fix race condition caused by ohci_urb_enqueue() and io_watchdog_func()
PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()
x86/oprofile: Fix bogus GCC-8 warning in nmi_setup()
iio: adis_lib: Initialize trigger before requesting interrupt
iio: buffer: check if a buffer has been set up when poll is called
RDMA/uverbs: Protect from command mask overflow
PKCS#7: fix certificate chain verification
X.509: fix BUG_ON() when hash algorithm is unsupported
cfg80211: fix cfg80211_beacon_dup
scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info
xtensa: fix high memory/reserved memory collision
netfilter: drop outermost socket lock in getsockopt()
ANDROID: sdcardfs: Set num in extension_details during make_item
Conflicts:
Makefile
arch/arm64/include/asm/arch_gicv3.h
arch/arm64/kernel/module.lds
drivers/usb/gadget/function/f_fs.c
scripts/link-vmlinux.sh
Change in module_param_call() definition requires alignment in:
drivers/hwtracing/coresight/coresight-event.c
drivers/media/radio/radio-iris-transport.c
drivers/power/reset/msm-poweroff.c
drivers/soc/qcom/wcnss/wcnss_wlan.c
drivers/video/fbdev/msm/mdss_dsi_status.c
Change-Id: I2fa32c39bd4ba8a132f8f8abc8132a2ceb32907a
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
commit 48d0c9becc7f3c66874c100c126459a9da0fdced upstream.
The POSIX specification defines that relative CLOCK_REALTIME timers are not
affected by clock modifications. Those timers have to use CLOCK_MONOTONIC
to ensure POSIX compliance.
The introduction of the additional HRTIMER_MODE_PINNED mode broke this
requirement for pinned timers.
There is no user space visible impact because user space timers are not
using pinned mode, but for consistency reasons this needs to be fixed.
Check whether the mode has the HRTIMER_MODE_REL bit set instead of
comparing with HRTIMER_MODE_ABS.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Fixes: 597d027573 ("timers: Framework for identifying pinned timers")
Link: http://lkml.kernel.org/r/20171221104205.7269-7-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* refs/heads/tmp-db04eb4:
Linux 4.9.79
nfsd: auth: Fix gid sorting when rootsquash enabled
bpf: reject stores into ctx via st and xadd
bpf: fix 32-bit divide by zero
bpf: fix divides by zero
bpf: avoid false sharing of map refcount with max_entries
bpf: arsh is not supported in 32 bit alu thus reject it
bpf: introduce BPF_JIT_ALWAYS_ON config
bpf: fix bpf_tail_call() x64 JIT
x86: bpf_jit: small optimization in emit_bpf_tail_call()
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
x86/microcode/intel: Extend BDW late-loading further with LLC size check
perf/x86/amd/power: Do not load AMD power module on !AMD platforms
flow_dissector: properly cap thoff field
tun: fix a memory leak for tfile->tx_array
mlxsw: spectrum_router: Don't log an error on missing neighbor
gso: validate gso_type in GSO handlers
ip6_gre: init dev->mtu and dev->hard_header_len correctly
be2net: restore properly promisc mode after queues reconfiguration
ppp: unlock all_ppp_mutex before registering device
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
net: Allow neigh contructor functions ability to modify the primary_key
vmxnet3: repair memory leak
tipc: fix a memory leak in tipc_nl_node_get_link()
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
sctp: do not allow the v4 socket to bind a v4mapped v6 address
r8169: fix memory corruption on retrieval of hardware statistics.
pppoe: take ->needed_headroom of lower device into account on xmit
net: tcp: close sock if net namespace is exiting
net: qdisc_pkt_len_init() should be more robust
net: igmp: fix source address check for IGMPv3 reports
lan78xx: Fix failure in USB Full Speed
ipv6: ip6_make_skb() needs to clear cork.base.dst
ipv6: fix udpv6 sendmsg crash caused by too small MTU
ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
eventpoll.h: add missing epoll event masks
vsyscall: Fix permissions for emulate mode with KAISER/PTI
um: link vmlinux with -no-pie
orangefs: fix deadlock; do not write i_size in read_iter
Input: trackpoint - force 3 buttons if 0 button is reported
mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
Revert "module: Add retpoline tag to VERMAGIC"
scsi: libiscsi: fix shifting of DID_REQUEUE host byte
fs/fcntl: f_setown, avoid undefined behaviour
reiserfs: don't preallocate blocks for extended attributes
reiserfs: fix race in prealloc discard
netfilter: xt_osf: Add missing permission checks
netfilter: nfnetlink_cthelper: Add missing permission checks
ACPICA: Namespace: fix operand cache leak
ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
ipc: msg, make msgrcv work with LONG_MIN
mm, page_alloc: fix potential false positive in __zone_watermark_ok
cma: fix calculation of aligned offset
hwpoison, memcg: forcibly uncharge LRU pages
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
drivers: base: cacheinfo: fix boot error message when acpi is enabled
drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled
Prevent timer value 0 for MWAITX
KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
usbip: Fix potential format overflow in userspace tools
usbip: Fix implicit fallthrough warning
usbip: prevent vhci_hcd driver from leaking a socket pointer address
orangefs: initialize op on loop restart in orangefs_devreq_read
orangefs: use list_for_each_entry_safe in purge_waiting_ops
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
Conflicts:
mm/page_alloc.c
mm/vmscan.c
Change-Id: Ic2906f35cee88313f33650133b26dc3e51cdc488
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
commit d5421ea43d30701e03cadc56a38854c36a8b4433 upstream.
The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.
If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.
Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.
Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.
Fixes: 41d2e49493 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* 4.9/tmp-fd67b2f:
Linux 4.9.58
usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets
cpufreq: CPPC: add ACPI_PROCESSOR dependency
EDAC, mce_amd: Print IPID and Syndrome on a separate line
btmrvl: avoid double-disable_irq() race
regulator: core: Resolve supplies before disabling unused regulators
drm/nouveau/gr/gf100-: fix ccache error logging
powerpc/perf: Add restrictions to PMC5 in power9 DD1
nfsd/callback: Cleanup callback cred on shutdown
hrtimer: Catch invalid clockids again
target/iscsi: Fix unsolicited data seq_end_offset calculation
IB/hfi1: Allocate context data on memory node
IB/hfi1: Use static CTLE with Preset 6 for integrated HFIs
uapi: fix linux/mroute6.h userspace compilation errors
uapi: fix linux/rds.h userspace compilation errors
ceph: clean up unsafe d_parent accesses in build_dentry_path
ceph: fix bogus endianness change in ceph_ioctl_set_layout
ceph: don't update_dentry_lease unless we actually got one
i2c: at91: ensure state is restored after suspending
qed: Read queue state before releasing buffer
qed: Reserve doorbell BAR space for present CPUs
qede: Prevent index problems in loopback test
net: mvpp2: release reference to txq_cpu[] entry after unmapping
drm/amdgpu: refuse to reserve io mem for split VRAM buffers
ASoC: mediatek: add I2C dependency for CS42XX8
scsi: scsi_dh_emc: return success in clariion_std_inquiry()
slub: do not merge cache if slub_debug contains a never-merge flag
ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock
mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next
crypto: xts - Add ECB dependency
net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs
sparc64: Migrate hvcons irq to panicked cpu
md/linear: shutup lockdep warnning
f2fs: do not wait for writeback in write_begin
Btrfs: send, fix failure to rename top level inode due to name collision
sched/fair: Update rq clock before changing a task's CPU affinity
f2fs: do SSR for data when there is enough free space
iio: adc: xilinx: Fix error handling
netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.
staging: vchiq_2835_arm: Make cache-line-size a required DT property
net/mlx4_en: fix overflow in mlx4_en_init_timestamp()
mac80211: fix power saving clients handling in iwlwifi
qed: Don't use attention PTT for configuring BW
ALSA: hda: Add Geminilake HDMI codec ID
mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length
initramfs: finish fput() before accessing any binary from initramfs
irqchip/crossbar: Fix incorrect type of local variables
watchdog: kempld: fix gcc-4.3 build
locking/lockdep: Add nest_lock integrity test
xen-netback: Use GFP_ATOMIC to allocate hash
Revert "bsg-lib: don't free job in bsg_prepare_job"
MIPS: Fix minimum alignment requirement of IRQ stack
Change-Id: I69d2e97df2f9c12da893aa57c6ebe748724edcf7
Signed-off-by: Kyle Yan <kyan@codeaurora.org>
[ Upstream commit 336a9cde10d641e70bac67d90ae91b3190c3edca ]
commit 82e88ff1ea ("hrtimer: Revert CLOCK_MONOTONIC_RAW support") removed
unfortunately a sanity check in the hrtimer code which was part of that
MONOTONIC_RAW patch series.
It would have caught the bogus usage of CLOCK_MONOTONIC_RAW in the wireless
code. So bring it back.
It is way too easy to take any random clockid and feed it to the hrtimer
subsystem. At best, it gets mapped to a monotonic base, but it would be
better to just catch illegal values as early as possible.
Detect invalid clockids, map them to CLOCK_MONOTONIC and emit a warning.
[ tglx: Replaced the BUG by a WARN and gracefully map to CLOCK_MONOTONIC ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Tomasz Nowicki <tn@semihalf.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Link: http://lkml.kernel.org/r/1452879670-16133-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code drops the base lock and wait for the running
hrtimer's expiry event to be processed on the isolated CPU.
This leaves a window, where the running hrtimer can migrate
to a different CPU or even get freed. The pinned hrtimers that
are maintained in a temporarily list also can get freed while
the lock is dropped. The only reason for waiting for the running
hrtimer is to make sure that this hrtimer is migrated away from
the isolated CPU. This is a problem only if this hrtimer gets
rearmed from it's callback. As the possibility of this race is
very rare, it is better to have this limitation instead of fixing
the above mentioned bugs with more intrusive changes.
Change-Id: I14ba67cacb321d8b561195935592bb9979996a27
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
__migrate_timers() can be called from both hotplug and isolation
contexts. When called from the isolation context, we might sometimes
encounter running timers. This is OK since the currently running or
just expired timers are off of the timer wheel and so everything else
can be migrated off.
In the case of hrtimers, we must wait until all callbacks have finished.
However, a udelay is important while waiting to allow for store-exclusive
fairness when run_hrtimer is attempting to grab the hrtimer base lock.
Change-Id: I4dccc66e09819a44b2f9597408a6a3ac4e11f5d7
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
A timer might be running when we are trying to move the timer to another
CPU so ensure that we wait for the timer to finish before migrating.
Change-Id: I4c9ee39c715baebfbdb8a50476a475e38b092f70
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[rameezmustafa@codeaurora.org: Port to msm-4.9]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Do not require CPUSETS to be enabled to allow migration of timers and
hrtimers.
Change-Id: Ib911a0d34c250c4df020bdb265b92d2b8df8db93
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[rameezmustafa@codeaurora.org: Port to msm-4.9]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
An hrtimer may be pinned to a CPU but inactive, so it is no longer valid
to test the hrtimer.state struct member as having no bits set when inactive.
Changed the test function to mask out the HRTIMER_STATE_PINNED bit when
checking for inactive state.
Change-Id: I632f37874ef79887ee1202a028ef734f392d6ed0
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
[ohaugan@codeaurora.org: Port to 4.4]
Git-commit: 902e4d4eb0d2158d2792166221a72a829caecf07
Git-repo: git://git.linaro.org/people/mike.holmes/santosh.shukla/lng-isol.git
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
To isolate CPUs (isolate from hrtimers) from sysfs using cpusets, we need some
support from the hrtimer core. i.e. A routine hrtimer_quiesce_cpu() which would
migrate away all the unpinned hrtimers, but shouldn't touch the pinned ones.
This patch creates this routine.
Change-Id: I51259ea41e3bd5cdba50b718201a6840174a7224
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
[ohaugan@codeaurora.org: Port to 4.4]
Git-commit: d4d50a0ddc35e58ee95137ba4d14e74fea8b682f
Git-repo: git://git.linaro.org/people/mike.holmes/santosh.shukla/lng-isol.git
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[rameezmustafa@codeaurora.org: Port to msm-4.9]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
'Pinned' information would be required in migrate_hrtimers() now, as we can
migrate non-pinned timers away without a hotplug (i.e. with cpuset.quiesce). And
so we may need to identify pinned timers now, as we can't migrate them.
This patch reuses the timer->state variable for setting this flag as there were
enough number of free bits available in this variable. And there is no point
increasing size of this struct by adding another field.
Change-Id: If3b3770e547971809e789ea7c8033c48ec2aa92d
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[forward port to 3.18]
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
[ohaugan@codeaurora.org: Port to 4.4]
Git-commit: 62feaf1ed0b64c04868d143d8bdb92d60dc3189b
Git-repo: git://git.linaro.org/people/mike.holmes/santosh.shukla/lng-isol.git
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
hrtimer_init_on_stack() needs a matching call to
destroy_hrtimer_on_stack(), so both need to be exported.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When activating a static object we need make sure that the object is
tracked in the object tracker. If it is a non-static object then the
activation is illegal.
In previous implementation, each subsystem need take care of this in
their fixup callbacks. Actually we can put it into debugobjects core.
Thus we can save duplicated code, and have *pure* fixup callbacks.
To achieve this, a new callback "is_static_object" is introduced to let
the type specific code decide whether a object is static or not. If
yes, we take it into object tracker, otherwise give warning and invoke
fixup callback.
This change has paassed debugobjects selftest, and I also do some test
with all debugobjects supports enabled.
At last, I have a concern about the fixups that can it change the object
which is in incorrect state on fixup? Because the 'addr' may not point
to any valid object if a non-static object is not tracked. Then Change
such object can overwrite someone's memory and cause unexpected
behaviour. For example, the timer_fixup_activate bind timer to function
stub_timer.
Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
[changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset introduces a /proc/<pid>/timerslack_ns interface which
would allow controlling processes to be able to set the timerslack value
on other processes in order to save power by avoiding wakeups (Something
Android currently does via out-of-tree patches).
The first patch tries to fix the internal timer_slack_ns usage which was
defined as a long, which limits the slack range to ~4 seconds on 32bit
systems. It converts it to a u64, which provides the same basically
unlimited slack (500 years) on both 32bit and 64bit machines.
The second patch introduces the /proc/<pid>/timerslack_ns interface
which allows the full 64bit slack range for a task to be read or set on
both 32bit and 64bit machines.
With these two patches, on a 32bit machine, after setting the slack on
bash to 10 seconds:
$ time sleep 1
real 0m10.747s
user 0m0.001s
sys 0m0.005s
The first patch is a little ugly, since I had to chase the slack delta
arguments through a number of functions converting them to u64s. Let me
know if it makes sense to break that up more or not.
Other than that things are fairly straightforward.
This patch (of 2):
The timer_slack_ns value in the task struct is currently a unsigned
long. This means that on 32bit applications, the maximum slack is just
over 4 seconds. However, on 64bit machines, its much much larger (~500
years).
This disparity could make application development a little (as well as
the default_slack) to a u64. This means both 32bit and 64bit systems
have the same effective internal slack range.
Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specify
the interface as a unsigned long, so we preserve that limitation on
32bit systems, where SET_TIMERSLACK can only set the slack to a unsigned
long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
actually larger then what can be stored by an unsigned long.
This patch also modifies hrtimer functions which specified the slack
delta as a unsigned long.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If CONFIG_TIME_LOW_RES is enabled we add a jiffie to the relative timeout to
prevent short sleeps, but we do not account for that in interfaces which
retrieve the remaining time.
Helge observed that timerfd can return a remaining time larger than the
relative timeout. That's not expected and breaks userland test programs.
Store the information that the timer was armed relative and provide functions
to adjust the remaining time. To avoid bloating the hrtimer struct make state
a u8, which as a bonus results in better code on x86 at least.
Reported-and-tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 75e3b37d05 ("hrtimer: Drop return code of hrtimer_switch_to_hres()")
drops the return code of hrtimer_switch_to_hres(). While doing so, it also
drops the return statement itself on failure. This may cause a system hang.
Seen when running arm:multi_v7_defconfig in qemu with devicetree file
vexpress-v2p-ca9.
Fixes: 75e3b37d05 ("hrtimer: Drop return code of hrtimer_switch_to_hres()")
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: http://lkml.kernel.org/r/1440231047-16256-1-git-send-email-linux@roeck-us.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The variable called "this_base" is confusing because its name suggests
it's of "struct hrtimer_clock_base" type, along with "base" and "new_base"
which doesn't help understanding this complicated function.
Make its name clearer and fix the misleading comment while at it.
[ tglx: Fixed the comment for real ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1439907509-9553-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull timer updates from Thomas Gleixner:
"A rather largish update for everything time and timer related:
- Cache footprint optimizations for both hrtimers and timer wheel
- Lower the NOHZ impact on systems which have NOHZ or timer migration
disabled at runtime.
- Optimize run time overhead of hrtimer interrupt by making the clock
offset updates smarter
- hrtimer cleanups and removal of restrictions to tackle some
problems in sched/perf
- Some more leap second tweaks
- Another round of changes addressing the 2038 problem
- First step to change the internals of clock event devices by
introducing the necessary infrastructure
- Allow constant folding for usecs/msecs_to_jiffies()
- The usual pile of clockevent/clocksource driver updates
The hrtimer changes contain updates to sched, perf and x86 as they
depend on them plus changes all over the tree to cleanup API changes
and redundant code, which got copied all over the place. The y2038
changes touch s390 to remove the last non 2038 safe code related to
boot/persistant clock"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
clocksource: Increase dependencies of timer-stm32 to limit build wreckage
timer: Minimize nohz off overhead
timer: Reduce timer migration overhead if disabled
timer: Stats: Simplify the flags handling
timer: Replace timer base by a cpu index
timer: Use hlist for the timer wheel hash buckets
timer: Remove FIFO "guarantee"
timers: Sanitize catchup_timer_jiffies() usage
hrtimer: Allow hrtimer::function() to free the timer
seqcount: Introduce raw_write_seqcount_barrier()
seqcount: Rename write_seqcount_barrier()
hrtimer: Fix hrtimer_is_queued() hole
hrtimer: Remove HRTIMER_STATE_MIGRATE
selftest: Timers: Avoid signal deadlock in leap-a-day
timekeeping: Copy the shadow-timekeeper over the real timekeeper last
clockevents: Check state instead of mode in suspend/resume path
selftests: timers: Add leap-second timer edge testing to leap-a-day.c
ntp: Do leapsecond adjustment in adjtimex read path
time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
ntp: Introduce and use SECS_PER_DAY macro instead of 86400
...
If nohz is disabled on the kernel command line the [hr]timer code
still calls wake_up_nohz_cpu() and tick_nohz_full_cpu(), a pretty
pointless exercise. Cache nohz_active in [hr]timer per cpu bases and
avoid the overhead.
Before:
48.10% hog [.] main
15.25% [kernel] [k] _raw_spin_lock_irqsave
9.76% [kernel] [k] _raw_spin_unlock_irqrestore
6.50% [kernel] [k] mod_timer
6.44% [kernel] [k] lock_timer_base.isra.38
3.87% [kernel] [k] detach_if_pending
3.80% [kernel] [k] del_timer
2.67% [kernel] [k] internal_add_timer
1.33% [kernel] [k] __internal_add_timer
0.73% [kernel] [k] timerfn
0.54% [kernel] [k] wake_up_nohz_cpu
After:
48.73% hog [.] main
15.36% [kernel] [k] _raw_spin_lock_irqsave
9.77% [kernel] [k] _raw_spin_unlock_irqrestore
6.61% [kernel] [k] lock_timer_base.isra.38
6.42% [kernel] [k] mod_timer
3.90% [kernel] [k] detach_if_pending
3.76% [kernel] [k] del_timer
2.41% [kernel] [k] internal_add_timer
1.39% [kernel] [k] __internal_add_timer
0.76% [kernel] [k] timerfn
We probably should have a cached value for nohz full in the per cpu
bases as well to avoid the cpumask check. The base cache line is hot
already, the cpumask not necessarily.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.207378134@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Eric reported that the timer_migration sysctl is not really nice
performance wise as it needs to check at every timer insertion whether
the feature is enabled or not. Further the check does not live in the
timer code, so we have an extra function call which checks an extra
cache line to figure out that it is disabled.
We can do better and store that information in the per cpu (hr)timer
bases. I pondered to use a static key, but that's a nightmare to
update from the nohz code and the timer base cache line is hot anyway
when we select a timer base.
The old logic enabled the timer migration unconditionally if
CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
line.
With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ
migration is enabled, if the user did not disable it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.
Before:
47.76% hog [.] main
14.84% [kernel] [k] _raw_spin_lock_irqsave
9.55% [kernel] [k] _raw_spin_unlock_irqrestore
6.71% [kernel] [k] mod_timer
6.24% [kernel] [k] lock_timer_base.isra.38
3.76% [kernel] [k] detach_if_pending
3.71% [kernel] [k] del_timer
2.50% [kernel] [k] internal_add_timer
1.51% [kernel] [k] get_nohz_timer_target
1.28% [kernel] [k] __internal_add_timer
0.78% [kernel] [k] timerfn
0.48% [kernel] [k] wake_up_nohz_cpu
After:
48.10% hog [.] main
15.25% [kernel] [k] _raw_spin_lock_irqsave
9.76% [kernel] [k] _raw_spin_unlock_irqrestore
6.50% [kernel] [k] mod_timer
6.44% [kernel] [k] lock_timer_base.isra.38
3.87% [kernel] [k] detach_if_pending
3.80% [kernel] [k] del_timer
2.67% [kernel] [k] internal_add_timer
1.33% [kernel] [k] __internal_add_timer
0.73% [kernel] [k] timerfn
0.54% [kernel] [k] wake_up_nohz_cpu
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
To avoid getting spurious interrupts on a tickless CPU, clockevent
device can now be stopped by switching to ONESHOT_STOPPED state.
The natural place for handling this transition is tick_program_event().
On 'expires == KTIME_MAX', we skip programming the event and so we need
to fix such call sites as well, to always call tick_program_event()
irrespective of the expires value.
Once the clockevent device is required again, check if it was earlier
put into ONESHOT_STOPPED state. If yes, switch its state to ONESHOT
before programming its event.
To make sure we haven't missed any corner case, add a WARN() for the
case where we try to reprogram clockevent device while we aren't
configured in ONESHOT_STOPPED state.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linaro-kernel@lists.linaro.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5146b07be7f0bc497e0ebae036590ec2fa73e540.1428031396.git.viresh.kumar@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It was noted that the 32bit implementation of ktime_divns()
was doing unsigned division and didn't properly handle
negative values.
And when a ktime helper was changed to utilize
ktime_divns, it caused a regression on some IR blasters.
See the following bugzilla for details:
https://bugzilla.redhat.com/show_bug.cgi?id=1200353
This patch fixes the problem in ktime_divns by checking
and preserving the sign bit, and then reapplying it if
appropriate after the division, it also changes the return
type to a s64 to make it more obvious this is expected.
Nicolas also pointed out that negative dividers would
cause infinite loops on 32bit systems, negative dividers
is unlikely for users of this function, but out of caution
this patch adds checks for negative dividers for both
32-bit (BUG_ON) and 64-bit(WARN_ON) versions to make sure
no such use cases creep in.
[ tglx: Hand an u64 to do_div() to avoid the compiler warning ]
Fixes: 166afb6451 'ktime: Sanitize ktime_to_us/ms conversion'
Reported-and-tested-by: Trevor Cordes <trevor@tecnopolis.ca>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1431118043-23452-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Because we drop cpu_base->lock around calling hrtimer::function, it is
possible for hrtimer_start() to come in between and enqueue the timer.
If hrtimer::function then returns HRTIMER_RESTART we'll hit the BUG_ON
because HRTIMER_STATE_ENQUEUED will be set.
Since the above is a perfectly valid scenario, remove the BUG_ON and
make the enqueue_hrtimer() call conditional on the timer not being
enqueued already.
NOTE: in that concurrent scenario its entirely common for both sites
to want to modify the hrtimer, since hrtimers don't provide
serialization themselves be sure to provide some such that the
hrtimer::function and the hrtimer_start() caller don't both try and
fudge the expiration state at the same time.
To that effect, add a WARN when someone tries to forward an already
enqueued timer, the most common way to change the expiry of self
restarting timers. Ideally we'd put the WARN in everything modifying
the expiry but most of that is inlines and we don't need the bloat.
Fixes: 2d44ae4d71 ("hrtimer: clean up cpu->base locking tricks")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20150415113105.GT5029@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We can do a lockless check for hrtimer_active before actually taking
the lock in hrtimer[_try_to]_cancel. This is useful for hotpath users
like nanosleep as they avoid the lock dance when the timer has
expired.
This is safe because active is true when the timer is enqueued or the
callback is running. Taking the hrtimer base lock does not protect
against concurrent hrtimer_start calls, the callsite has to do the
proper serialization itself.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203503.580273114@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hrtimer softirq is a leftover from the initial implementation and
serves only the purpose to handle the enqueueing of already expired
timers in the high resolution timer mode. We discussed whether we
change the return value and force all start sites to handle that the
timer is already expired, but that would be a Herculean task and I'm
not sure whether its a good idea to enforce that handling on
everyone.
A simpler solution is to enforce a timer interrupt instead of raising
and scheduling a softirq. Just use the existing infrastructure to do
so and remove all the softirq leftovers.
The HRTIMER softirq enum is now unused, but kept around because trace
parsers rely on the existing numbering.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20150414203501.840834708@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>