* 4.9/tmp-86f97ab:
Linux 4.9.57
KVM: nVMX: update last_nonleaf_level when initializing nested EPT
x86/alternatives: Fix alt_max_short macro to really be a max()
USB: serial: console: fix use-after-free after failed setup
USB: serial: qcserial: add Dell DW5818, DW5819
USB: serial: option: add support for TP-Link LTE module
USB: serial: cp210x: add support for ELV TFD500
USB: serial: ftdi_sio: add id for Cypress WICED dev board
bio_copy_user_iov(): don't ignore ->iov_offset
more bio_map_user_iov() leak fixes
fix unbalanced page refcounting in bio_map_user_iov
direct-io: Prevent NULL pointer access in submit_page_section
usb: gadget: composite: Fix use-after-free in usb_composite_overwrite_options
usb: gadget: configfs: Fix memory leak of interface directory data
drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channel
drm/i915: Read timings from the correct transcoder in intel_crtc_mode_get()
drm/i915/edp: Get the Panel Power Off timestamp after panel is off
ALSA: line6: Fix leftover URB at error-path during probe
ALSA: line6: Fix missing initialization before error path
ALSA: caiaq: Fix stray URB at probe error path
ALSA: seq: Fix copy_from_user() call inside lock
ALSA: seq: Fix use-after-free at creating a port
ALSA: usb-audio: Kill stray URB at exiting
fs/mpage.c: fix mpage_writepage() for pages with buffers
device property: Track owner device of device property
iommu/amd: Finish TLB flush in amd_iommu_unmap()
pinctrl/amd: Fix build dependency on pinmux code
usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet
KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit
KVM: MMU: always terminate page walks at level 1
crypto: shash - Fix zero-length shash ahash digest crash
HID: usbhid: fix out-of-bounds bug
dmaengine: ti-dma-crossbar: Fix possible race condition with dma_inuse
dmaengine: edma: Align the memcpy acnt array size with the transfer
MIPS: math-emu: Remove pr_err() calls from fpu_emu()
USB: dummy-hcd: Fix deadlock caused by disconnect detection
rcu: Allow for page faults in NMI handlers
nl80211: Define policy for packet pattern attributes
CIFS: Reconnect expired SMB sessions
ext4: in ext4_seek_{hole,data}, return -ENXIO for negative offsets
Change-Id: I64affb40416c282aa1421e22e4320d5b99670af9
Signed-off-by: Kyle Yan <kyan@codeaurora.org>
commit 28585a832602747cbfa88ad8934013177a3aae38 upstream.
A number of architecture invoke rcu_irq_enter() on exception entry in
order to allow RCU read-side critical sections in the exception handler
when the exception is from an idle or nohz_full CPU. This works, at
least unless the exception happens in an NMI handler. In that case,
rcu_nmi_enter() would already have exited the extended quiescent state,
which would mean that rcu_irq_enter() would (incorrectly) cause RCU
to think that it is again in an extended quiescent state. This will
in turn result in lockdep splats in response to later RCU read-side
critical sections.
This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
take no action if there is an rcu_nmi_enter() in effect, thus avoiding
the unscheduled return to RCU quiescent state. This in turn should
make the kernel safe for on-demand RCU voyeurism.
Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com
Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* 4.9/tmp-9ae2c67:
Linux 4.9.40
alarmtimer: don't rate limit one-shot timers
tracing: Fix kmemleak in instance_rmdir
PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if present
reiserfs: Don't clear SGID when inheriting ACLs
spmi: Include OF based modalias in device uevent
of: device: Export of_device_{get_modalias, uvent_modalias} to modules
acpi/nfit: Fix memory corruption/Unregister mce decoder on failure
ovl: fix random return value on mount
hfsplus: Don't clear SGID when inheriting ACLs
mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
drm/mst: Avoid processing partially received up/down message transactions
drm/mst: Avoid dereferencing a NULL mstb in drm_dp_mst_handle_up_req()
drm/mst: Fix error handling during MST sideband message reception
RDMA/core: Initialize port_num in qp_attr
ceph: fix race in concurrent readdir
staging: lustre: ko2iblnd: check copy_from_iter/copy_to_iter return code
staging: sm750fb: avoid conflicting vesafb
staging: comedi: ni_mio_common: fix AO timer off-by-one regression
staging: rtl8188eu: add TL-WN722N v2 support
Revert "perf/core: Drop kernel samples even though :u is specified"
perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target
iser-target: Avoid isert_conn->cm_id dereference in isert_login_recv_done
target: Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce
udf: Fix deadlock between writeback and udf_setsize()
NFS: only invalidate dentrys that are clearly invalid.
sunrpc: use constant time memory comparison for mac
IB/core: Namespace is mandatory input for address resolution
IB/iser: Fix connection teardown race condition
Input: i8042 - fix crash at boot time
MIPS: Fix a typo: s/preset/present/ in r2-to-r6 emulation error message
MIPS: Send SIGILL for R6 branches in `__compute_return_epc_for_insn'
MIPS: Send SIGILL for linked branches in `__compute_return_epc_for_insn'
MIPS: Rename `sigill_r6' to `sigill_r2r6' in `__compute_return_epc_for_insn'
MIPS: Send SIGILL for BPOSGE32 in `__compute_return_epc_for_insn'
MIPS: math-emu: Prevent wrong ISA mode instruction emulation
MIPS: Fix unaligned PC interpretation in `compute_return_epc'
MIPS: Actually decode JALX in `__compute_return_epc_for_insn'
MIPS: Save static registers before sysmips
MIPS: Fix MIPS I ISA /proc/cpuinfo reporting
x86/ioapic: Pass the correct data to unmask_ioapic_irq()
x86/acpi: Prevent out of bound access caused by broken ACPI tables
Revert "ACPI / EC: Enable event freeze mode..." to fix a regression
ACPI / EC: Drop EC noirq hooks to fix a regression
ubifs: Don't leak kernel memory to the MTD
MIPS: Negate error syscall return in trace
MIPS: Fix mips_atomic_set() with EVA
MIPS: Fix mips_atomic_set() retry condition
ftrace: Fix uninitialized variable in match_records()
nvme-rdma: remove race conditions from IB signalling
vfio: New external user group/file match
vfio: Fix group release deadlock
ovl: drop CAP_SYS_RESOURCE from saved mounter's credentials
drm/ttm: Fix use-after-free in ttm_bo_clean_mm
f2fs: Don't clear SGID when inheriting ACLs
f2fs: sanity check size of nat and sit cache
xfs: Don't clear SGID when inheriting ACLs
ipmi:ssif: Add missing unlock in error branch
ipmi: use rcu lock around call to intf->handlers->sender()
drm/radeon: Fix eDP for single-display iMac10,1 (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amd/amdgpu: Return error if initiating read out of range on vram
s390/syscalls: Fix out of bounds arguments access
Raid5 should update rdev->sectors after reshape
ext2: Don't clear SGID when inheriting ACLs
libnvdimm: fix badblock range handling of ARS range
libnvdimm, btt: fix btt_rw_page not returning errors
cx88: Fix regression in initial video standard setting
x86/xen: allow userspace access during hypercalls
md: don't use flush_signals in userspace processes
usb: renesas_usbhs: gadget: disable all eps when the driver stops
usb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL
USB: cdc-acm: add device-id for quirky printer
usb: storage: return on error to avoid a null pointer dereference
mxl111sf: Fix driver to use heap allocate buffers for USB messages
xhci: Bad Ethernet performance plugged in ASM1042A host
xhci: Fix NULL pointer dereference when cleaning up streams for removed host
xhci: fix 20000ms port resume timeout
ipvs: SNAT packet replies only for NATed connections
PCI/PM: Restore the status of PCI devices across hibernation
PCI: rockchip: Use normal register bank for config accessors
PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11
af_key: Fix sadb_x_ipsecrequest parsing
powerpc/mm/radix: Properly clear process table entry
powerpc/asm: Mark cr0 as clobbered in mftb()
powerpc: Fix emulation of mfocrf in emulate_step()
powerpc: Fix emulation of mcrf in emulate_step()
powerpc/64: Fix atomic64_inc_not_zero() to return an int
powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp()
xen/scsiback: Fix a TMR related use-after-free
iscsi-target: Add login_keys_workaround attribute for non RFC initiators
scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state
scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails.
PM / Domains: Fix unsafe iteration over modified list of domains
PM / Domains: Fix unsafe iteration over modified list of domain providers
PM / Domains: Fix unsafe iteration over modified list of device links
ASoC: compress: Derive substream from stream based on direction
igb: Explicitly select page 0 at initialization
btrfs: Don't clear SGID when inheriting ACLs
wlcore: fix 64K page support
Bluetooth: use constant time memory comparison for secret values
perf intel-pt: Clear FUP flag on error
perf intel-pt: Use FUP always when scanning for an IP
perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
perf intel-pt: Fix last_ip usage
perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
perf intel-pt: Fix missing stack clear
perf intel-pt: Improve sample timestamp
perf intel-pt: Move decoder error setting into one condition
NFC: Add sockaddr length checks before accessing sa_family in bind handlers
nfc: Fix the sockaddr length sanitization in llcp_sock_connect
nfc: Ensure presence of required attributes in the activate_target handler
NFC: nfcmrvl: fix firmware-management initialisation
NFC: nfcmrvl: use nfc-device for firmware download
NFC: nfcmrvl: do not use device-managed resources
NFC: nfcmrvl_uart: add missing tty-device sanity check
NFC: fix broken device allocation
ath9k: fix an invalid pointer dereference in ath9k_rng_stop()
ath9k: fix tx99 bus error
ath9k: fix tx99 use after free
thermal: cpu_cooling: Avoid accessing potentially freed structures
thermal: max77620: fix device-node reference imbalance
s5p-jpeg: don't return a random width/height
dm mpath: cleanup -Wbool-operation warning in choose_pgpath()
ir-core: fix gcc-7 warning on bool arithmetic
disable new gcc-7.1.1 warnings for now
Use %zu to print resid (size_t).
ANDROID: keychord: Fix a slab out-of-bounds read.
UPSTREAM: af_key: Fix sadb_x_ipsecrequest parsing
ANDROID: lowmemorykiller: Add tgid to kill message
Revert "ANDROID: proc: smaps: Allow smaps access for CAP_SYS_RESOURCE"
4.9.39
kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS
kvm: vmx: Check value written to IA32_BNDCFGS
kvm: x86: Guest BNDCFGS requires guest MPX support
kvm: vmx: Do not disable intercepts for BNDCFGS
tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results
PM / QoS: return -EINVAL for bogus strings
PM / wakeirq: Convert to SRCU
sched/topology: Fix overlapping sched_group_mask
sched/topology: Optimize build_group_mask()
sched/topology: Fix building of overlapping sched-groups
sched/fair, cpumask: Export for_each_cpu_wrap()
Revert "sched/core: Optimize SCHED_SMT"
crypto: caam - fix signals handling
crypto: caam - properly set IV after {en,de}crypt
crypto: sha1-ssse3 - Disable avx2
crypto: atmel - only treat EBUSY as transient if backlog
crypto: talitos - Extend max key length for SHA384/512-HMAC and AEAD
mm: fix overflow check in expand_upwards()
selftests/capabilities: Fix the test_execve test
mnt: Make propagate_umount less slow for overlapping mount propagation trees
mnt: In propgate_umount handle visiting mounts in any order
mnt: In umount propagation reparent in a separate pass
nvmem: core: fix leaks on registration errors
rcu: Add memory barriers for NOCB leader wakeup
vt: fix unchecked __put_user() in tioclinux ioctls
ARM64: dts: marvell: armada37xx: Fix timer interrupt specifiers
exec: Limit arg stack to at most 75% of _STK_LIM
s390: reduce ELF_ET_DYN_BASE
powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB
arm64: move ELF_ET_DYN_BASE to 4GB / 4MB
arm: move ELF_ET_DYN_BASE to 4MB
binfmt_elf: use ELF_ET_DYN_BASE only for PIE
checkpatch: silence perl 5.26.0 unescaped left brace warnings
fs/dcache.c: fix spin lockup issue on nlru->lock
mm/list_lru.c: fix list_lru_count_node() to be race free
kernel/extable.c: mark core_kernel_text notrace
thp, mm: fix crash due race in MADV_FREE handling
tools/lib/lockdep: Reduce MAX_LOCK_DEPTH to avoid overflowing lock_chain/: Depth
parisc/mm: Ensure IRQs are off in switch_mm()
parisc: DMA API: return error instead of BUG_ON for dma ops on non dma devs
parisc: use compat_sys_keyctl()
parisc: Report SIGSEGV instead of SIGBUS when running out of stack
irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity
cfg80211: Check if NAN service ID is of expected size
cfg80211: Check if PMKID attribute is of expected size
cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES
cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE
sfc: don't read beyond unicast address list
brcmfmac: Fix glom_skb leak in brcmf_sdiod_recv_chain
brcmfmac: Fix a memory leak in error handling path in 'brcmf_cfg80211_attach'
brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
rds: tcp: use sock_create_lite() to create the accept socket
vrf: fix bug_on triggered by rx when destroying a vrf
net: ipv6: Compare lwstate in detecting duplicate nexthops
net: core: Fix slab-out-of-bounds in netdev_stats_to_stats64
vxlan: fix hlist corruption
ipv6: dad: don't remove dynamic addresses if link is down
net/mlx5e: Fix TX carrier errors report in get stats ndo
liquidio: fix bug in soft reset failure detection
net/mlx5: Cancel delayed recovery work when unloading the driver
net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()
bpf: prevent leaking pointer via xadd on unpriviledged
rocker: move dereference before free
bridge: mdb: fix leak on complete_info ptr on fail path
net: prevent sign extension in dev_get_stats()
tcp: reset sk_rx_dst in tcp_disconnect()
net: dp83640: Avoid NULL pointer dereference.
ipv6: avoid unregistering inet6_dev for loopback
net/phy: micrel: configure intterupts after autoneg workaround
net: sched: Fix one possible panic when no destroy callback
net_sched: fix error recovery at qdisc creation
xen-netfront: Rework the fix for Rx stall during OOM and network stress
ANDROID: android-verity: mark dev as rw for linear target
ANDROID: sdcardfs: Remove unnecessary lock
ANDROID: binder: don't check prio permissions on restore.
Add BINDER_GET_NODE_DEBUG_INFO ioctl
ANDROID: binder: add RT inheritance flag to node.
ANDROID: binder: improve priority inheritance.
ANDROID: binder: add min sched_policy to node.
ANDROID: binder: add support for RT prio inheritance.
ANDROID: binder: push new transactions to waiting threads.
ANDROID: binder: remove proc waitqueue
Conflicts:
drivers/staging/android/lowmemorykiller.c
Change-Id: I2954e47d7e4fc74cf9bb5033fc151537958b78af
Signed-off-by: Kyle Yan <kyan@codeaurora.org>
commit 6b5fc3a1331810db407c9e0e673dc1837afdc9d0 upstream.
Wait/wakeup operations do not guarantee ordering on their own. Instead,
either locking or memory barriers are required. This commit therefore
adds memory barriers to wake_nocb_leader() and nocb_leader_wait().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Krister Johansen <kjlx@templeofstupid.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
process. There should be no reason why one does not want this and there
is no need worry about real time latency at this point.
Therefore make it default.
Note that users wishing to avoid expediting entirely, for example when
bringing up new hardware possibly having flaky IPIs, can use the
rcu_normal boot parameter to override boot-time expediting.
Change-Id: I758bb7819bb2b18ad7232c3e943b9c1cb85d49e9
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ paulmck: Reworded commit log. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Enable panic on RCU stalls if the RCU_PANIC_ON_STALL is enabled. This
helps collecting the cpu context for debugging.
Change-Id: Ibc95799df995bf31053152ab9b3894710fcd9f43
Signed-off-by: Channagoud Kadabi <ckadabi@codeaurora.org>
commit 52d7e48b86fc108e45a656d8e53e4237993c481d upstream.
The current preemptible RCU implementation goes through three phases
during bootup. In the first phase, there is only one CPU that is running
with preemption disabled, so that a no-op is a synchronous grace period.
In the second mid-boot phase, the scheduler is running, but RCU has
not yet gotten its kthreads spawned (and, for expedited grace periods,
workqueues are not yet running. During this time, any attempt to do
a synchronous grace period will hang the system (or complain bitterly,
depending). In the third and final phase, RCU is fully operational and
everything works normally.
This has been OK for some time, but there has recently been some
synchronous grace periods showing up during the second mid-boot phase.
This code worked "by accident" for awhile, but started failing as soon
as expedited RCU grace periods switched over to workqueues in commit
8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue").
Note that the code was buggy even before this commit, as it was subject
to failure on real-time systems that forced all expedited grace periods
to run as normal grace periods (for example, using the rcu_normal ksysfs
parameter). The callchain from the failure case is as follows:
early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited
The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
which caused the code to try using workqueues before they were
initialized, which did not go well.
This commit therefore reworks RCU to permit synchronous grace periods
to proceed during this mid-boot phase. This commit is therefore a
fix to a regression introduced in v4.9, and is therefore being put
forward post-merge-window in v4.10.
This commit sets a flag from the existing rcu_scheduler_starting()
function which causes all synchronous grace periods to take the expedited
path. The expedited path now checks this flag, using the requesting task
to drive the expedited grace period forward during the mid-boot phase.
Finally, this flag is updated by a core_initcall() function named
rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.
Note that this arrangement assumes that tasks are not sent POSIX signals
(or anything similar) from the time that the first task is spawned
through core_initcall() time.
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Stan Kain <stan.kain@gmail.com>
Tested-by: Ivan <waffolz@hotmail.com>
Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
Tested-by: Bruno Pesavento <bpesavento@infinito.it>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Frederic Bezies <fredbezies@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f466ae66fa6a599f9a53b5f9bafea4b8cfffa7fb upstream.
It is now legal to invoke synchronize_sched() at early boot, which causes
Tiny RCU's synchronize_sched() to emit spurious splats. This commit
therefore removes the cond_resched() from Tiny RCU's synchronize_sched().
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull gcc plugins update from Kees Cook:
"This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot
time as possible, hoping to capitalize on any possible variation in
CPU operation (due to runtime data differences, hardware differences,
SMP ordering, thermal timing variation, cache behavior, etc).
At the very least, this plugin is a much more comprehensive example
for how to manipulate kernel code using the gcc plugin internals"
* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Mark functions with __latent_entropy
gcc-plugins: Add latent_entropy plugin
The __latent_entropy gcc attribute can be used only on functions and
variables. If it is on a function then the plugin will instrument it for
gathering control-flow entropy. If the attribute is on a variable then
the plugin will initialize it with random contents. The variable must
be an integer, an integer array type or a structure with integer fields.
These specific functions have been selected because they are init
functions (to help gather boot-time entropy), are called at unpredictable
times, or they have variable loops, each of which provide some level of
latent entropy.
Signed-off-by: Emese Revfy <re.emese@gmail.com>
[kees: expanded commit message]
Signed-off-by: Kees Cook <keescook@chromium.org>
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- rwsem micro-optimizations (Davidlohr Bueso)
- Improve the implementation and optimize the performance of
percpu-rwsems. (Peter Zijlstra.)
- Convert all lglock users to better facilities such as percpu-rwsems
or percpu-spinlocks and remove lglocks. (Peter Zijlstra)
- Remove the ticket (spin)lock implementation. (Peter Zijlstra)
- Korean translation of memory-barriers.txt and related fixes to the
English document. (SeongJae Park)
- misc fixes and cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/cmpxchg, locking/atomics: Remove superfluous definitions
x86, locking/spinlocks: Remove ticket (spin)lock implementation
locking/lglock: Remove lglock implementation
stop_machine: Remove stop_cpus_lock and lg_double_lock/unlock()
fs/locks: Use percpu_down_read_preempt_disable()
locking/percpu-rwsem: Add down_read_preempt_disable()
fs/locks: Replace lg_local with a per-cpu spinlock
fs/locks: Replace lg_global with a percpu-rwsem
locking/percpu-rwsem: Add DEFINE_STATIC_PERCPU_RWSEMand percpu_rwsem_assert_held()
locking/pv-qspinlock: Use cmpxchg_release() in __pv_queued_spin_unlock()
locking/rwsem, x86: Drop a bogus cc clobber
futex: Add some more function commentry
locking/hung_task: Show all locks
locking/rwsem: Scan the wait_list for readers only once
locking/rwsem: Remove a few useless comments
locking/rwsem: Return void in __rwsem_mark_wake()
locking, rcu, cgroup: Avoid synchronize_sched() in __cgroup_procs_write()
locking/Documentation: Add Korean translation
locking/Documentation: Fix a typo of example result
locking/Documentation: Fix wrong section reference
...
A few rcuperf dmesg output messages have no space between the flag and
the start of the message. In contrast, every other messages consistently
supplies a single space. This difference makes rcuperf dmesg output
hard to read and to mechanically parse. This commit therefore fixes
this problem by modifying a pr_alert() call and PERFOUT_STRING() macro
function to provide that single space.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tests for rcu_barrier() were introduced by commit fae4b54f28 ("rcu:
Introduce rcutorture testing for rcu_barrier()"). This commit updated
the documentation to say that the "rtbe" field in rcutorture's dmesg
output indicates test failure. However, the code was not updated, only
the documentation. This commit therefore updates the code to match the
updated documentation.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit adds a dump of the scheduler state for stalled rcutorture
writer tasks. This addition provides yet more debug for the intermittent
"failures to proceed", where grace periods move ahead but the rcutorture
writer tasks fail to do so.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Up to now, RCU has assumed that the CPU-online process makes it from
CPU_UP_PREPARE to set_cpu_online() within one jiffy. Given the recent
rise of virtualized environments, this assumption is very clearly
obsolete. Failing to meet this deadline can result in RCU paying
attention to an incoming CPU for one jiffy, then ignoring it until the
grace period following the one in which that CPU sets itself online.
This situation might prove to be fatally disappointing to any RCU
read-side critical sections that had the misfortune to execute during
the time in which RCU was ignoring the slow-to-come-online CPU.
This commit therefore updates RCU's internal CPU state-tracking
information at notify_cpu_starting() time, thus providing RCU with
an exact transition of the CPU's state from offline to online.
Note that this means that incoming CPUs must not use RCU read-side
critical section (other than those of SRCU) until notify_cpu_starting()
time. Note also that the CPU_STARTING notifiers -are- allowed to use
RCU read-side critical sections. (Of course, CPU-hotplug notifiers are
rapidly becoming obsolete, so you need to act fast!)
If a given architecture or CPU family needs to use RCU read-side
critical sections earlier, the call to rcu_cpu_starting() from
notify_cpu_starting() will need to be architecture-specific, with
architectures that need early use being required to hand-place
the call to rcu_cpu_starting() at some point preceding the call to
notify_cpu_starting().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Currently, __note_gp_changes() checks to see if the CPU has slept through
multiple grace periods. If it has, it resynchronizes that CPU's view
of the grace-period state, which includes whether or not the current
grace period needs a quiescent state from this CPU. The fact of this
need (or lack thereof) needs to be in two places, rdp->cpu_no_qs.b.norm
and rdp->core_needs_qs. The former tells RCU's context-switch code to
go get a quiescent state and the latter says that it needs to be reported.
The current code unconditionally sets the former to true, but correctly
sets the latter.
This does not result in failures, but it does unnecessarily increase
the amount of work done on average at context-switch time. This commit
therefore correctly sets both fields.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The Kconfig currently controlling compilation of tree.c is:
init/Kconfig:config TREE_RCU
init/Kconfig: bool
...and update.c and sync.c are "obj-y" meaning that none are ever
built as a module by anyone.
Since MODULE_ALIAS is a no-op for non-modular code, we can remove
them from these files.
We leave moduleparam.h behind since the files instantiate some boot
time configuration parameters with module_param() still.
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Commit abedf8e241 ("rcu: Use simple wait queues where possible in
rcutree") converts Tree RCU's wait queues to simple wait queues,
but it incorrectly reverts the commit 2aa792e6fa ("rcu: Use
rcu_gp_kthread_wake() to wake up grace period kthreads"). This can
result in redundant self-wakeups.
This commit therefore replaces the simple wait-queue wakeups with
rcu_gp_kthread_wake(), thus avoiding the redundant wakeups.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit improves the accuracy of the interaction between CPU hotplug
operations and RCU's expedited grace periods by using RCU's online-CPU
state to determine when failed IPIs should be retried.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The expedited RCU grace periods currently rely on a failure indication
from smp_call_function_single() to determine that a given CPU is offline.
This works after a fashion, but is more contorted and less precise than
relying on RCU's internal state. This commit therefore takes a first
step towards relying on internal state.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The expedited RCU CPU stall warnings currently responds to neither the
panic_on_rcu_stall sysctl setting nor the rcupdate.rcu_cpu_stall_suppress
kernel boot parameter. This commit therefore updates the expedited code
to respond to these two controls.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that RCU expedited grace periods are always driven by a workqueue,
there is no need to account for signal reception, and thus no need
to disable expedited RCU CPU stall warnings due to signal reception.
This commit therefore removes the signal-reception checks, leaving a
WARN_ON() to catch possible future bugs.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The current implementation of expedited grace periods has the user
task drive the grace period. This works, but has downsides: (1) The
user task must awaken tasks piggybacking on this grace period, which
can result in latencies rivaling that of the grace period itself, and
(2) User tasks can receive signals, which interfere with RCU CPU stall
warnings.
This commit therefore uses workqueues to drive the grace periods, so
that the user task need not do the awakening. A subsequent commit
will remove the now-unnecessary code allowing for signals.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The functions synchronize_rcu_expedited() and synchronize_sched_expedited()
have nearly identical code. This commit therefore consolidates this code
into a new _synchronize_rcu_expedited() function.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The current percpu-rwsem read side is entirely free of serializing insns
at the cost of having a synchronize_sched() in the write path.
The latency of the synchronize_sched() is too high for cgroups. The
commit 1ed1328792 talks about the write path being a fairly cold path
but this is not the case for Android which moves task to the foreground
cgroup and back around binder IPC calls from foreground processes to
background processes, so it is significantly hotter than human initiated
operations.
Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the
problem, hopefully it should not be that slow after another commit:
80127a3968 ("locking/percpu-rwsem: Optimize readers and reduce global impact").
We could just add rcu_sync_enter() into cgroup_init() but we do not want
another synchronize_sched() at boot time, so this patch adds the new helper
which doesn't block but currently can only be called before the first use.
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Colin Cross <ccross@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rom Lemarchand <romlem@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Link: http://lkml.kernel.org/r/20160811165413.GA22807@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently the percpu-rwsem switches to (global) atomic ops while a
writer is waiting; which could be quite a while and slows down
releasing the readers.
This patch cures this problem by ordering the reader-state vs
reader-count (see the comments in __percpu_down_read() and
percpu_down_write()). This changes a global atomic op into a full
memory barrier, which doesn't have the global cacheline contention.
This also enables using the percpu-rwsem with rcu_sync disabled in order
to bias the implementation differently, reducing the writer latency by
adding some cost to readers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Fixed modular build. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull smp hotplug updates from Thomas Gleixner:
"This is the next part of the hotplug rework.
- Convert all notifiers with a priority assigned
- Convert all CPU_STARTING/DYING notifiers
The final removal of the STARTING/DYING infrastructure will happen
when the merge window closes.
Another 700 hundred line of unpenetrable maze gone :)"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
timers/core: Correct callback order during CPU hot plug
leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
powerpc/numa: Convert to hotplug state machine
arm/perf: Fix hotplug state machine conversion
irqchip/armada: Avoid unused function warnings
ARC/time: Convert to hotplug state machine
clocksource/atlas7: Convert to hotplug state machine
clocksource/armada-370-xp: Convert to hotplug state machine
clocksource/exynos_mct: Convert to hotplug state machine
clocksource/arm_global_timer: Convert to hotplug state machine
rcu: Convert rcutree to hotplug state machine
KVM/arm/arm64/vgic-new: Convert to hotplug state machine
smp/cfd: Convert core to hotplug state machine
x86/x2apic: Convert to CPU hotplug state machine
profile: Convert to hotplug state machine
timers/core: Convert to hotplug state machine
hrtimer: Convert to hotplug state machine
x86/tboot: Convert to hotplug state machine
arm64/armv8 deprecated: Convert to hotplug state machine
hwtracing/coresight-etm4x: Convert to hotplug state machine
...
Pull locking updates from Ingo Molnar:
"The locking tree was busier in this cycle than the usual pattern - a
couple of major projects happened to coincide.
The main changes are:
- implement the atomic_fetch_{add,sub,and,or,xor}() API natively
across all SMP architectures (Peter Zijlstra)
- add atomic_fetch_{inc/dec}() as well, using the generic primitives
(Davidlohr Bueso)
- optimize various aspects of rwsems (Jason Low, Davidlohr Bueso,
Waiman Long)
- optimize smp_cond_load_acquire() on arm64 and implement LSE based
atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
on arm64 (Will Deacon)
- introduce smp_acquire__after_ctrl_dep() and fix various barrier
mis-uses and bugs (Peter Zijlstra)
- after discovering ancient spin_unlock_wait() barrier bugs in its
implementation and usage, strengthen its semantics and update/fix
usage sites (Peter Zijlstra)
- optimize mutex_trylock() fastpath (Peter Zijlstra)
- ... misc fixes and cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
locking/atomic: Introduce inc/dec variants for the atomic_fetch_$op() API
locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
locking/static_keys: Fix non static symbol Sparse warning
locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()
locking/atomic, arch/tile: Fix tilepro build
locking/atomic, arch/m68k: Remove comment
locking/atomic, arch/arc: Fix build
locking/Documentation: Clarify limited control-dependency scope
locking/atomic, arch/rwsem: Employ atomic_long_fetch_add()
locking/atomic, arch/qrwlock: Employ atomic_fetch_add_acquire()
locking/atomic, arch/mips: Convert to _relaxed atomics
locking/atomic, arch/alpha: Convert to _relaxed atomics
locking/atomic: Remove the deprecated atomic_{set,clear}_mask() functions
locking/atomic: Remove linux/atomic.h:atomic_fetch_or()
locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
locking/atomic: Fix atomic64_relaxed() bits
locking/atomic, arch/xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
locking/atomic, arch/x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
locking/atomic, arch/tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
locking/atomic, arch/sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
...
In many cases in the RCU tree code, we iterate over the set of cpus for
a leaf node described by rcu_node::grplo and rcu_node::grphi, checking
per-cpu data for each cpu in this range. However, if the set of possible
cpus is sparse, some cpus described in this range are not possible, and
thus no per-cpu region will have been allocated (or initialised) for
them by the generic percpu code.
Erroneous accesses to a per-cpu area for these !possible cpus may fault
or may hit other data depending on the addressed generated when the
erroneous per cpu offset is applied. In practice, both cases have been
observed on arm64 hardware (the former being silent, but detectable with
additional patches).
To avoid issues resulting from this, we must iterate over the set of
*possible* cpus for a given leaf node. This patch add a new helper,
for_each_leaf_node_possible_cpu, to enable this. As iteration is often
intertwined with rcu_node local bitmask manipulation, a new
leaf_node_cpu_bit helper is added to make this simpler and more
consistent. The RCU tree code is made to use both of these where
appropriate.
Without this patch, running reboot at a shell can result in an oops
like:
[ 3369.075979] Unable to handle kernel paging request at virtual address ffffff8008b21b4c
[ 3369.083881] pgd = ffffffc3ecdda000
[ 3369.087270] [ffffff8008b21b4c] *pgd=00000083eca48003, *pud=00000083eca48003, *pmd=0000000000000000
[ 3369.096222] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[ 3369.101781] Modules linked in:
[ 3369.104825] CPU: 2 PID: 1817 Comm: NetworkManager Tainted: G W 4.6.0+ #3
[ 3369.121239] task: ffffffc0fa13e000 ti: ffffffc3eb940000 task.ti: ffffffc3eb940000
[ 3369.128708] PC is at sync_rcu_exp_select_cpus+0x188/0x510
[ 3369.134094] LR is at sync_rcu_exp_select_cpus+0x104/0x510
[ 3369.139479] pc : [<ffffff80081109a8>] lr : [<ffffff8008110924>] pstate: 200001c5
[ 3369.146860] sp : ffffffc3eb9435a0
[ 3369.150162] x29: ffffffc3eb9435a0 x28: ffffff8008be4f88
[ 3369.155465] x27: ffffff8008b66c80 x26: ffffffc3eceb2600
[ 3369.160767] x25: 0000000000000001 x24: ffffff8008be4f88
[ 3369.166070] x23: ffffff8008b51c3c x22: ffffff8008b66c80
[ 3369.171371] x21: 0000000000000001 x20: ffffff8008b21b40
[ 3369.176673] x19: ffffff8008b66c80 x18: 0000000000000000
[ 3369.181975] x17: 0000007fa951a010 x16: ffffff80086a30f0
[ 3369.187278] x15: 0000007fa9505590 x14: 0000000000000000
[ 3369.192580] x13: ffffff8008b51000 x12: ffffffc3eb940000
[ 3369.197882] x11: 0000000000000006 x10: ffffff8008b51b78
[ 3369.203184] x9 : 0000000000000001 x8 : ffffff8008be4000
[ 3369.208486] x7 : ffffff8008b21b40 x6 : 0000000000001003
[ 3369.213788] x5 : 0000000000000000 x4 : ffffff8008b27280
[ 3369.219090] x3 : ffffff8008b21b4c x2 : 0000000000000001
[ 3369.224406] x1 : 0000000000000001 x0 : 0000000000000140
...
[ 3369.972257] [<ffffff80081109a8>] sync_rcu_exp_select_cpus+0x188/0x510
[ 3369.978685] [<ffffff80081128b4>] synchronize_rcu_expedited+0x64/0xa8
[ 3369.985026] [<ffffff80086b987c>] synchronize_net+0x24/0x30
[ 3369.990499] [<ffffff80086ddb54>] dev_deactivate_many+0x28c/0x298
[ 3369.996493] [<ffffff80086b6bb8>] __dev_close_many+0x60/0xd0
[ 3370.002052] [<ffffff80086b6d48>] __dev_close+0x28/0x40
[ 3370.007178] [<ffffff80086bf62c>] __dev_change_flags+0x8c/0x158
[ 3370.012999] [<ffffff80086bf718>] dev_change_flags+0x20/0x60
[ 3370.018558] [<ffffff80086cf7f0>] do_setlink+0x288/0x918
[ 3370.023771] [<ffffff80086d0798>] rtnl_newlink+0x398/0x6a8
[ 3370.029158] [<ffffff80086cee84>] rtnetlink_rcv_msg+0xe4/0x220
[ 3370.034891] [<ffffff80086e274c>] netlink_rcv_skb+0xc4/0xf8
[ 3370.040364] [<ffffff80086ced8c>] rtnetlink_rcv+0x2c/0x40
[ 3370.045663] [<ffffff80086e1fe8>] netlink_unicast+0x160/0x238
[ 3370.051309] [<ffffff80086e24b8>] netlink_sendmsg+0x2f0/0x358
[ 3370.056956] [<ffffff80086a0070>] sock_sendmsg+0x18/0x30
[ 3370.062168] [<ffffff80086a21cc>] ___sys_sendmsg+0x26c/0x280
[ 3370.067728] [<ffffff80086a30ac>] __sys_sendmsg+0x44/0x88
[ 3370.073027] [<ffffff80086a3100>] SyS_sendmsg+0x10/0x20
[ 3370.078153] [<ffffff8008085e70>] el0_svc_naked+0x24/0x28
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Dennis Chen <dennis.chen@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
It is not always easy to determine the cause of an RCU stall just by
analysing the RCU stall messages, mainly when the problem is caused
by the indirect starvation of rcu threads. For example, when preempt_rcu
is not awakened due to the starvation of a timer softirq.
We have been hard coding panic() in the RCU stall functions for
some time while testing the kernel-rt. But this is not possible in
some scenarios, like when supporting customers.
This patch implements the sysctl kernel.panic_on_rcu_stall. If
set to 1, the system will panic() when an RCU stall takes place,
enabling the capture of a vmcore. The vmcore provides a way to analyze
all kernel/tasks states, helping out to point to the culprit and the
solution for the stall.
The kernel.panic_on_rcu_stall sysctl is disabled by default.
Changes from v1:
- Fixed a typo in the git log
- The if(sysctl_panic_on_rcu_stall) panic() is in a static function
- Fixed the CONFIG_TINY_RCU compilation issue
- The var sysctl_panic_on_rcu_stall is now __read_mostly
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Currently, if the very first call to call_rcu_tasks() has irqs disabled,
it will create the rcu_tasks_kthread with irqs disabled, which will
result in a splat in the memory allocator, which kthread_run() invokes
with the expectation that irqs are enabled.
This commit fixes this problem by deferring kthread creation if called
with irqs disabled. The first call to call_rcu_tasks() that has irqs
enabled will create the kthread.
This bug was detected by rcutorture changes that were motivated by
Iftekhar Ahmed's mutation-testing efforts.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Fix to return a negative error code -ENOMEM from kcalloc() error
handling case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
0day found a boot warning triggered in rcu_perf_writer() on !SMP kernel:
WARN_ON(rcu_gp_is_normal() && gp_exp);
, the root cause of which is trying to measure expedited grace
periods(by setting gp_exp to true by default) when all the grace periods
are normal(TINY RCU only has normal grace periods).
However, such a mis-setting would only result in failing to measure the
performance for a specific kind of grace periods, therefore using a
WARN_ON to check this is a little overkilling. We could handle this
inside rcuperf module via some error messages to tell users about the
mis-settings.
Therefore this patch removes the WARN_ON in rcu_perf_writer() and
handles those checkings in rcu_perf_init() with plain if() code.
Moreover, this patch changes the default value of gp_exp to 1) align
with rcutorture tests and 2) make the default setting work for all RCU
implementations by default.
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Fixes: http://lkml.kernel.org/r/57411b10.mFvG0+AgcrMXGtcj%fengguang.wu@intel.com
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit removes CONFIG_RCU_TORTURE_TEST_RUNNABLE in favor of the
already-existing rcutorture.torture_runnable kernel boot parameter.
It also converts an #ifdef into IS_ENABLED(), saving a few lines of code.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit applies the infamous IS_ENABLED() macro to eliminate a #ifdef.
It also eliminates the RCU_PERF_TEST_RUNNABLE Kconfig option in favor
of the already-existing rcuperf.perf_runnable kernel boot parameter.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
People have been having some difficulty finding their way around the
RCU code. This commit therefore pulls some of the expedited grace-period
code from tree_plugin.h to a new tree_exp.h file. This commit is strictly
code movement.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
People have been having some difficulty finding their way around the
RCU code. This commit therefore pulls some of the expedited grace-period
code from tree.c to a new tree_exp.h file. This commit is strictly code
movement, with the exception of a forward declaration that was added
for the sync_sched_exp_online_cleanup() function.
A subsequent commit will move the remaining expedited grace-period code
from tree_plugin.h to tree_exp.h.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
I think you'll find this condition is superfluous, as the whole function
is under #ifdef of that same.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
In the past, RCU grace-period initialization excluded CPU-hotplug
operations, but this is no longer the case. This commit therefore
removed an outdated comment in rcu_gp_init() claiming that these
are excluded.
Reported-by: Lihao Liang <lihao.liang@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The comment header for rcu_scheduler_active states that it is used
to optimize synchronize_sched() at early boot. This is incorrect.
The synchronize_sched() function instead checks the number of online
CPUs. This commit therefore replaces the comment's synchronize_sched()
with synchronize_rcu(), which really does use rcu_scheduler_active for
this purpose.
Reported-by: Lihao Liang <lihao.liang@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
When activating a static object we need make sure that the object is
tracked in the object tracker. If it is a non-static object then the
activation is illegal.
In previous implementation, each subsystem need take care of this in
their fixup callbacks. Actually we can put it into debugobjects core.
Thus we can save duplicated code, and have *pure* fixup callbacks.
To achieve this, a new callback "is_static_object" is introduced to let
the type specific code decide whether a object is static or not. If
yes, we take it into object tracker, otherwise give warning and invoke
fixup callback.
This change has paassed debugobjects selftest, and I also do some test
with all debugobjects supports enabled.
At last, I have a concern about the fixups that can it change the object
which is in incorrect state on fixup? Because the 'addr' may not point
to any valid object if a non-static object is not tracked. Then Change
such object can overwrite someone's memory and cause unexpected
behaviour. For example, the timer_fixup_activate bind timer to function
stub_timer.
Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
[changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>