* tmp-917a9:
ARM/vdso: Mark the vDSO code read-only after init
x86/vdso: Mark the vDSO code read-only after init
lkdtm: Verify that '__ro_after_init' works correctly
arch: Introduce post-init read-only memory
x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
asm-generic: Consolidate mark_rodata_ro()
Linux 4.4.6
ld-version: Fix awk regex compile failure
target: Drop incorrect ABORT_TASK put for completed commands
block: don't optimize for non-cloned bio in bio_get_last_bvec()
MIPS: smp.c: Fix uninitialised temp_foreign_map
MIPS: Fix build error when SMP is used without GIC
ovl: fix getcwd() failure after unsuccessful rmdir
ovl: copy new uid/gid into overlayfs runtime inode
userfaultfd: don't block on the last VM updates at exit time
powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
powerpc: Fix dedotify for binutils >= 2.26
Revert "drm/radeon/pm: adjust display configuration after powerstate"
drm/radeon: Fix error handling in radeon_flip_work_func.
drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
Revert "drm/radeon: call hpd_irq_event on resume"
x86/mm: Fix slow_virt_to_phys() for X86_PAE again
gpu: ipu-v3: Do not bail out on missing optional port nodes
mac80211: Fix Public Action frame RX in AP mode
mac80211: check PN correctly for GCMP-encrypted fragmented MPDUs
mac80211: minstrel_ht: fix a logic error in RTS/CTS handling
mac80211: minstrel_ht: set default tx aggregation timeout to 0
mac80211: fix use of uninitialised values in RX aggregation
mac80211: minstrel: Change expected throughput unit back to Kbps
iwlwifi: mvm: inc pending frames counter also when txing non-sta
can: gs_usb: fixed disconnect bug by removing erroneous use of kfree()
cfg80211/wext: fix message ordering
wext: fix message delay/ordering
ovl: fix working on distributed fs as lower layer
ovl: ignore lower entries when checking purity of non-directory entries
ASoC: wm8958: Fix enum ctl accesses in a wrong type
ASoC: wm8994: Fix enum ctl accesses in a wrong type
ASoC: samsung: Use IRQ safe spin lock calls
ASoC: dapm: Fix ctl value accesses in a wrong type
ncpfs: fix a braino in OOM handling in ncp_fill_cache()
jffs2: reduce the breakage on recovery from halfway failed rename()
dmaengine: at_xdmac: fix residue computation
tracing: Fix check for cpu online when event is disabled
s390/dasd: fix diag 0x250 inline assembly
s390/mm: four page table levels vs. fork
KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
KVM: VMX: disable PEBS before a guest entry
kvm: cap halt polling at exactly halt_poll_ns
PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr()
ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
ARM: dts: dra7: do not gate cpsw clock due to errata i877
ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
arm64: account for sparsemem section alignment when choosing vmemmap offset
Linux 4.4.5
drm/amdgpu: fix topaz/tonga gmc assignment in 4.4 stable
modules: fix longstanding /proc/kallsyms vs module insertion race.
drm/i915: refine qemu south bridge detection
drm/i915: more virtual south bridge detection
block: get the 1st and last bvec via helpers
block: check virt boundary in bio_will_gap()
drm/amdgpu: Use drm_calloc_large for VM page_tables array
thermal: cpu_cooling: fix out of bounds access in time_in_idle
i2c: brcmstb: allocate correct amount of memory for regmap
ubi: Fix out of bounds write in volume update code
cxl: Fix PSL timebase synchronization detection
MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp'
MIPS: scache: Fix scache init with invalid line size.
USB: serial: option: add support for Quectel UC20
USB: serial: option: add support for Telit LE922 PID 0x1045
USB: qcserial: add Sierra Wireless EM74xx device ID
USB: qcserial: add Dell Wireless 5809e Gobi 4G HSPA+ (rev3)
USB: cp210x: Add ID for Parrot NMEA GPS Flight Recorder
usb: chipidea: otg: change workqueue ci_otg as freezable
ALSA: timer: Fix broken compat timer user status ioctl
ALSA: hdspm: Fix zero-division
ALSA: hdsp: Fix wrong boolean ctl value accesses
ALSA: hdspm: Fix wrong boolean ctl value accesses
ALSA: seq: oss: Don't drain at closing a client
ALSA: pcm: Fix ioctls for X32 ABI
ALSA: timer: Fix ioctls for X32 ABI
ALSA: rawmidi: Fix ioctls X32 ABI
ALSA: hda - Fix mic issues on Acer Aspire E1-472
ALSA: ctl: Fix ioctls for X32 ABI
ALSA: usb-audio: Add a quirk for Plantronics DA45
adv7604: fix tx 5v detect regression
dmaengine: pxa_dma: fix cyclic transfers
Fix directory hardlinks from deleted directories
jffs2: Fix page lock / f->sem deadlock
Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
Btrfs: fix loading of orphan roots leading to BUG_ON
pata-rb532-cf: get rid of the irq_to_gpio() call
tracing: Do not have 'comm' filter override event 'comm' field
ata: ahci: don't mark HotPlugCapable Ports as external/removable
PM / sleep / x86: Fix crash on graph trace through x86 suspend
arm64: vmemmap: use virtual projection of linear region
Adding Intel Lewisburg device IDs for SATA
writeback: flush inode cgroup wb switches instead of pinning super_block
block: bio: introduce helpers to get the 1st and last bvec
libata: Align ata_device's id on a cacheline
libata: fix HDIO_GET_32BIT ioctl
drm/amdgpu: return from atombios_dp_get_dpcd only when error
drm/amdgpu/gfx8: specify which engine to wait before vm flush
drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well
drm/amdgpu/pm: update current crtc info after setting the powerstate
drm/radeon/pm: update current crtc info after setting the powerstate
drm/ast: Fix incorrect register check for DRAM width
target: Fix WRITE_SAME/DISCARD conversion to linux 512b sectors
iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path
iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered
iommu/amd: Apply workaround for ATS write permission check
arm/arm64: KVM: Fix ioctl error handling
KVM: x86: fix root cause for missed hardware breakpoints
vfio: fix ioctl error handling
Fix cifs_uniqueid_to_ino_t() function for s390x
CIFS: Fix SMB2+ interim response processing for read requests
cifs: fix out-of-bounds access in lease parsing
fbcon: set a default value to blink interval
kvm: x86: Update tsc multiplier on change.
mips/kvm: fix ioctl error handling
parisc: Fix ptrace syscall number and return value modification
PCI: keystone: Fix MSI code that retrieves struct pcie_port pointer
block: Initialize max_dev_sectors to 0
drm/amdgpu: mask out WC from BO on unsupported arches
btrfs: async-thread: Fix a use-after-free error for trace
btrfs: Fix no_space in write and rm loop
Btrfs: fix deadlock running delayed iputs at transaction commit time
drivers: sh: Restore legacy clock domain on SuperH platforms
use ->d_seq to get coherency between ->d_inode and ->d_flags
Linux 4.4.4
iwlwifi: mvm: don't allow sched scans without matches to be started
iwlwifi: update and fix 7265 series PCI IDs
iwlwifi: pcie: properly configure the debug buffer size for 8000
iwlwifi: dvm: fix WoWLAN
security: let security modules use PTRACE_MODE_* with bitmasks
IB/cma: Fix RDMA port validation for iWarp
x86/irq: Plug vector cleanup race
x86/irq: Call irq_force_move_complete with irq descriptor
x86/irq: Remove outgoing CPU from vector cleanup mask
x86/irq: Remove the cpumask allocation from send_cleanup_vector()
x86/irq: Clear move_in_progress before sending cleanup IPI
x86/irq: Remove offline cpus from vector cleanup
x86/irq: Get rid of code duplication
x86/irq: Copy vectormask instead of an AND operation
x86/irq: Check vector allocation early
x86/irq: Reorganize the search in assign_irq_vector
x86/irq: Reorganize the return path in assign_irq_vector
x86/irq: Do not use apic_chip_data.old_domain as temporary buffer
x86/irq: Validate that irq descriptor is still active
x86/irq: Fix a race in x86_vector_free_irqs()
x86/irq: Call chip->irq_set_affinity in proper context
x86/entry/compat: Add missing CLAC to entry_INT80_32
x86/mpx: Fix off-by-one comparison with nr_registers
hpfs: don't truncate the file when delete fails
do_last(): ELOOP failure exit should be done after leaving RCU mode
should_follow_link(): validate ->d_seq after having decided to follow
xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted.
xen/pciback: Save the number of MSI-X entries to be copied later.
xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY
xen/scsiback: correct frontend counting
xen/arm: correctly handle DMA mapping of compound pages
ARM: at91/dt: fix typo in sama5d2 pinmux descriptions
ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption
do_last(): don't let a bogus return value from ->open() et.al. to confuse us
kernel/resource.c: fix muxed resource handling in __request_region()
sunrpc/cache: fix off-by-one in qword_get()
tracing: Fix showing function event in available_events
powerpc/eeh: Fix partial hotplug criterion
KVM: x86: MMU: fix ubsan index-out-of-range warning
KVM: x86: fix conversion of addresses to linear in 32-bit protected mode
KVM: x86: fix missed hardware breakpoints
KVM: arm/arm64: vgic: Ensure bitmaps are long enough
KVM: async_pf: do not warn on page allocation failures
of/irq: Fix msi-map calculation for nonzero rid-base
NFSv4: Fix a dentry leak on alias use
nfs: fix nfs_size_to_loff_t
block: fix use-after-free in dio_bio_complete
bio: return EINTR if copying to user space got interrupted
i2c: i801: Adding Intel Lewisburg support for iTCO
phy: core: fix wrong err handle for phy_power_on
writeback: keep superblock pinned during cgroup writeback association switches
cgroup: make sure a parent css isn't offlined before its children
cpuset: make mm migration asynchronous
PCI/AER: Flush workqueue on device remove to avoid use-after-free
ARCv2: SMP: Emulate IPI to self using software triggered interrupt
ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2
libata: fix sff host state machine locking while polling
qla2xxx: Fix stale pointer access.
spi: atmel: fix gpio chip-select in case of non-DT platform
target: Fix race with SCF_SEND_DELAYED_TAS handling
target: Fix remote-port TMR ABORT + se_cmd fabric stop
target: Fix TAS handling for multi-session se_node_acls
target: Fix LUN_RESET active TMR descriptor handling
target: Fix LUN_RESET active I/O handling for ACK_KREF
ALSA: hda - Fixing background noise on Dell Inspiron 3162
ALSA: hda - Apply clock gate workaround to Skylake, too
Revert "workqueue: make sure delayed work run in local cpu"
workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
mac80211: Requeue work after scan complete for all VIF types.
rfkill: fix rfkill_fop_read wait_event usage
tick/nohz: Set the correct expiry when switching to nohz/lowres mode
perf stat: Do not clean event's private stats
cdc-acm:exclude Samsung phone 04e8:685d
Revert "Staging: panel: usleep_range is preferred over udelay"
Staging: speakup: Fix getting port information
sd: Optimal I/O size is in bytes, not sectors
libceph: don't spam dmesg with stray reply warnings
libceph: use the right footer size when skipping a message
libceph: don't bail early from try_read() when skipping a message
libceph: fix ceph_msg_revoke()
seccomp: always propagate NO_NEW_PRIVS on tsync
cpufreq: Fix NULL reference crash while accessing policy->governor_data
cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype
hwmon: (ads1015) Handle negative conversion values correctly
hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook
hwmon: (dell-smm) Blacklist Dell Studio XPS 8000
Thermal: do thermal zone update after a cooling device registered
Thermal: handle thermal zone device properly during system sleep
Thermal: initialize thermal zone device correctly
IB/mlx5: Expose correct maximum number of CQE capacity
IB/qib: Support creating qps with GFP_NOIO flag
IB/qib: fix mcast detach when qp not attached
IB/cm: Fix a recently introduced deadlock
dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer
dmaengine: at_xdmac: fix resume for cyclic transfers
dmaengine: dw: fix cyclic transfer callbacks
dmaengine: dw: fix cyclic transfer setup
nfit: fix multi-interface dimm handling, acpi6.1 compatibility
ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot()
ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist"
ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
lib: sw842: select crc32
uapi: update install list after nvme.h rename
ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list
ideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list
toshiba_acpi: Fix blank screen at boot if transflective backlight is supported
make sure that freeing shmem fast symlinks is RCU-delayed
drm/radeon/pm: adjust display configuration after powerstate
drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)
drm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() (v2)
drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4
drm: Prevent vblank counter bumps > 1 with active vblank clients. (v2)
drm: No-Op redundant calls to drm_vblank_off() (v2)
drm/radeon: use post-decrement in error handling
drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command
drm/i915: fix error path in intel_setup_gmbus()
drm/i915/dsi: don't pass arbitrary data to sideband
drm/i915/dsi: defend gpio table against out of bounds access
drm/i915/skl: Don't skip mst encoders in skl_ddi_pll_select()
drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+
drm/i915/dp: fall back to 18 bpp when sink capability is unknown
drm/i915: Make sure DC writes are coherent on flush.
drm/i915: Init power domains early in driver load
drm/i915: intel_hpd_init(): Fix suspend/resume reprobing
drm/i915: Restore inhibiting the load of the default context
drm: fix missing reference counting decrease
drm/radeon: hold reference to fences in radeon_sa_bo_new
drm/radeon: mask out WC from BO on unsupported arches
drm: add helper to check for wc memory support
drm/radeon: fix DP audio support for APU with DCE4.1 display engine
drm/radeon: Add a common function for DFS handling
drm/radeon: cleaned up VCO output settings for DP audio
drm/radeon: properly byte swap vce firmware setup
drm/radeon: clean up fujitsu quirks
drm/radeon: Fix "slow" audio over DP on DCE8+
drm/radeon: call hpd_irq_event on resume
drm/radeon: Fix off-by-one errors in radeon_vm_bo_set_addr
drm/dp/mst: deallocate payload on port destruction
drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
drm/dp/mst: move GUID storage from mgr, port to only mst branch
drm/dp/mst: Calculate MST PBN with 31.32 fixed point
drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
drm/dp/mst: fix in RAD element access
drm/dp/mst: fix in MSTB RAD initialization
drm/dp/mst: always send reply for UP request
drm/dp/mst: process broadcast messages correctly
drm/nouveau: platform: Fix deferred probe
drm/nouveau/disp/dp: ensure sink is powered up before attempting link training
drm/nouveau/display: Enable vblank irqs after display engine is on again.
drm/nouveau/kms: take mode_config mutex in connector hotplug path
drm/amdgpu/pm: adjust display configuration after powerstate
drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.
drm/amdgpu: use post-decrement in error handling
drm/amdgpu: fix issue with overlapping userptrs
drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)
drm/amdgpu: remove unnecessary forward declaration
drm/amdgpu: fix s4 resume
drm/amdgpu: remove exp hardware support from iceland
drm/amdgpu: don't load MEC2 on topaz
drm/amdgpu: drop topaz support from gmc8 module
drm/amdgpu: pull topaz gmc bits into gmc_v7
drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
drm/amdgpu: iceland use CI based MC IP
drm/amdgpu: move gmc7 support out of CIK dependency
drm/amdgpu: no need to load MC firmware on fiji
drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2
drm/amdgpu: fix tonga smu resume
drm/amdgpu: fix lost sync_to if scheduler is enabled.
drm/amdgpu: call hpd_irq_event on resume
drm/amdgpu: Fix off-by-one errors in amdgpu_vm_bo_map
drm/vmwgfx: respect 'nomodeset'
drm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates
drm/vmwgfx: Fix an incorrect lock check
virtio_pci: fix use after free on release
virtio_balloon: fix race between migration and ballooning
virtio_balloon: fix race by fill and leak
regulator: mt6311: MT6311_REGULATOR needs to select REGMAP_I2C
regulator: axp20x: Fix GPIO LDO enable value for AXP22x
clk: exynos: use irqsave version of spin_lock to avoid deadlock with irqs
cxl: use correct operator when writing pcie config space values
sparc64: fix incorrect sign extension in sys_sparc64_personality
EDAC, mc_sysfs: Fix freeing bus' name
EDAC: Robustify workqueues destruction
MIPS: Fix buffer overflow in syscall_get_arguments()
MIPS: Fix some missing CONFIG_CPU_MIPSR6 #ifdefs
MIPS: hpet: Choose a safe value for the ETIME check
MIPS: Loongson-3: Fix SMP_ASK_C0COUNT IPI handler
Revert "MIPS: Fix PAGE_MASK definition"
cputime: Prevent 32bit overflow in time[val|spec]_to_cputime()
time: Avoid signed overflow in timekeeping_get_ns()
Bluetooth: 6lowpan: Fix handling of uncompressed IPv6 packets
Bluetooth: 6lowpan: Fix kernel NULL pointer dereferences
Bluetooth: Fix incorrect removing of IRKs
Bluetooth: Add support of Toshiba Broadcom based devices
Bluetooth: Use continuous scanning when creating LE connections
Drivers: hv: vmbus: Fix a Host signaling bug
tools: hv: vss: fix the write()'s argument: error -> vss_msg
mmc: sdhci: Allow override of get_cd() called from sdhci_request()
mmc: sdhci: Allow override of mmc host operations
mmc: sdhci-pci: Fix card detect race for Intel BXT/APL
mmc: pxamci: fix again read-only gpio detection polarity
mmc: sdhci-acpi: Fix card detect race for Intel BXT/APL
mmc: mmci: fix an ages old detection error
mmc: core: Enable tuning according to the actual timing
mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
mmc: mmc: Fix incorrect use of driver strength switching HS200 and HS400
mmc: sdio: Fix invalid vdd in voltage switch power cycle
mmc: sdhci: Fix DMA descriptor with zero data length
mmc: sdhci-pci: Do not default to 33 Ohm driver strength for Intel SPT
mmc: usdhi6rol0: handle NULL data in timeout
clockevents/tcb_clksrc: Prevent disabling an already disabled clock
posix-clock: Fix return code on the poll method's error path
irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1
irqchip/atmel-aic: Fix wrong bit operation for IRQ priority
irqchip/mxs: Add missing set_handle_irq()
irqchip/omap-intc: Add support for spurious irq handling
coresight: checking for NULL string in coresight_name_match()
dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths
dm snapshot: fix hung bios when copy error occurs
dm space map metadata: remove unused variable in brb_pop()
tda1004x: only update the frontend properties if locked
vb2: fix a regression in poll() behavior for output,streams
gspca: ov534/topro: prevent a division by 0
si2157: return -EINVAL if firmware blob is too big
media: dvb-core: Don't force CAN_INVERSION_AUTO in oneshot mode
rc: sunxi-cir: Initialize the spinlock properly
namei: ->d_inode of a pinned dentry is stable only for positives
mei: validate request value in client notify request ioctl
mei: fix fasync return value on error
rtlwifi: rtl8723be: Fix module parameter initialization
rtlwifi: rtl8188ee: Fix module parameter initialization
rtlwifi: rtl8192se: Fix module parameter initialization
rtlwifi: rtl8723ae: Fix initialization of module parameters
rtlwifi: rtl8192de: Fix incorrect module parameter descriptions
rtlwifi: rtl8192ce: Fix handling of module parameters
rtlwifi: rtl8192cu: Add missing parameter setup
rtlwifi: rtl_pci: Fix kernel panic
locks: fix unlock when fcntl_setlk races with a close
um: link with -lpthread
uml: fix hostfs mknod()
uml: flush stdout before forking
s390/fpu: signals vs. floating point control register
s390/compat: correct restore of high gprs on signal return
s390/dasd: fix performance drop
s390/dasd: fix refcount for PAV reassignment
s390/dasd: prevent incorrect length error under z/VM after PAV changes
s390: fix normalization bug in exception table sorting
btrfs: initialize the seq counter in struct btrfs_device
Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots
Btrfs: fix transaction handle leak on failure to create hard link
Btrfs: fix number of transaction units required to create symlink
Btrfs: send, don't BUG_ON() when an empty symlink is found
btrfs: statfs: report zero available if metadata are exhausted
Btrfs: igrab inode in writepage
Btrfs: add missing brelse when superblock checksum fails
KVM: s390: fix memory overwrites when vx is disabled
s390/kvm: remove dependency on struct save_area definition
clocksource/drivers/vt8500: Increase the minimum delta
genirq: Validate action before dereferencing it in handle_irq_event_percpu()
mm: numa: quickly fail allocations for NUMA balancing on full nodes
mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
ocfs2: unlock inode if deleting inode from orphan fails
drm/i915: shut up gen8+ SDE irq dmesg noise
iw_cxgb3: Fix incorrectly returning error on success
spi: omap2-mcspi: Prevent duplicate gpio_request
drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE
USB: option: add "4G LTE usb-modem U901"
USB: option: add support for SIM7100E
USB: cp210x: add IDs for GE B650V3 and B850V3 boards
usb: dwc3: Fix assignment of EP transfer resources
can: ems_usb: Fix possible tx overflow
dm thin: fix race condition when destroying thin pool workqueue
bcache: Change refill_dirty() to always scan entire disk if necessary
bcache: prevent crash on changing writeback_running
bcache: allows use of register in udev to avoid "device_busy" error.
bcache: unregister reboot notifier if bcache fails to unregister device
bcache: fix a leak in bch_cached_dev_run()
bcache: clear BCACHE_DEV_UNLINK_DONE flag when attaching a backing device
bcache: Add a cond_resched() call to gc
bcache: fix a livelock when we cause a huge number of cache misses
lib/ucs2_string: Correct ucs2 -> utf8 conversion
efi: Add pstore variables to the deletion whitelist
efi: Make efivarfs entries immutable by default
efi: Make our variable validation list include the guid
efi: Do variable name validation tests in utf8
efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
lib/ucs2_string: Add ucs2 -> utf8 helper functions
ARM: 8457/1: psci-smp is built only for SMP
drm/gma500: Use correct unref in the gem bo create function
devm_memremap: Fix error value when memremap failed
KVM: s390: fix guest fprs memory leak
arm64: errata: Add -mpc-relative-literal-loads to build flags
ARM: debug-ll: fix BCM63xx entry for multiplatform
ext4: fix bh->b_state corruption
sctp: Fix port hash table size computation
unix_diag: fix incorrect sign extension in unix_lookup_by_ino
tipc: unlock in error path
rtnl: RTM_GETNETCONF: fix wrong return value
IFF_NO_QUEUE: Fix for drivers not calling ether_setup()
tcp/dccp: fix another race at listener dismantle
route: check and remove route cache when we get route
net_sched fix: reclassification needs to consider ether protocol changes
pppoe: fix reference counting in PPPoE proxy
l2tp: Fix error creating L2TP tunnels
net/mlx4_en: Avoid changing dev->features directly in run-time
net/mlx4_en: Choose time-stamping shift value according to HW frequency
net/mlx4_en: Count HW buffer overrun only once
qmi_wwan: add "4G LTE usb-modem U901"
tcp: md5: release request socket instead of listener
tipc: fix premature addition of node to lookup table
af_unix: Guard against other == sk in unix_dgram_sendmsg
af_unix: Don't set err in unix_stream_read_generic unless there was an error
ipv4: fix memory leaks in ip_cmsg_send() callers
bonding: Fix ARP monitor validation
bpf: fix branch offset adjustment on backjumps after patching ctx expansion
flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen
net: Copy inner L3 and L4 headers as unaligned on GRE TEB
sctp: translate network order to host order when users get a hmacid
enic: increment devcmd2 result ring in case of timeout
tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs
net:Add sysctl_max_skb_frags
tcp: do not drop syn_recv on all icmp reports
unix: correctly track in-flight fds in sending process user_struct
ipv6: fix a lockdep splat
ipv6: addrconf: Fix recursive spin lock call
ipv6/udp: use sticky pktinfo egress ifindex on connect()
ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()
tcp: beware of alignments in tcp_get_info()
switchdev: Require RTNL mutex to be held when sending FDB notifications
inet: frag: Always orphan skbs inside ip_defrag()
tipc: fix connection abort during subscription cancel
net: dsa: fix mv88e6xxx switches
sctp: allow setting SCTP_SACK_IMMEDIATELY by the application
pptp: fix illegal memory access caused by multiple bind()s
af_unix: fix struct pid memory leak
tcp: fix NULL deref in tcp_v4_send_ack()
lwt: fix rx checksum setting for lwt devices tunneling over ipv6
tunnels: Allow IPv6 UDP checksums to be correctly controlled.
net: dp83640: Fix tx timestamp overflow handling.
gro: Make GRO aware of lightweight tunnels.
af_iucv: Validate socket address length in iucv_sock_bind()
Conflicts:
arch/arm64/Makefile
arch/arm64/include/asm/cacheflush.h
drivers/mmc/host/sdhci.c
drivers/usb/dwc3/ep0.c
drivers/usb/dwc3/gadget.c
kernel/module.c
sound/core/pcm_compat.c
CRs-Fixed: 1010239
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Change-Id: I41a28636fc9ad91f9d979b191784609476294cdf
Workqueue stalls can happen from a variety of usage bugs such as
missing WQ_MEM_RECLAIM flag or concurrency managed work item
indefinitely staying RUNNING. These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.
To alleviate the situation, this patch implements workqueue lockup
detector. It periodically monitors all worker_pools periodically and,
if any pool failed to make forward progress longer than the threshold
duration, triggers warning and dumps workqueue state as follows.
BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
workqueue events_power_efficient: flags=0x80
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
pending: check_lifetime, neigh_periodic_work
workqueue cgroup_pidlist_destroy: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
pending: cgroup_pidlist_destroy_work_fn
...
The detection mechanism is controller through kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.
v2: Decoupled from softlockup control knobs.
CRs-Fixed: 1007459
Change-Id: Id7dfbbd2701128a942b1bcac2299e07a66db8657
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Git-commit: 82607adcf9cdf40fb7b5331269780c8f70ec6e35
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Data corruptions in the kernel often end up in system crashes that
are easier to debug closer to the time of detection. Specifically,
if we do not panic immediately after lock or list corruptions have been
detected, the problem context is lost in the ensuing system mayhem.
Add support for allowing system crash immediately after such corruptions
are detected. The CONFIG option controls the enabling/disabling of the
feature.
Change-Id: I9b2eb62da506a13007acff63e85e9515145909ff
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[abhimany: minor merge conflict resolution]
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
commit 041bd12e272c53a35c54c13875839bcb98c999ce upstream.
This reverts commit 874bbfe600.
Workqueue used to implicity guarantee that work items queued without
explicit CPU specified are put on the local CPU. Recent changes in
timer broke the guarantee and led to vmstat breakage which was fixed
by 176bed1de5 ("vmstat: explicitly schedule per-cpu work on the CPU
we need it to run on").
vmstat is the most likely to expose the issue and it's quite possible
that there are other similar problems which are a lot more difficult
to trigger. As a preventive measure, 874bbfe600 ("workqueue: make
sure delayed work run in local cpu") was applied to restore the local
CPU guarnatee. Unfortunately, the change exposed a bug in timer code
which got fixed by 22b886dd10 ("timers: Use proper base migration in
add_timer_on()"). Due to code restructuring, the commit couldn't be
backported beyond certain point and stable kernels which only had
874bbfe600 started crashing.
The local CPU guarantee was accidental more than anything else and we
want to get rid of it anyway. As, with the vmstat case fixed,
874bbfe600 is causing more problems than it's fixing, it has been
decided to take the chance and officially break the guarantee by
reverting the commit. A debug feature will be added to force foreign
CPU assignment to expose cases relying on the guarantee and fixes for
the individual cases will be backported to stable as necessary.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 874bbfe600 ("workqueue: make sure delayed work run in local cpu")
Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Shaohua Li <shli@fb.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6e022f1d207a161cd88e08ef0371554680ffc46 upstream.
When looking up the pool_workqueue to use for an unbound workqueue,
workqueue assumes that the target CPU is always bound to a valid NUMA
node. However, currently, when a CPU goes offline, the mapping is
destroyed and cpu_to_node() returns NUMA_NO_NODE.
This has always been broken but hasn't triggered often enough before
874bbfe600 ("workqueue: make sure delayed work run in local cpu").
After the commit, workqueue forcifully assigns the local CPU for
delayed work items without explicit target CPU to fix a different
issue. This widens the window where CPU can go offline while a
delayed work item is pending causing delayed work items dispatched
with target CPU set to an already offlined CPU. The resulting
NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
NULL pool_workqueue and thus crash.
While 874bbfe600 has been reverted for a different reason making the
bug less visible again, it can still happen. Fix it by mapping
NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
This is a temporary workaround. The long term solution is keeping CPU
-> NODE mapping stable across CPU off/online cycles which is being
worked on.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull workqueue update from Tejun Heo:
"This pull request contains one patch to make an unbound worker pool
allocated from the NUMA node containing it if such node exists. As
unbound worker pools are node-affine by default, this makes most pools
allocated on the right node"
* 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Allocate the unbound pool using local node memory
Currently, get_unbound_pool() uses kzalloc() to allocate the
worker pool. Actually, we can use the right node to do the
allocation, achieving local memory access.
This patch selects target node first, and uses kzalloc_node()
instead.
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
My system keeps crashing with below message. vmstat_update() schedules a delayed
work in current cpu and expects the work runs in the cpu.
schedule_delayed_work() is expected to make delayed work run in local cpu. The
problem is timer can be migrated with NO_HZ. __queue_work() queues work in
timer handler, which could run in a different cpu other than where the delayed
work is scheduled. The end result is the delayed work runs in different cpu.
The patch makes __queue_delayed_work records local cpu earlier. Where the timer
runs doesn't change where the work runs with the change.
[ 28.010131] ------------[ cut here ]------------
[ 28.010609] kernel BUG at ../mm/vmstat.c:1392!
[ 28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[ 28.011860] Modules linked in:
[ 28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G W4.3.0-rc3+ #634
[ 28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[ 28.014160] Workqueue: events vmstat_update
[ 28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000
[ 28.015445] RIP: 0010:[<ffffffff8115f921>] [<ffffffff8115f921>]vmstat_update+0x31/0x80
[ 28.016282] RSP: 0018:ffff8800ba42fd80 EFLAGS: 00010297
[ 28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX:0000000000000000
[ 28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI:ffffffff81f4df8d
[ 28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09:0000000000000000
[ 28.019169] R10: 0000000000000000 R11: 0000000000000121 R12:ffff8800baa9f640
[ 28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15:0000000000000000
[ 28.020071] FS: 0000000000000000(0000) GS:ffff88011a800000(0000)knlGS:0000000000000000
[ 28.020071] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4:00000000000006f0
[ 28.020071] Stack:
[ 28.020071] ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00ffffffff8106bd88
[ 28.020071] ffffffff8106bd0b 0000000000000096 0000000000000000ffffffff82f9b1e8
[ 28.020071] ffffffff829f0b10 0000000000000000 ffffffff81f18460ffff88011a81e340
[ 28.020071] Call Trace:
[ 28.020071] [<ffffffff8106bd88>] process_one_work+0x1c8/0x540
[ 28.020071] [<ffffffff8106bd0b>] ? process_one_work+0x14b/0x540
[ 28.020071] [<ffffffff8106c214>] worker_thread+0x114/0x460
[ 28.020071] [<ffffffff8106c100>] ? process_one_work+0x540/0x540
[ 28.020071] [<ffffffff81071bf8>] kthread+0xf8/0x110
[ 28.020071] [<ffffffff81071b00>] ?kthread_create_on_node+0x200/0x200
[ 28.020071] [<ffffffff81a6522f>] ret_from_fork+0x3f/0x70
[ 28.020071] [<ffffffff81071b00>] ?kthread_create_on_node+0x200/0x200
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v2.6.31+
Pull workqueue updates from Tejun Heo:
"Only three trivial changes for workqueue this time - doc, MAINTAINERS
and EXPORT_SYMBOL updates"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix some docbook warnings
workqueue: Make flush_workqueue() available again to non GPL modules
workqueue: add myself as a dedicated reviwer
Pull scheduler updates from Ingo Molnar:
"The biggest change in this cycle is the rewrite of the main SMP load
balancing metric: the CPU load/utilization. The main goal was to make
the metric more precise and more representative - see the changelog of
this commit for the gory details:
9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")
It is done in a way that significantly reduces complexity of the code:
5 files changed, 249 insertions(+), 494 deletions(-)
and the performance testing results are encouraging. Nevertheless we
need to keep an eye on potential regressions, since this potentially
affects every SMP workload in existence.
This work comes from Yuyang Du.
Other changes:
- SCHED_DL updates. (Andrea Parri)
- Simplify architecture callbacks by removing finish_arch_switch().
(Peter Zijlstra et al)
- cputime accounting: guarantee stime + utime == rtime. (Peter
Zijlstra)
- optimize idle CPU wakeups some more - inspired by Facebook server
loads. (Mike Galbraith)
- stop_machine fixes and updates. (Oleg Nesterov)
- Introduce the 'trace_sched_waking' tracepoint. (Peter Zijlstra)
- sched/numa tweaks. (Srikar Dronamraju)
- misc fixes and small cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
sched/deadline: Fix comment in enqueue_task_dl()
sched/deadline: Fix comment in push_dl_tasks()
sched: Change the sched_class::set_cpus_allowed() calling context
sched: Make sched_class::set_cpus_allowed() unconditional
sched: Fix a race between __kthread_bind() and sched_setaffinity()
sched: Ensure a task has a non-normalized vruntime when returning back to CFS
sched/numa: Fix NUMA_DIRECT topology identification
tile: Reorganize _switch_to()
sched, sparc32: Update scheduler comments in copy_thread()
sched: Remove finish_arch_switch()
sched, tile: Remove finish_arch_switch
sched, sh: Fold finish_arch_switch() into switch_to()
sched, score: Remove finish_arch_switch()
sched, avr32: Remove finish_arch_switch()
sched, MIPS: Get rid of finish_arch_switch()
sched, arm: Remove finish_arch_switch()
sched/fair: Clean up load average references
sched/fair: Provide runnable_load_avg back to cfs_rq
sched/fair: Remove task and group entity load when they are dead
sched/fair: Init cfs_rq's sched_entity load average
...
Commit 37b1ef31a5 ("workqueue: move
flush_scheduled_work() to workqueue.h") moved the exported non GPL
flush_scheduled_work() from a function to an inline wrapper.
Unfortunately, it directly calls flush_workqueue() which is a GPL function.
This has the effect of changing the licensing requirement for this function
and makes it unavailable to non GPL modules.
See commit ad7b1f841f ("workqueue: Make
schedule_work() available again to non GPL modules") for precedent.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for
consistency with the WARN() series of macros. This also requires
inverting the sense of the conditional, which this commit also does.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Pull module updates from Rusty Russell:
"Main excitement here is Peter Zijlstra's lockless rbtree optimization
to speed module address lookup. He found some abusers of the module
lock doing that too.
A little bit of parameter work here too; including Dan Streetman's
breaking up the big param mutex so writing a parameter can load
another module (yeah, really). Unfortunately that broke the usual
suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were
appended too"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits)
modules: only use mod->param_lock if CONFIG_MODULES
param: fix module param locks when !CONFIG_SYSFS.
rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE()
module: add per-module param_lock
module: make perm const
params: suppress unused variable error, warn once just in case code changes.
modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'.
kernel/module.c: avoid ifdefs for sig_enforce declaration
kernel/workqueue.c: remove ifdefs over wq_power_efficient
kernel/params.c: export param_ops_bool_enable_only
kernel/params.c: generalize bool_enable_only
kernel/module.c: use generic module param operaters for sig_enforce
kernel/params: constify struct kernel_param_ops uses
sysfs: tightened sysfs permission checks
module: Rework module_addr_{min,max}
module: Use __module_address() for module_address_lookup()
module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING
module: Optimize __module_address() using a latched RB-tree
rbtree: Implement generic latch_tree
seqlock: Introduce raw_read_seqcount_latch()
...
flush_scheduled_work() is just a simple call to flush_work().
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reading to wq->unbound_attrs requires protection of either wq_pool_mutex
or wq->mutex, and wq_sysfs_prep_attrs() is called with wq_pool_mutex held,
so we don't need to grab wq->mutex here.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
This pre-declaration was unneeded since a previous refactor patch
6ba94429c8 ("workqueue: Reorder sysfs code").
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Current modification to attrs via sysfs is not fully synchronized.
Process A (change cpumask) | Process B (change numa affinity)
wq_cpumask_store() |
wq_sysfs_prep_attrs() |
| apply_workqueue_attrs()
apply_workqueue_attrs() |
It results that the Process B's operation is totally reverted
without any notification, it is a buggy behavior. So this patch
moves wq_sysfs_prep_attrs() into the protection under wq_pool_mutex
to ensure attrs changes are properly synchronized.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Applying attrs requires two locks: get_online_cpus() and wq_pool_mutex,
and this code is duplicated at two places (apply_workqueue_attrs() and
workqueue_set_unbound_cpumask()). So we separate out this locking
code into apply_wqattrs_[un]lock() and do a minor refactor on
apply_workqueue_attrs().
The apply_wqattrs_[un]lock() will be also used on later patch for
ensuring attrs changes are properly synchronized.
tj: minor updates to comments
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
wq_update_unbound_numa() is known be called with wq_pool_mutex held.
But wq_update_unbound_numa() requests wq->mutex before reading
wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq. But these fields
were changed to be allowed being read with wq_pool_mutex held. So we
simply remove the mutex_lock(&wq->mutex).
Without the dependence on the the mutex_lock(&wq->mutex), the test
of wq->unbound_attrs->no_numa can also be moved upward.
The old code need a long comment to describe the stableness of
@wq->unbound_attrs which is also guaranteed by wq_pool_mutex now,
so we don't need this such comment.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Current wq_pool_mutex doesn't proctect the attrs-installation, it results
that ->unbound_attrs, ->numa_pwq_tbl[] and ->dfl_pwq can only be accessed
under wq->mutex and causes some inconveniences. Example, wq_update_unbound_numa()
has to acquire wq->mutex before fetching the wq->unbound_attrs->no_numa
and the old_pwq.
attrs-installation is a short operation, so this change will no cause any
latency for other operations which also acquire the wq_pool_mutex.
The only unprotected attrs-installation code is in apply_workqueue_attrs(),
so this patch touches code less than comments.
It is also a preparation patch for next several patches which read
wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq with
only wq_pool_mutex held.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Allow to modify the low-level unbound workqueues cpumask through
sysfs. This is performed by traversing the entire workqueue list
and calling apply_wqattrs_prepare() on the unbound workqueues
with the new low level mask. Only after all the preparation are done,
we commit them all together.
Ordered workqueues are ignored from the low level unbound workqueue
cpumask, it will be handled in near future.
All the (default & per-node) pwqs are mandatorily controlled by
the low level cpumask. If the user configured cpumask doesn't overlap
with the low level cpumask, the low level cpumask will be used for the
wq instead.
The comment of wq_calc_node_cpumask() is updated and explicitly
requires that its first argument should be the attrs of the default
pwq.
The default wq_unbound_cpumask is cpu_possible_mask. The workqueue
subsystem doesn't know its best default value, let the system manager
or the other subsystem set it when needed.
Changed from V8:
merge the calculating code for the attrs of the default pwq together.
minor change the code&comments for saving the user configured attrs.
remove unnecessary list_del().
minor update the comment of wq_calc_node_cpumask().
update the comment of workqueue_set_unbound_cpumask();
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Create a cpumask that limits the affinity of all unbound workqueues.
This cpumask is controlled through a file at the root of the workqueue
sysfs directory.
It works on a lower-level than the per WQ_SYSFS workqueues cpumask files
such that the effective cpumask applied for a given unbound workqueue is
the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
the new /sys/devices/virtual/workqueue/cpumask file.
This patch implements the basic infrastructure and the read interface.
wq_unbound_cpumask is initially set to cpu_possible_mask.
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Current apply_workqueue_attrs() includes pwqs-allocation and pwqs-installation,
so when we batch multiple apply_workqueue_attrs()s as a transaction, we can't
ensure the transaction must succeed or fail as a complete unit.
To solve this, we split apply_workqueue_attrs() into three stages.
The first stage does the preparation: allocation memory, pwqs.
The second stage does the attrs-installaion and pwqs-installation.
The third stage frees the allocated memory and (old or unused) pwqs.
As the result, batching multiple apply_workqueue_attrs()s can
succeed or fail as a complete unit:
1) batch do all the first stage for all the workqueues
2) only commit all when all the above succeed.
This patch is a preparation for the next patch ("Allow modifying low level
unbound workqueue cpumask") which will do a multiple apply_workqueue_attrs().
The patch doesn't have functionality changed except two minor adjustment:
1) free_unbound_pwq() for the error path is removed, we use the
heavier version put_pwq_unlocked() instead since the error path
is rare. this adjustment simplifies the code.
2) the memory-allocation is also moved into wq_pool_mutex.
this is needed to avoid to do the further splitting.
tj: minor updates to comments.
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The sysfs code usually belongs to the botom of the file since it deals
with high level objects. In the workqueue code it's misplaced and such
that we'll need to work around functions references to allow the sysfs
code to call APIs like apply_workqueue_attrs().
Lets move that block further in the file, almost the botom.
And declare workqueue_sysfs_unregister() just before destroy_workqueue()
which reference it.
tj: Moved workqueue_sysfs_unregister() forward declaration where other
forward declarations are.
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Workqueues are used extensively throughout the kernel but sometimes
it's difficult to debug stalls involving work items because visibility
into its inner workings is fairly limited. Although sysrq-t task dump
annotates each active worker task with the information on the work
item being executed, it is challenging to find out which work items
are pending or delayed on which queues and how pools are being
managed.
This patch implements show_workqueue_state() which dumps all busy
workqueues and pools and is called from the sysrq-t handler. At the
end of sysrq-t dump, something like the following is printed.
Showing busy workqueues and worker pools:
...
workqueue filler_wq: flags=0x0
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256
in-flight: 491:filler_workfn, 507:filler_workfn
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
in-flight: 501:filler_workfn
pending: filler_workfn
...
workqueue test_wq: flags=0x8
pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1
in-flight: 510(RESCUER):test_workfn BAR(69) BAR(500)
delayed: test_workfn1 BAR(492), test_workfn2
...
pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 137
pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 469
pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 idle: 16
pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 62
The above shows that test_wq is executing test_workfn() on pid 510
which is the rescuer and also that there are two tasks 69 and 500
waiting for the work item to finish in flush_work(). As test_wq has
max_active of 1, there are two work items for test_workfn1() and
test_workfn2() which are delayed till the current work item is
finished. In addition, pid 492 is flushing test_workfn1().
The work item for test_workfn() is being executed on pwq of pool 2
which is the normal priority per-cpu pool for CPU 1. The pool has
three workers, two of which are executing filler_workfn() for
filler_wq and the last one is assuming the manager role trying to
create more workers.
This extra workqueue state dump will hopefully help chasing down hangs
involving workqueues.
v3: cpulist_pr_cont() replaced with "%*pbl" printf formatting.
v2: As suggested by Andrew, minor formatting change in pr_cont_work(),
printk()'s replaced with pr_info()'s, and cpumask printing now
uses cpulist_pr_cont().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
Add wq_barrier->task and worker_pool->manager to keep track of the
flushing task and pool manager respectively. These are purely
informational and will be used to implement sysrq dump of workqueues.
Signed-off-by: Tejun Heo <tj@kernel.org>
The workqueues list is protected by wq_pool_mutex and a workqueue and
its subordinate data structures are freed directly on destruction. We
want to add the ability dump workqueues from a sysrq callback which
requires walking all workqueues without grabbing wq_pool_mutex. This
patch makes freeing of workqueues RCU protected and makes the
workqueues list walkable while holding RCU read lock.
Note that pool_workqueues and pools are already sched-RCU protected.
For consistency, workqueues are also protected with sched-RCU.
While at it, reverse the workqueues list so that a workqueue which is
created earlier comes before. The order of the list isn't significant
functionally but this makes the planned sysrq dump list system
workqueues first.
Signed-off-by: Tejun Heo <tj@kernel.org>
cancel[_delayed]_work_sync() are implemented using
__cancel_work_timer() which grabs the PENDING bit using
try_to_grab_pending() and then flushes the work item with PENDING set
to prevent the on-going execution of the work item from requeueing
itself.
try_to_grab_pending() can always grab PENDING bit without blocking
except when someone else is doing the above flushing during
cancelation. In that case, try_to_grab_pending() returns -ENOENT. In
this case, __cancel_work_timer() currently invokes flush_work(). The
assumption is that the completion of the work item is what the other
canceling task would be waiting for too and thus waiting for the same
condition and retrying should allow forward progress without excessive
busy looping
Unfortunately, this doesn't work if preemption is disabled or the
latter task has real time priority. Let's say task A just got woken
up from flush_work() by the completion of the target work item. If,
before task A starts executing, task B gets scheduled and invokes
__cancel_work_timer() on the same work item, its try_to_grab_pending()
will return -ENOENT as the work item is still being canceled by task A
and flush_work() will also immediately return false as the work item
is no longer executing. This puts task B in a busy loop possibly
preventing task A from executing and clearing the canceling state on
the work item leading to a hang.
task A task B worker
executing work
__cancel_work_timer()
try_to_grab_pending()
set work CANCELING
flush_work()
block for work completion
completion, wakes up A
__cancel_work_timer()
while (forever) {
try_to_grab_pending()
-ENOENT as work is being canceled
flush_work()
false as work is no longer executing
}
This patch removes the possible hang by updating __cancel_work_timer()
to explicitly wait for clearing of CANCELING rather than invoking
flush_work() after try_to_grab_pending() fails with -ENOENT.
Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
v3: bit_waitqueue() can't be used for work items defined in vmalloc
area. Switched to custom wake function which matches the target
work item and exclusive wait and wakeup.
v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
the target bit waitqueue has wait_bit_queue's on it. Use
DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu
Vizoso.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Rabin Vincent <rabin.vincent@axis.com>
Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
Cc: stable@vger.kernel.org
Tested-by: Jesper Nilsson <jesper.nilsson@axis.com>
Tested-by: Rabin Vincent <rabin.vincent@axis.com>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A worker_pool's forward progress is guaranteed by the fact that the
last idle worker assumes the manager role to create more workers and
summon the rescuers if creating workers doesn't succeed in timely
manner before proceeding to execute work items.
This manager role is implemented in manage_workers(), which indicates
whether the worker may proceed to work item execution with its return
value. This is necessary because multiple workers may contend for the
manager role, and, if there already is a manager, others should
proceed to work item execution.
Unfortunately, the function also indicates that the worker may proceed
to work item execution if need_to_create_worker() is false at the head
of the function. need_to_create_worker() tests the following
conditions.
pending work items && !nr_running && !nr_idle
The first and third conditions are protected by pool->lock and thus
won't change while holding pool->lock; however, nr_running can change
asynchronously as other workers block and resume and while it's likely
to be zero, as someone woke this worker up in the first place, some
other workers could have become runnable inbetween making it non-zero.
If this happens, manage_worker() could return false even with zero
nr_idle making the worker, the last idle one, proceed to execute work
items. If then all workers of the pool end up blocking on a resource
which can only be released by a work item which is pending on that
pool, the whole pool can deadlock as there's no one to create more
workers or summon the rescuers.
This patch fixes the problem by removing the early exit condition from
maybe_create_worker() and making manage_workers() return false iff
there's already another manager, which ensures that the last worker
doesn't start executing work items.
We can leave the early exit condition alone and just ignore the return
value but the only reason it was put there is because the
manage_workers() used to perform both creations and destructions of
workers and thus the function may be invoked while the pool is trying
to reduce the number of workers. Now that manage_workers() is called
only when more workers are needed, the only case this early exit
condition is triggered is rare race conditions rendering it pointless.
Tested with simulated workload and modified workqueue code which
trigger the pool deadlock reliably without this patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Eric Sandeen <sandeen@sandeen.net>
Link: http://lkml.kernel.org/g/54B019F4.8030009@sandeen.net
Cc: Dave Chinner <david@fromorbit.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: stable@vger.kernel.org
When there is serious memory pressure, all workers in a pool could be
blocked, and a new thread cannot be created because it requires memory
allocation.
In this situation a WQ_MEM_RECLAIM workqueue will wake up the
rescuer thread to do some work.
The rescuer will only handle requests that are already on ->worklist.
If max_requests is 1, that means it will handle a single request.
The rescuer will be woken again in 100ms to handle another max_requests
requests.
I've seen a machine (running a 3.0 based "enterprise" kernel) with
thousands of requests queued for xfslogd, which has a max_requests of
1, and is needed for retiring all 'xfs' write requests. When one of
the worker pools gets into this state, it progresses extremely slowly
and possibly never recovers (only waited an hour or two).
With this patch we leave a pool_workqueue on mayday list
until it is clearly no longer in need of assistance. This allows
all requests to be handled in a timely fashion.
We keep each pool_workqueue on the mayday list until
need_to_create_worker() is false, and no work for this workqueue is
found in the pool.
I have tested this in combination with a (hackish) patch which forces
all work items to be handled by the rescuer thread. In that context
it significantly improves performance. A similar patch for a 3.0
kernel significantly improved performance on a heavy work load.
Thanks to Jan Kara for some design ideas, and to Dongsu Park for
some comments and testing.
tj: Inverted the lock order between wq_mayday_lock and pool->lock with
a preceding patch and simplified this patch. Added comment and
updated changelog accordingly. Dongsu spotted missing get_pwq()
in the simplified code.
Cc: Dongsu Park <dongsu.park@profitbricks.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently, pool->lock nests inside pool->lock. There's no inherent
reason for this order. The only place where the two locks are held
together is pool_mayday_timeout() and it just got decided that way.
This nesting order turns out to complicate things with the planned
rescuer_thread() update. Let's invert them. This doesn't cause any
behavior differences.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Dongsu Park <dongsu.park@profitbricks.com>
rescuer_thread() caches &rescuer->scheduled in a local variable
scheduled for convenience. There's one WARN_ON_ONCE() which was using
&rescuer->scheduled directly. Replace it with the local variable.
This patch causes no functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tidy up and use cond_resched_rcu_qs when calling cond_resched and
reporting potential quiescent state to RCU. Splitting this change in
this way allows easy backporting to -stable for kernel versions not
having cond_resched_rcu_qs().
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Pull percpu updates from Tejun Heo:
- Major reorganization of percpu header files which I think makes
things a lot more readable and logical than before.
- percpu-refcount is updated so that it requires explicit destruction
and can be reinitialized if necessary. This was pulled into the
block tree to replace the custom percpu refcnting implemented in
blk-mq.
- In the process, percpu and percpu-refcount got cleaned up a bit
* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits)
percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero()
percpu-refcount: require percpu_ref to be exited explicitly
percpu-refcount: use unsigned long for pcpu_count pointer
percpu-refcount: add helpers for ->percpu_count accesses
percpu-refcount: one bit is enough for REF_STATUS
percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc()
workqueue: stronger test in process_one_work()
workqueue: clear POOL_DISASSOCIATED in rebind_workers()
percpu: Use ALIGN macro instead of hand coding alignment calculation
percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations
percpu: preffity percpu header files
percpu: use raw_cpu_*() to define __this_cpu_*()
percpu: reorder macros in percpu header files
percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h
percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h
percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops
percpu: reorganize include/linux/percpu-defs.h
percpu: move accessors from include/linux/percpu.h to percpu-defs.h
percpu: include/asm-generic/percpu.h should contain only arch-overridable parts
percpu: introduce arch_raw_cpu_ptr()
...
Pull workqueue updates from Tejun Heo:
"Lai has been doing a lot of cleanups of workqueue and kthread_work.
No significant behavior change. Just a lot of cleanups all over the
place. Some are a bit invasive but overall nothing too dangerous"
* 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
kthread_work: remove the unused wait_queue_head
kthread_work: wake up worker only when the worker is idle
workqueue: use nr_node_ids instead of wq_numa_tbl_len
workqueue: remove the misnamed out_unlock label in get_unbound_pool()
workqueue: remove the stale comment in pwq_unbound_release_workfn()
workqueue: move rescuer pool detachment to the end
workqueue: unfold start_worker() into create_worker()
workqueue: remove @wakeup from worker_set_flags()
workqueue: remove an unneeded UNBOUND test before waking up the next worker
workqueue: wake regular worker if need_more_worker() when rescuer leave the pool
workqueue: alloc struct worker on its local node
workqueue: reuse the already calculated pwq in try_to_grab_pending()
workqueue: stronger test in process_one_work()
workqueue: clear POOL_DISASSOCIATED in rebind_workers()
workqueue: sanity check pool->cpu in wq_worker_sleeping()
workqueue: clear leftover flags when detached
workqueue: remove useless WARN_ON_ONCE()
workqueue: use schedule_timeout_interruptible() instead of open code
workqueue: remove the empty check in too_many_workers()
workqueue: use "pool->cpu < 0" to stand for an unbound pool
They are the same and nr_node_ids is provided by the memory subsystem.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
After the locking was moved up to the caller of the get_unbound_pool(),
out_unlock label doesn't need to do any unlock operation and the name
became bad, so we just remove this label, and the only usage-site
"goto out_unlock" is subsituted to "return pool".
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
In 75ccf5950f ("workqueue: prepare flush_workqueue() for dynamic
creation and destrucion of unbound pool_workqueues"), a comment
about the synchronization for the pwq in pwq_unbound_release_workfn()
was added. The comment claimed the flush_mutex wasn't strictly
necessary, it was correct in that time, due to the pwq was protected
by workqueue_lock.
But it is incorrect now since the wq->flush_mutex was renamed to
wq->mutex and workqueue_lock was removed, the wq->mutex is strictly
needed. But the comment was miss-updated when the synchronization
was changed.
This patch removes the incorrect comments and doesn't add any new
comment to explain why wq->mutex is needed here, which is definitely
obvious and wq->pwqs_node has "WQ" notation in its definition which is
better comment.
The old commit mentioned above also introduced a comment in link_pwq()
about the synchronization. This comment is also removed in this patch
since the whole link_pwq() is proteced by wq->mutex.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
In 51697d3939 ("workqueue: use generic attach/detach routine for
rescuers"), The rescuer detaches itself from the pool before put_pwq()
so that the put_unbound_pool() will not destroy the rescuer-attached
pool.
It is unnecessary. worker_detach_from_pool() can be used as the last
statement to access to the pool just like the regular workers,
put_unbound_pool() will wait for it to detach and then free the pool.
So we move the worker_detach_from_pool() down, make it coincide with
the regular workers.
tj: Minor description update.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Simply unfold the code of start_worker() into create_worker() and
remove the original start_worker() and create_and_start_worker().
The only trade-off is the introduced overhead that the pool->lock
is released and regrabbed after the newly worker is started.
The overhead is acceptible since the manager is slow path.
And because this new locking behavior, the newly created worker
may grab the lock earlier than the manager and go to process
work items. In this case, the recheck need_to_create_worker() may be
true as expected and the manager goes to restart which is the
correct behavior.
tj: Minor updates to description and comments.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
worker_set_flags() has only two callers, each specifying %true and
%false for @wakeup. Let's push the wake up to the caller and remove
@wakeup from worker_set_flags(). The caller can use the following
instead if wakeup is necessary:
worker_set_flags();
if (need_more_worker(pool))
wake_up_worker(pool);
This makes the code simpler. This patch doesn't introduce behavior
changes.
tj: Updated description and comments.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
In process_one_work():
if ((worker->flags & WORKER_UNBOUND) && need_more_worker(pool))
wake_up_worker(pool);
the first test is unneeded. Even if the first test is removed, it
doesn't affect the wake-up logic for WORKER_UNBOUND, and it will not
introduce any useless wake-ups for normal per-cpu workers since
nr_running is always >= 1. It will introduce useless/redundant
wake-ups for CPU_INTENSIVE, but this case is rare and the next patch
will also remove this redundant wake-up.
tj: Minor updates to the description and comment.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We don't need to wake up regular worker when nr_running==1,
so need_more_worker() is sufficient here.
And need_more_worker() gives us better readability due to the name of
"keep_working()" implies the rescuer should keep working now but
the rescuer is actually leaving.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>