38 Commits

Author SHA1 Message Date
Blagovest Kolenichev
dc1d03db8d Merge android-4.14.114 (c680586) into msm-4.14
* refs/heads/tmp-c680586:
  dm: Restore reverted changes
  Linux 4.14.114
  kernel/sysctl.c: fix out-of-bounds access when setting file-max
  Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
  i2c-hid: properly terminate i2c_hid_dmi_desc_override_table[] array
  xfs: hold xfs_buf locked between shortform->leaf conversion and the addition of an attribute
  xfs: add the ability to join a held buffer to a defer_ops
  iomap: report collisions between directio and buffered writes to userspace
  tools include: Adopt linux/bits.h
  percpu: stop printing kernel addresses
  ALSA: info: Fix racy addition/deletion of nodes
  mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
  device_cgroup: fix RCU imbalance in error case
  sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
  Revert "kbuild: use -Oz instead of -Os when using clang"
  net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c
  net: IP6 defrag: use rbtrees for IPv6 defrag
  ipv6: remove dependency of nf_defrag_ipv6 on ipv6 module
  net: IP defrag: encapsulate rbtree defrag code into callable functions
  ipv6: frags: fix a lockdep false positive
  tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
  modpost: file2alias: check prototype of handler
  modpost: file2alias: go back to simple devtable lookup
  mmc: sdhci: Handle auto-command errors
  mmc: sdhci: Rename SDHCI_ACMD12_ERR and SDHCI_INT_ACMD12ERR
  mmc: sdhci: Fix data command CRC error handling
  crypto: crypto4xx - properly set IV after de- and encrypt
  x86/speculation: Prevent deadlock on ssb_state::lock
  perf/x86: Fix incorrect PEBS_REGS
  x86/cpu/bugs: Use __initconst for 'const' init data
  perf/x86/amd: Add event map for AMD Family 17h
  mac80211: do not call driver wake_tx_queue op during reconfig
  rt2x00: do not increment sequence number while re-transmitting
  kprobes: Fix error check when reusing optimized probes
  kprobes: Mark ftrace mcount handler functions nokprobe
  x86/kprobes: Verify stack frame on kretprobe
  arm64: futex: Restore oldval initialization to work around buggy compilers
  crypto: x86/poly1305 - fix overflow during partial reduction
  coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
  Revert "svm: Fix AVIC incomplete IPI emulation"
  Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"
  scsi: core: set result when the command cannot be dispatched
  ALSA: core: Fix card races between register and disconnect
  ALSA: hda/realtek - add two more pin configuration sets to quirk table
  staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
  staging: comedi: ni_usb6501: Fix use of uninitialized mutex
  staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
  staging: comedi: vmk80xx: Fix use of uninitialized semaphore
  io: accel: kxcjk1013: restore the range after resume.
  iio: core: fix a possible circular locking dependency
  iio: adc: at91: disable adc channel interrupt in timeout case
  iio: Fix scan mask selection
  iio: dac: mcp4725: add missing powerdown bits in store eeprom
  iio: ad_sigma_delta: select channel when reading register
  iio: cros_ec: Fix the maths for gyro scale calculation
  iio/gyro/bmg160: Use millidegrees for temperature scale
  iio: gyro: mpu3050: fix chip ID reading
  staging: iio: ad7192: Fix ad7193 channel address
  Staging: iio: meter: fixed typo
  KVM: x86: svm: make sure NMI is injected after nmi_singlestep
  KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
  CIFS: keep FileInfo handle live during oplock break
  net: thunderx: don't allow jumbo frames with XDP
  net: thunderx: raise XDP MTU to 1508
  ipv4: ensure rcu_read_lock() in ipv4_link_failure()
  ipv4: recompile ip options in ipv4_link_failure
  vhost: reject zero size iova range
  team: set slave to promisc if team is already in promisc mode
  tcp: tcp_grow_window() needs to respect tcp_space()
  net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
  net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
  net: bridge: fix per-port af_packet sockets
  net: atm: Fix potential Spectre v1 vulnerabilities
  bonding: fix event handling for stacked bonds
  ANDROID: cuttlefish_defconfig: Enable CONFIG_XFRM_STATISTICS
  Linux 4.14.113
  appletalk: Fix compile regression
  mm: hide incomplete nr_indirectly_reclaimable in sysfs
  net: stmmac: Set dma ring length before enabling the DMA
  bpf: Fix selftests are changes for CVE 2019-7308
  bpf: fix sanitation rewrite in case of non-pointers
  bpf: do not restore dst_reg when cur_state is freed
  bpf: fix inner map masking to prevent oob under speculation
  bpf: fix sanitation of alu op with pointer / scalar type from different paths
  bpf: prevent out of bounds speculation on pointer arithmetic
  bpf: fix check_map_access smin_value test when pointer contains offset
  bpf: restrict unknown scalars of mixed signed bounds for unprivileged
  bpf: restrict stack pointer arithmetic for unprivileged
  bpf: restrict map value pointer arithmetic for unprivileged
  bpf: enable access to ax register also from verifier rewrite
  bpf: move tmp variable into ax register in interpreter
  bpf: move {prev_,}insn_idx into verifier env
  bpf: fix stack state printing in verifier log
  bpf: fix verifier NULL pointer dereference
  bpf: fix verifier memory leaks
  bpf: reduce verifier memory consumption
  dm: disable CRYPTO_TFM_REQ_MAY_SLEEP to fix a GFP_KERNEL recursion deadlock
  bpf: fix use after free in bpf_evict_inode
  include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
  lib/div64.c: off by one in shift
  appletalk: Fix use-after-free in atalk_proc_exit
  drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI)
  ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t
  drm/nouveau/volt/gf117: fix speedo readout register
  coresight: cpu-debug: Support for CA73 CPUs
  Revert "ACPI / EC: Remove old CLEAR_ON_RESUME quirk"
  crypto: axis - fix for recursive locking from bottom half
  drm/panel: panel-innolux: set display off in innolux_panel_unprepare
  lkdtm: Add tests for NULL pointer dereference
  lkdtm: Print real addresses
  soc/tegra: pmc: Drop locking from tegra_powergate_is_powered()
  iommu/dmar: Fix buffer overflow during PCI bus notification
  crypto: sha512/arm - fix crash bug in Thumb2 build
  crypto: sha256/arm - fix crash bug in Thumb2 build
  kernel: hung_task.c: disable on suspend
  cifs: fallback to older infolevels on findfirst queryinfo retry
  compiler.h: update definition of unreachable()
  KVM: nVMX: restore host state in nested_vmx_vmexit for VMFail
  ACPI / SBS: Fix GPE storm on recent MacBookPro's
  usbip: fix vhci_hcd controller counting
  ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms
  HID: i2c-hid: override HID descriptors for certain devices
  media: au0828: cannot kfree dev before usb disconnect
  powerpc/pseries: Remove prrn_work workqueue
  serial: uartps: console_setup() can't be placed to init section
  netfilter: xt_cgroup: shrink size of v2 path
  f2fs: fix to do sanity check with current segment number
  9p locks: add mount option for lock retry interval
  9p: do not trust pdu content for stat item size
  rsi: improve kernel thread handling to fix kernel panic
  gpio: pxa: handle corner case of unprobed device
  ext4: prohibit fstrim in norecovery mode
  fix incorrect error code mapping for OBJECTID_NOT_FOUND
  x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error
  iommu/vt-d: Check capability before disabling protected memory
  drm/nouveau/debugfs: Fix check of pm_runtime_get_sync failure
  x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors
  x86/hpet: Prevent potential NULL pointer dereference
  irqchip/mbigen: Don't clear eventid when freeing an MSI
  perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test()
  perf tests: Fix memory leak by expr__find_other() in test__expr()
  perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test
  perf evsel: Free evsel->counts in perf_evsel__exit()
  perf hist: Add missing map__put() in error case
  perf top: Fix error handling in cmd_top()
  perf build-id: Fix memory leak in print_sdt_events()
  perf config: Fix a memory leak in collect_config()
  perf config: Fix an error in the config template documentation
  perf list: Don't forget to drop the reference to the allocated thread_map
  tools/power turbostat: return the exit status of a command
  x86/mm: Don't leak kernel addresses
  scsi: iscsi: flush running unbind operations when removing a session
  thermal/intel_powerclamp: fix truncated kthread name
  thermal/int340x_thermal: fix mode setting
  thermal/int340x_thermal: Add additional UUIDs
  thermal: bcm2835: Fix crash in bcm2835_thermal_debugfs
  thermal/intel_powerclamp: fix __percpu declaration of worker_data
  ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration
  mmc: davinci: remove extraneous __init annotation
  IB/mlx4: Fix race condition between catas error reset and aliasguid flows
  auxdisplay: hd44780: Fix memory leak on ->remove()
  ALSA: sb8: add a check for request_region
  ALSA: echoaudio: add a check for ioremap_nocache
  ext4: report real fs size after failed resize
  ext4: add missing brelse() in add_new_gdb_meta_bg()
  perf/core: Restore mmap record type correctly
  arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM
  ARC: u-boot args: check that magic number is correct
  ANDROID: cuttlefish_defconfig: Enable L2TP/PPTP
  ANDROID: Makefile: Properly resolve 4.14.112 merge
  Make arm64 serial port config compatible with crosvm
  Linux 4.14.112
  arm64: dts: rockchip: Fix vcc_host1_5v GPIO polarity on rk3328-rock64
  arm64: dts: rockchip: fix vcc_host1_5v pin assign on rk3328-rock64
  dm table: propagate BDI_CAP_STABLE_WRITES to fix sporadic checksum errors
  PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
  x86/perf/amd: Remove need to check "running" bit in NMI handler
  x86/perf/amd: Resolve NMI latency issues for active PMCs
  x86/perf/amd: Resolve race condition when disabling PMC
  xtensa: fix return_address
  sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
  xen: Prevent buffer overflow in privcmd ioctl
  arm64: backtrace: Don't bother trying to unwind the userspace stack
  arm64: dts: rockchip: fix rk3328 rgmii high tx error rate
  arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
  ARM: dts: at91: Fix typo in ISC_D0 on PC9
  ARM: dts: am335x-evm: Correct the regulators for the audio codec
  ARM: dts: am335x-evmsk: Correct the regulators for the audio codec
  virtio: Honour 'may_reduce_num' in vring_create_virtqueue
  genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
  genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
  block: fix the return errno for direct IO
  block: do not leak memory in bio_copy_user_iov()
  btrfs: prop: fix vanished compression property after failed set
  btrfs: prop: fix zstd compression parameter validation
  Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
  ASoC: fsl_esai: fix channel swap issue when stream starts
  include/linux/bitrev.h: fix constant bitrev
  drm/udl: add a release method and delay modeset teardown
  alarmtimer: Return correct remaining time
  parisc: regs_return_value() should return gpr28
  parisc: Detect QEMU earlier in boot process
  arm64: dts: rockchip: fix rk3328 sdmmc0 write errors
  hv_netvsc: Fix unwanted wakeup after tx_disable
  ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type
  ALSA: seq: Fix OOB-reads from strlcpy
  net: ethtool: not call vzalloc for zero sized memory request
  netns: provide pure entropy for net_hash_mix()
  net/sched: act_sample: fix divide by zero in the traffic path
  bnxt_en: Reset device on RX buffer errors.
  bnxt_en: Improve RX consumer index validity check.
  nfp: validate the return code from dev_queue_xmit()
  net/mlx5e: Add a lock on tir list
  net/mlx5e: Fix error handling when refreshing TIRs
  vrf: check accept_source_route on the original netdevice
  tcp: Ensure DCTCP reacts to losses
  sctp: initialize _pad of sockaddr_in before copying to user memory
  qmi_wwan: add Olicard 600
  openvswitch: fix flow actions reallocation
  net/sched: fix ->get helper of the matchall cls
  net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock().
  net/mlx5: Decrease default mr cache size
  net-gro: Fix GRO flush when receiving a GSO packet.
  kcm: switch order of device registration to fix a crash
  ipv6: sit: reset ip header pointer in ipip6_rcv
  ipv6: Fix dangling pointer when ipv6 fragment
  tty: ldisc: add sysctl to prevent autoloading of ldiscs
  tty: mark Siemens R3964 line discipline as BROKEN
  arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
  stating: ccree: revert "staging: ccree: fix leak of import() after init()"
  lib/string.c: implement a basic bcmp
  x86/vdso: Drop implicit common-page-size linker flag
  x86: vdso: Use $LD instead of $CC to link
  kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
  powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEM
  drm/i915/gvt: do not let pin count of shadow mm go negative
  x86/power: Make restore_processor_context() sane
  x86/power/32: Move SYSENTER MSR restoration to fix_processor_context()
  x86/power/64: Use struct desc_ptr for the IDT in struct saved_context
  x86/power: Fix some ordering bugs in __restore_processor_context()
  net: sfp: move sfp_register_socket call from sfp_remove to sfp_probe
  Revert "CHROMIUM: dm: boot time specification of dm="
  Revert "ANDROID: dm: do_mounts_dm: Rebase on top of 4.9"
  Revert "ANDROID: dm: do_mounts_dm: fix dm_substitute_devices()"
  Revert "ANDROID: dm: do_mounts_dm: Update init/do_mounts_dm.c to the latest ChromiumOS version."
  sched/fair: remove printk while schedule is in progress
  ANDROID: Makefile: Add '-fsplit-lto-unit' to cfi-clang-flags
  ANDROID: cfi: Remove unused variable in ptr_to_check_fn
  ANDROID: cuttlefish_defconfig: Enable CONFIG_FUSE_FS

Conflicts:
	arch/arm64/kernel/traps.c
	drivers/mmc/host/sdhci.c
	drivers/mmc/host/sdhci.h
	drivers/tty/Kconfig
	kernel/sched/fair.c

Change-Id: Ic4c01204f58cdb536e2cab04e4f1a2451977f6a3
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-06-25 03:05:18 -07:00
Vitaly Kuznetsov
507f2f17e0 kernel: hung_task.c: disable on suspend
[ Upstream commit a1c6ca3c6de763459a6e93b644ec6518c890ba1c ]

It is possible to observe hung_task complaints when system goes to
suspend-to-idle state:

 # echo freeze > /sys/power/state

 PM: Syncing filesystems ... done.
 Freezing user space processes ... (elapsed 0.001 seconds) done.
 OOM killer disabled.
 Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
 sd 0:0:0:0: [sda] Synchronizing SCSI cache
 INFO: task bash:1569 blocked for more than 120 seconds.
       Not tainted 4.19.0-rc3_+ #687
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 bash            D    0  1569    604 0x00000000
 Call Trace:
  ? __schedule+0x1fe/0x7e0
  schedule+0x28/0x80
  suspend_devices_and_enter+0x4ac/0x750
  pm_suspend+0x2c0/0x310

Register a PM notifier to disable the detector on suspend and re-enable
back on wakeup.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20 09:15:05 +02:00
Blagovest Kolenichev
f1b30ef999 Merge android-4.14-p.99 (b952da4) into msm-4.14
* refs/heads/tmp-b952da4:
  Revert "iommu/arm-smmu: Add support for qcom,smmu-v2 variant"
  Linux 4.14.99
  ath9k: dynack: check da->enabled first in sampling routines
  ath9k: dynack: make ewma estimation faster
  perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()
  IB/hfi1: Add limit test for RC/UC send via loopback
  nfsd4: catch some false session retries
  nfsd4: fix cached replies to solo SEQUENCE compounds
  serial: 8250_pci: Make PCI class test non fatal
  serial: fix race between flush_to_ldisc and tty_open
  perf tests evsel-tp-sched: Fix bitwise operator
  perf/core: Don't WARN() for impossible ring-buffer sizes
  x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
  perf/x86/intel/uncore: Add Node ID mask
  cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
  KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)
  kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
  KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)
  scsi: aic94xx: fix module loading
  scsi: cxlflash: Prevent deadlock when adapter probe fails
  staging: speakup: fix tty-operation NULL derefs
  usb: gadget: musb: fix short isoc packets with inventra dma
  usb: gadget: udc: net2272: Fix bitwise and boolean operations
  usb: dwc3: gadget: Handle 0 xfer length for OUT EP
  usb: phy: am335x: fix race condition in _probe
  irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID
  futex: Handle early deadlock return correctly
  dmaengine: imx-dma: fix wrong callback invoke
  dmaengine: bcm2835: Fix abort of transactions
  dmaengine: bcm2835: Fix interrupt race on RT
  fuse: handle zero sized retrieve correctly
  fuse: decrement NR_WRITEBACK_TEMP on the right page
  fuse: call pipe_buf_release() under pipe lock
  ALSA: hda - Serialize codec registrations
  ALSA: compress: Fix stop handling on compressed capture streams
  net: dsa: slave: Don't propagate flag changes on down slave interfaces
  net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
  net: systemport: Fix WoL with password after deep sleep
  rds: fix refcount bug in rds_sock_addref
  skge: potential memory corruption in skge_get_regs()
  rxrpc: bad unlock balance in rxrpc_recvmsg
  net: dp83640: expire old TX-skb
  enic: fix checksum validation for IPv6
  dccp: fool proof ccid_hc_[rt]x_parse_options()
  thermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not set
  scripts/gdb: fix lx-version string output
  exec: load_script: don't blindly truncate shebang string
  fs/epoll: drop ovflist branch prediction
  kernel/hung_task.c: force console verbose before panic
  proc/sysctl: fix return error for proc_doulongvec_minmax()
  kernel/hung_task.c: break RCU locks based on jiffies
  HID: lenovo: Add checks to fix of_led_classdev_register
  thermal: generic-adc: Fix adc to temp interpolation
  kdb: Don't back trace on a cpu that didn't round up
  thermal: bcm2835: enable hwmon explicitly
  block/swim3: Fix -EBUSY error when re-opening device after unmount
  fsl/fman: Use GFP_ATOMIC in {memac,tgec}_add_hash_mac_address()
  gdrom: fix a memory leak bug
  isdn: hisax: hfc_pci: Fix a possible concurrency use-after-free bug in HFCPCI_l1hw()
  ocfs2: improve ocfs2 Makefile
  ocfs2: don't clear bh uptodate for block read
  scripts/decode_stacktrace: only strip base path when a prefix of the path
  cgroup: fix parsing empty mount option string
  f2fs: fix sbi->extent_list corruption issue
  niu: fix missing checks of niu_pci_eeprom_read
  um: Avoid marking pages with "changed protection"
  cifs: check ntwrk_buf_start for NULL before dereferencing it
  MIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8
  crypto: ux500 - Use proper enum in hash_set_dma_transfer
  crypto: ux500 - Use proper enum in cryp_set_dma_transfer
  seq_buf: Make seq_buf_puts() null-terminate the buffer
  hwmon: (lm80) fix a missing check of bus read in lm80 probe
  hwmon: (lm80) fix a missing check of the status of SMBus read
  NFS: nfs_compare_mount_options always compare auth flavors.
  kvm: Change offset in kvm_write_guest_offset_cached to unsigned
  powerpc/fadump: Do not allow hot-remove memory from fadump reserved area.
  KVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported
  pinctrl: meson: meson8b: fix the GPIO function for the GPIOAO pins
  pinctrl: meson: meson8: fix the GPIO function for the GPIOAO pins
  powerpc/mm: Fix reporting of kernel execute faults on the 8xx
  fbdev: fbcon: Fix unregister crash when more than one framebuffer
  ACPI/APEI: Clear GHES block_status before panic()
  igb: Fix an issue that PME is not enabled during runtime suspend
  i40e: define proper net_device::neigh_priv_len
  fbdev: fbmem: behave better with small rotated displays and many CPUs
  md: fix raid10 hang issue caused by barrier
  video: clps711x-fb: release disp device node in probe()
  drbd: Avoid Clang warning about pointless switch statment
  drbd: skip spurious timeout (ping-timeo) when failing promote
  drbd: disconnect, if the wrong UUIDs are attached on a connected peer
  drbd: narrow rcu_read_lock in drbd_sync_handshake
  powerpc/perf: Fix thresholding counter data for unknown type
  cw1200: Fix concurrency use-after-free bugs in cw1200_hw_scan()
  scsi: smartpqi: increase fw status register read timeout
  scsi: smartpqi: correct volume status
  scsi: smartpqi: correct host serial num for ssa
  mlxsw: spectrum: Properly cleanup LAG uppers when removing port from LAG
  Bluetooth: Fix unnecessary error message for HCI request completion
  xfrm6_tunnel: Fix spi check in __xfrm6_tunnel_alloc_spi
  mac80211: fix radiotap vendor presence bitmap handling
  powerpc/uaccess: fix warning/error with access_ok()
  percpu: convert spin_lock_irq to spin_lock_irqsave.
  usb: musb: dsps: fix otg state machine
  arm64: KVM: Skip MMIO insn after emulation
  perf probe: Fix unchecked usage of strncpy()
  perf header: Fix unchecked usage of strncpy()
  perf test: Fix perf_event_attr test failure
  tty: serial: samsung: Properly set flags in autoCTS mode
  mmc: sdhci-xenon: Fix timeout checks
  mmc: sdhci-of-esdhc: Fix timeout checks
  memstick: Prevent memstick host from getting runtime suspended during card detection
  mmc: bcm2835: reset host on timeout
  mmc: bcm2835: Recover from MMC_SEND_EXT_CSD
  KVM: PPC: Book3S: Only report KVM_CAP_SPAPR_TCE_VFIO on powernv machines
  ASoC: fsl: Fix SND_SOC_EUKREA_TLV320 build error on i.MX8M
  ARM: pxa: avoid section mismatch warning
  selftests/bpf: use __bpf_constant_htons in test_prog.c
  switchtec: Fix SWITCHTEC_IOCTL_EVENT_IDX_ALL flags overwrite
  udf: Fix BUG on corrupted inode
  phy: sun4i-usb: add support for missing USB PHY index
  i2c-axxia: check for error conditions first
  OPP: Use opp_table->regulators to verify no regulator case
  cpuidle: big.LITTLE: fix refcount leak
  clk: imx6sl: ensure MMDC CH0 handshake is bypassed
  sata_rcar: fix deferred probing
  iommu/arm-smmu-v3: Use explicit mb() when moving cons pointer
  iommu/arm-smmu: Add support for qcom,smmu-v2 variant
  usb: dwc3: gadget: Disable CSP for stream OUT ep
  watchdog: renesas_wdt: don't set divider while watchdog is running
  ARM: dts: Fix up the D-Link DIR-685 MTD partition info
  media: coda: fix H.264 deblocking filter controls
  mips: bpf: fix encoding bug for mm_srlv32_op
  ARM: dts: Fix OMAP4430 SDP Ethernet startup
  iommu/amd: Fix amd_iommu=force_isolation
  pinctrl: sx150x: handle failure case of devm_kstrdup
  usb: dwc3: trace: add missing break statement to make compiler happy
  IB/hfi1: Unreserve a reserved request when it is completed
  kobject: return error code if writing /sys/.../uevent fails
  driver core: Move async_synchronize_full call
  clk: sunxi-ng: a33: Set CLK_SET_RATE_PARENT for all audio module clocks
  usb: mtu3: fix the issue about SetFeature(U1/U2_Enable)
  timekeeping: Use proper seqcount initializer
  usb: hub: delay hub autosuspend if USB3 port is still link training
  usb: dwc3: Correct the logic for checking TRB full in __dwc3_prepare_one_trb()
  smack: fix access permissions for keyring
  media: DaVinci-VPBE: fix error handling in vpbe_initialize()
  x86/fpu: Add might_fault() to user_insn()
  ARM: dts: mmp2: fix TWSI2
  arm64: ftrace: don't adjust the LR value
  s390/zcrypt: improve special ap message cmd handling
  firmware/efi: Add NULL pointer checks in efivars API functions
  Thermal: do not clear passive state during system sleep
  arm64: io: Ensure value passed to __iormb() is held in a 64-bit register
  drm: Clear state->acquire_ctx before leaving drm_atomic_helper_commit_duplicated_state()
  nfsd4: fix crash on writing v4_end_grace before nfsd startup
  soc: bcm: brcmstb: Don't leak device tree node reference
  sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN
  arm64: io: Ensure calls to delay routines are ordered against prior readX()
  i2c: sh_mobile: add support for r8a77990 (R-Car E3)
  f2fs: fix wrong return value of f2fs_acl_create
  f2fs: fix race between write_checkpoint and write_begin
  f2fs: move dir data flush to write checkpoint process
  staging: pi433: fix potential null dereference
  ACPI: SPCR: Consider baud rate 0 as preconfigured state
  media: adv*/tc358743/ths8200: fill in min width/height/pixelclock
  iio: accel: kxcjk1013: Add KIOX010A ACPI Hardware-ID
  iio: adc: meson-saradc: fix internal clock names
  iio: adc: meson-saradc: check for devm_kasprintf failure
  dmaengine: xilinx_dma: Remove __aligned attribute on zynqmp_dma_desc_ll
  ptp: Fix pass zero to ERR_PTR() in ptp_clock_register
  media: mtk-vcodec: Release device nodes in mtk_vcodec_init_enc_pm()
  soc/tegra: Don't leak device tree node reference
  perf tools: Add Hygon Dhyana support
  modpost: validate symbol names also in find_elf_symbol
  net/mlx5: EQ, Use the right place to store/read IRQ affinity hint
  ARM: OMAP2+: hwmod: Fix some section annotations
  drm/rockchip: fix for mailbox read size
  usbnet: smsc95xx: fix rx packet alignment
  staging: iio: ad7780: update voltage on read
  platform/chrome: don't report EC_MKBP_EVENT_SENSOR_FIFO as wakeup
  Tools: hv: kvp: Fix a warning of buffer overflow with gcc 8.0.1
  fpga: altera-cvp: Fix registration for CvP incapable devices
  staging:iio:ad2s90: Make probe handle spi_setup failure
  MIPS: Boston: Disable EG20T prefetch
  ptp: check gettime64 return code in PTP_SYS_OFFSET ioctl
  serial: fsl_lpuart: clear parity enable bit when disable parity
  drm/vc4: ->x_scaling[1] should never be set to VC4_SCALING_NONE
  crypto: aes_ti - disable interrupts while accessing S-box
  powerpc/pseries: add of_node_put() in dlpar_detach_node()
  x86/PCI: Fix Broadcom CNB20LE unintended sign extension (redux)
  dlm: Don't swamp the CPU with callbacks queued during recovery
  clk: boston: fix possible memory leak in clk_boston_setup()
  ARM: 8808/1: kexec:offline panic_smp_self_stop CPU
  scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
  scsi: mpt3sas: Call sas_remove_host before removing the target devices
  scsi: lpfc: Correct LCB RJT handling
  ath9k: dynack: use authentication messages for 'late' ack
  gpu: ipu-v3: image-convert: Prevent race between run and unprepare
  ASoC: Intel: mrfld: fix uninitialized variable access
  pinctrl: bcm2835: Use raw spinlock for RT compatibility
  drm/vgem: Fix vgem_init to get drm device available.
  staging: iio: adc: ad7280a: handle error from __ad7280_read32()
  drm/bufs: Fix Spectre v1 vulnerability

Conflicts:
	drivers/thermal/thermal_core.c

File below is aligned according to change [1] from this LTS import:

    arch/arm64/include/asm/io.h

[1] arm64: io: Ensure calls to delay routines are ordered against prior readX()

Change-Id: Ieae90a5ca7b81b08fcfedb150da732f5986aefe5
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-03-01 11:14:23 -08:00
Liu, Chuansheng
31a38a0c02 kernel/hung_task.c: force console verbose before panic
[ Upstream commit 168e06f7937d96c7222037d8a05565e8a6eb00fe ]

Based on commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks
before panic"), we could get the call stack of hung task.

However, if the console loglevel is not high, we still can not see the
useful panic information in practice, and in most cases users don't set
console loglevel to high level.

This patch is to force console verbose before system panic, so that the
real useful information can be seen in the console, instead of being
like the following, which doesn't have hung task information.

  INFO: task init:1 blocked for more than 120 seconds.
        Tainted: G     U  W         4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Kernel panic - not syncing: hung_task: blocked tasks
  CPU: 2 PID: 479 Comm: khungtaskd Tainted: G     U  W         4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
  Call Trace:
   dump_stack+0x4f/0x65
   panic+0xde/0x231
   watchdog+0x290/0x410
   kthread+0x12c/0x150
   ret_from_fork+0x35/0x40
  reboot: panic mode set: p,w
  Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A6015B675@SHSMSX101.ccr.corp.intel.com
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12 19:46:10 +01:00
Tetsuo Handa
53015f1e04 kernel/hung_task.c: break RCU locks based on jiffies
[ Upstream commit 304ae42739b108305f8d7b3eb3c1aec7c2b643a9 ]

check_hung_uninterruptible_tasks() is currently calling rcu_lock_break()
for every 1024 threads.  But check_hung_task() is very slow if printk()
was called, and is very fast otherwise.

If many threads within some 1024 threads called printk(), the RCU grace
period might be extended enough to trigger RCU stall warnings.
Therefore, calling rcu_lock_break() for every some fixed jiffies will be
safer.

Link: http://lkml.kernel.org/r/1544800658-11423-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12 19:46:09 +01:00
Imran Khan
db1aa31ae8 hung task: check specific tasks for long uninterruptible sleep state
khungtask by default monitors all tasks for long unterruptible sleep.
This change introduces a sysctl option, /proc/sys/kernel/
hung_task_selective_monitoring, to enable monitoring selected tasks.
If this sysctl option is enabled then only the tasks with
/proc/$PID/hang_detection_enabled set are to be monitored,
otherwise all tasks are monitored as default case.

Some tasks may intentionally moves to uninterruptable sleep state,
which shouldn't leads to khungtask panics, as those are recoverable
hungs. So to avoid false hung reports, add an option to select tasks
to be monitored and report/panic them only.
By default, enable the feature always to monitor selected tasks.

Change-Id: I48cd8cfe73dbe2b577541fe9607190eac5556bb2
Signed-off-by: Imran Khan <kimran@codeaurora.org>
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
[sramana: Resolved minor merge conflict]
Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
2018-08-29 18:02:28 +05:30
Tetsuo Handa
9691035cbf kernel/hung_task.c: show all hung tasks before panic
[ Upstream commit 401c636a0eeb0d51862fce222da1bf08e3a0ffd0 ]

When we get a hung task it can often be valuable to see _all_ the hung
tasks on the system before calling panic().

Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&id=5316056503549952
----------------------------------------
INFO: task syz-executor0:6540 blocked for more than 120 seconds.
      Not tainted 4.16.0+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0   D23560  6540   4521 0x80000004
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
 schedule+0xf5/0x430 kernel/sched/core.c:3549
 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
 isofs_get_last_session fs/isofs/inode.c:567 [inline]
 isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
 mount_bdev+0x2b7/0x370 fs/super.c:1119
 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
 mount_fs+0x66/0x2d0 fs/super.c:1222
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:2514 [inline]
 do_new_mount fs/namespace.c:2517 [inline]
 do_mount+0xea4/0x2b90 fs/namespace.c:2847
 ksys_mount+0xab/0x120 fs/namespace.c:3063
 SYSC_mount fs/namespace.c:3077 [inline]
 SyS_mount+0x39/0x50 fs/namespace.c:3074
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
(...snipped...)
Showing all locks held in the system:
(...snipped...)
2 locks held by syz-executor0/6540:
 #0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline]
 #0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); */
 #1: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
(...snipped...)
3 locks held by syz-executor7/6541:
 #0: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
 #1: 000000007bf3d3f9 (&bdev->bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192
 #2: 00000000566d4c39 (&type->s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(&sb->s_umount); */
----------------------------------------

When reporting an AB-BA deadlock like shown above, it would be nice if
trace of PID=6541 is printed as well as trace of PID=6540 before calling
panic().

Showing hung tasks up to /proc/sys/kernel/hung_task_warnings could delay
calling panic() but normally there should not be so many hung tasks.

Link: http://lkml.kernel.org/r/201804050705.BHE57833.HVFOFtSOMQJFOL@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03 07:50:23 +02:00
Tetsuo Handa
780cbcf287 kernel/hung_task.c: defer showing held locks
When I was running my testcase which may block hundreds of threads on fs
locks, I got lockup due to output from debug_show_all_locks() added by
commit b2d4c2edb2 ("locking/hung_task: Show all locks").

For example, if 1000 threads were blocked in TASK_UNINTERRUPTIBLE state
and 500 out of 1000 threads hold some lock, debug_show_all_locks() from
for_each_process_thread() loop will report locks held by 500 threads for
1000 times.  This is a too much noise.

In order to make sure rcu_lock_break() is called frequently, we should
avoid calling debug_show_all_locks() from for_each_process_thread() loop
because debug_show_all_locks() effectively calls for_each_process_thread()
loop.  Let's defer calling debug_show_all_locks() till before panic() or
leaving for_each_process_thread() loop.

Link: http://lkml.kernel.org/r/1489296834-60436-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:10 -07:00
Ingo Molnar
b17b01533b sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>
We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:34 +01:00
Ingo Molnar
3f07c01441 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Tetsuo Handa
4ca5ede07c hung_task: decrement sysctl_hung_task_warnings only if it is positive
Since sysctl_hung_task_warnings == -1 is allowed (infinite warnings),
commit 48a6d64eda ("hung_task: allow hung_task_panic when
hung_task_warnings is 0") should decrement it only when it is not -1.

This prevents the kernel from ceasing warnings after the first
4294967295 ;)

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: John Siddle <jsiddle@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:09 -08:00
John Siddle
48a6d64eda hung_task: allow hung_task_panic when hung_task_warnings is 0
Previously hung_task_panic would not be respected if enabled after
hung_task_warnings had already been decremented to 0.

Permit the kernel to panic if hung_task_panic is enabled after
hung_task_warnings has already been decremented to 0 and another task
hangs for hung_task_timeout_secs seconds.

Check if hung_task_panic is enabled so we don't return prematurely, and
check if hung_task_warnings is non-zero so we don't print the warning
unnecessarily.

[akpm@linux-foundation.org: fix off-by-one]
Link: http://lkml.kernel.org/r/1473450214-4049-1-git-send-email-jsiddle@redhat.com
Signed-off-by: John Siddle <jsiddle@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:33 -07:00
Vegard Nossum
b2d4c2edb2 locking/hung_task: Show all locks
When we get a hung task it can often be valuable to see _all_ the held
locks on the system (in case we are being blocked on trying to acquire
one), e.g. with this patch we can immediately see where the problem is
below:

    INFO: task trinity-c3:14933 blocked for more than 120 seconds.
	  Not tainted 4.8.0-rc1+ #135
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    trinity-c3      D ffff88010c16fc88     0 14933      1 0x00080004
     ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
     00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
     ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
    Call Trace:
     [<ffffffff845e9d37>] schedule+0x77/0x230
     [<ffffffff833cb8b9>] __lock_sock+0x129/0x250
     [<ffffffff833cb790>] ? __sk_destruct+0x450/0x450
     [<ffffffff81408ac0>] ? wake_bit_function+0x2e0/0x2e0
     [<ffffffff833d832b>] lock_sock_nested+0xeb/0x120
     [<ffffffff83bad815>] irda_setsockopt+0x65/0xb40
     [<ffffffff833c6c09>] SyS_setsockopt+0x139/0x230
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81004660>] ? trace_event_raw_event_sys_enter+0xb90/0xb90
     [<ffffffff823c7023>] ? __this_cpu_preempt_check+0x13/0x20
     [<ffffffff8162ee60>] ? __context_tracking_exit.part.3+0x30/0x1b0
     [<ffffffff833c6ad0>] ? SyS_recv+0x20/0x20
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f84aa>] entry_SYSCALL64_slow_path+0x25/0x25

    Showing all locks held in the system:
    2 locks held by khungtaskd/563:
     #0:  (rcu_read_lock){......}, at: [<ffffffff81534ce6>] watchdog+0x106/0x910
     #1:  (tasklist_lock){......}, at: [<ffffffff8141b3c4>] debug_show_all_locks+0x74/0x360
    1 lock held by trinity-c0/19280:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0
    1 lock held by trinity-c0/12865:
     #0:  (sk_lock-AF_IRDA){......}, at: [<ffffffff83bab7c6>] irda_accept+0x176/0x10f0

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1471538460-7505-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-24 12:16:13 +02:00
Tetsuo Handa
b4aa14a63c kernel/hung_task.c: use timeout diff when timeout is updated
When new timeout is written to /proc/sys/kernel/hung_task_timeout_secs,
khungtaskd is interrupted and again sleeps for full timeout duration.

This means that hang task will not be checked if new timeout is written
periodically within old timeout duration and/or checking of hang task
will be delayed for up to previous timeout duration.  Fix this by
remembering last time khungtaskd checked hang task.

This change will allow other watchdog tasks (if any) to share khungtaskd
by sleeping for minimal timeout diff of all watchdog tasks.  Doing more
watchdog tasks from khungtaskd will reduce the possibility of printk()
collisions by multiple watchdog threads.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Aaron Tomlin
972fae6993 kernel/hung_task.c: change hung_task.c to use for_each_process_thread()
In check_hung_uninterruptible_tasks() avoid the use of deprecated
while_each_thread().

The "max_count" logic will prevent a livelock - see commit 0c740d0a
("introduce for_each_thread() to replace the buggy while_each_thread()").
Having said this let's use for_each_process_thread().

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Wysochanski <dwysocha@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:22 -07:00
Fabian Frederick
b51dbec68c kernel/hung_task.c: convert simple_strtoul to kstrtouint
sysctl_hung_task_panic has been changed to unsigned int.  use kstrtouint
instead of obsolete simple_strtoul

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:15 -07:00
Paul Gortmaker
c96d6660dc kernel: audit/fix non-modular users of module_init in core code
Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular.  So using module_init as an
alias for __initcall can be somewhat misleading.

Fix these up now, so that we can relocate module_init from init.h into
module.h in the future.  If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.

The audit targets the following module_init users for change:
 kernel/user.c                  obj-y
 kernel/kexec.c                 bool KEXEC (one instance per arch)
 kernel/profile.c               bool PROFILING
 kernel/hung_task.c             bool DETECT_HUNG_TASK
 kernel/sched/stats.c           bool SCHEDSTATS
 kernel/user_namespace.c        bool USER_NS

Note that direct use of __initcall is discouraged, vs.  one of the
priority categorized subgroups.  As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which makes sense for these
files) will thus change this registration from level 6-device to level
4-subsys (i.e.  slightly earlier).  However no observable impact of that
difference has been observed during testing.

Also, two instances of missing ";" at EOL are fixed in kexec.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:07 -07:00
Aaron Tomlin
270750dbc1 hung_task: Display every hung task warning
When khungtaskd detects hung tasks, it prints out
backtraces from a number of those tasks.

Limiting the number of backtraces being printed
out can result in the user not seeing the information
necessary to debug the issue. The hung_task_warnings
sysctl controls this feature.

This patch makes it possible for hung_task_warnings
to accept a special value to print an unlimited
number of backtraces when khungtaskd detects hung
tasks.

The special value is -1. To use this value it is
necessary to change types from ulong to int.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/1390239253-24030-3-git-send-email-atomlin@redhat.com
[ Build warning fix. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-25 12:13:33 +01:00
Linus Torvalds
f080480488 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
 "Here are the 3.13 KVM changes.  There was a lot of work on the PPC
  side: the HV and emulation flavors can now coexist in a single kernel
  is probably the most interesting change from a user point of view.

  On the x86 side there are nested virtualization improvements and a few
  bugfixes.

  ARM got transparent huge page support, improved overcommit, and
  support for big endian guests.

  Finally, there is a new interface to connect KVM with VFIO.  This
  helps with devices that use NoSnoop PCI transactions, letting the
  driver in the guest execute WBINVD instructions.  This includes some
  nVidia cards on Windows, that fail to start without these patches and
  the corresponding userspace changes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
  kvm, vmx: Fix lazy FPU on nested guest
  arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
  arm/arm64: KVM: MMIO support for BE guest
  kvm, cpuid: Fix sparse warning
  kvm: Delete prototype for non-existent function kvm_check_iopl
  kvm: Delete prototype for non-existent function complete_pio
  hung_task: add method to reset detector
  pvclock: detect watchdog reset at pvclock read
  kvm: optimize out smp_mb after srcu_read_unlock
  srcu: API for barrier after srcu read unlock
  KVM: remove vm mmap method
  KVM: IOMMU: hva align mapping page size
  KVM: x86: trace cpuid emulation when called from emulator
  KVM: emulator: cleanup decode_register_operand() a bit
  KVM: emulator: check rex prefix inside decode_register()
  KVM: x86: fix emulation of "movzbl %bpl, %eax"
  kvm_host: typo fix
  KVM: x86: emulate SAHF instruction
  MAINTAINERS: add tree for kvm.git
  Documentation/kvm: add a 00-INDEX file
  ...
2013-11-15 13:51:36 +09:00
Marcelo Tosatti
8b414521bc hung_task: add method to reset detector
In certain occasions it is possible for a hung task detector
positive to be false: continuation from a paused VM, for example.

Add a method to reset detection, similar as is done
with other kernel watchdogs.

Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06 09:49:02 +02:00
Oleg Nesterov
6a716c90a5 hung_task debugging: Add tracepoint to report the hang
Currently check_hung_task() prints a warning if it detects the
problem, but it is not convenient to watch the system logs if
user-space wants to be notified about the hang.

Add the new trace_sched_process_hang() into check_hung_task(),
this way a user-space monitor can easily wait for the hang and
potentially resolve a problem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Sullivan <dsulliva@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20131019161828.GA7439@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-31 11:16:18 +01:00
Li Zefan
cd64647f04 hung_task: Change sysctl_hung_task_check_count to 'int'
As 'sysctl_hung_task_check_count' is 'unsigned long' when this
value is assigned to max_count in check_hung_uninterruptible_tasks(),
it's truncated to 'int' type.

This causes a minor artifact: if we write 2^32 to sysctl.hung_task_check_count,
hung task detection will be effectively disabled.

With this fix, it will still truncate the user input to 32 bits, but
reading sysctl.hung_task_check_count reflects the actual truncated value.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/523FFF4E.9050401@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-23 11:10:49 +02:00
Oleg Nesterov
41e85ce822 hung_task debugging: Print more info when reporting the problem
printk(KERN_ERR) from check_hung_task() likely means we have a bug,
but unlike BUG_ON()/WARN_ON ()it doesn't show the kernel version,
this complicates the bug-reports investigation.

Add the additional pr_err() to print tainted/release/version
like dump_stack_print_info() does, the output becomes:

        INFO: task perl:504 blocked for more than 2 seconds.
	      Not tainted 3.11.0-rc1-10367-g136bb46-dirty #1763
        "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        ...

While at it, turn the old printk's into pr_err().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: ahecox@redhat.com
Cc: Christopher Williams <cww@redhat.com>
Cc: dwysocha@redhat.com
Cc: gavin@redhat.com
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: nshi@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130801165941.GA17544@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-02 11:02:42 +02:00
Sasha Levin
625056b65e hung task debugging: Inject NMI when hung and going to panic
Send an NMI to all CPUs when a hung task is detected and the hung
task code is configured to panic. This gives us a fairly uptodate
snapshot of all CPUs in the system.

This lets us get stack trace of all CPUs which makes life easier
trying to debug a deadlock, and the NMI doesn't change anything
since the next step is a kernel panic.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1331848040-1676-1-git-send-email-levinsasha928@gmail.com
[ extended the changelog a bit ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:39:25 +02:00
Oleg Nesterov
6027ce497d hung_task: fix the broken rcu_lock_break() logic
check_hung_uninterruptible_tasks()->rcu_lock_break() introduced by
"softlockup: check all tasks in hung_task" commit ce9dbe24 looks
absolutely wrong.

	- rcu_lock_break() does put_task_struct(). If the task has exited
	  it is not safe to even read its ->state, nothing protects this
	  task_struct.

	- The TASK_DEAD checks are wrong too. Contrary to the comment, we
	  can't use it to check if the task was unhashed. It can be unhashed
	  without TASK_DEAD, or it can be valid with TASK_DEAD.

	  For example, an autoreaping task can do release_task(current)
	  long before it sets TASK_DEAD in do_exit().

	  Or, a zombie task can have ->state == TASK_DEAD but release_task()
	  was not called, and in this case we must not break the loop.

Change this code to check pid_alive() instead, and do this before we drop
the reference to the task_struct.

Note: while_each_thread() under rcu_read_lock() is not really safe, it can
livelock.  This will be fixed later, but fortunately in this case the
"max_count" logic saves us anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mandeep Singh Baines <msb@google.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 15:49:42 -08:00
Mandeep Singh Baines
f9fab10bbd hung_task: fix false positive during vfork
vfork parent uninterruptibly and unkillably waits for its child to
exec/exit. This wait is of unbounded length. Ignore such waits
in the hung_task detector.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
LKML-Reference: <1325344394.28904.43.camel@lappy>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-03 16:14:32 -08:00
Paul Gortmaker
9984de1a5a kernel: Map most files to use export.h instead of module.h
The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else.  Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

  -#include <linux/module.h>
  +#include <linux/export.h>

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 09:20:12 -04:00
Jeff Mahoney
e11feaa119 watchdog, hung_task_timeout: Add Kconfig configurable default
This patch allows the default value for sysctl_hung_task_timeout_secs
to be set at build time. The feature carries virtually no overhead,
so it makes sense to keep it enabled. On heavily loaded systems, though,
it can end up triggering stack traces when there is no bug other than
the system being underprovisioned. We use this patch to keep the hung task
facility available but disabled at boot-time.

The default of 120 seconds is preserved. As a note, commit e162b39a may
have accidentally reverted commit fb822db4, which raised the default from
120 seconds to 480 seconds.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Mandeep Singh Baines <msb@google.com>
Link: http://lkml.kernel.org/r/4DB8600C.8080000@suse.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-28 09:13:17 +02:00
John Kacur
6a103b0d44 lockup detector: Fix grammar by adding a missing "to" in the comments
This fixes a minor grammar problem in the comments in
hung_task.c

Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1281021054-4228-2-git-send-email-jkacur@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-17 09:11:52 +02:00
John Kacur
f1b499f029 lockdep: Remove __debug_show_held_locks
There is no longer any functional difference between
__debug_show_held_locks() and debug_show_held_locks(),
so remove the former.

Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1281021054-4228-1-git-send-email-jkacur@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-17 09:11:10 +02:00
Anton Blanchard
e5af022616 softlockup: Fix hung_task_check_count sysctl
I'm seeing spikes of up to 0.5ms in khungtaskd on a large
machine. To reduce this source of jitter I tried setting
hung_task_check_count to 0:

 # echo 0 > /proc/sys/kernel/hung_task_check_count

which didn't have the intended response. Change to a post
increment of max_count, so a value of 0 means check 0 tasks.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: msb@google.com
LKML-Reference: <20091127022820.GU32182@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-27 06:21:57 +01:00
Alexey Dobriyan
8d65af789f sysctl: remove "struct file *" argument of ->proc_handler
It's unused.

It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.

It _was_ used in two places at arch/frv for some reason.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24 07:21:04 -07:00
Frederic Weisbecker
cf2592f59c softlockup: ensure the task has been switched out once
When we check if a task has been switched out since the last scan, we might
have a race condition on the following scenario:

- the task is freshly created and scheduled

- it puts its state to TASK_UNINTERRUPTIBLE and is not yet switched out

- check_hung_task() scans this task and will report a false positive because
  t->nvcsw + t->nivcsw == t->last_switch_count == 0

Add a check for such cases.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 11:04:16 +01:00
Mandeep Singh Baines
17406b82d6 softlockup: remove timestamp checking from hung_task
Impact: saves sizeof(long) bytes per task_struct

By guaranteeing that sysctl_hung_task_timeout_secs have elapsed between
tasklist scans we can avoid using timestamps.

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 11:03:49 +01:00
Mandeep Singh Baines
94be52dc07 softlockup: convert read_lock in hung_task to rcu_read_lock
Since the tasklist is protected by rcu list operations, it is safe
to convert the read_lock()s to rcu_read_lock().

Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 19:55:31 +01:00
Mandeep Singh Baines
ce9dbe244b softlockup: check all tasks in hung_task
Impact: extend the scope of hung-task checks

Changed the default value of hung_task_check_count to PID_MAX_LIMIT.
hung_task_batch_count added to put an upper bound on the critical
section. Every hung_task_batch_count checks, the rcu lock is never
held for a too long time.

Keeping the critical section small minimizes time preemption is disabled
and keeps rcu grace periods small.

To prevent following a stale pointer, get_task_struct is called on g and t.
To verify that g and t have not been unhashed while outside the critical
section, the task states are checked.

The design was proposed by Frédéric Weisbecker.

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Suggested-by: Frédéric Weisbecker <fweisbec@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 19:54:57 +01:00
Mandeep Singh Baines
603a148f43 softlockup: fix potential race in hung_task when resetting timeout
Impact: fix potential false panic

A potential race exists if sysctl_hung_task_timeout_secs is reset to 0
while inside check_hung_uniterruptible_tasks(). If check_task() is
entered, a comparison with 0 will result in a false hung_task being
detected.

If sysctl_hung_task_panic is set, the system will panic.

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 19:20:17 +01:00
Mandeep Singh Baines
e162b39a36 softlockup: decouple hung tasks check from softlockup detection
Decoupling allows:

* hung tasks check to happen at very low priority

* hung tasks check and softlockup to be enabled/disabled independently
  at compile and/or run-time

* individual panic settings to be enabled disabled independently
  at compile and/or run-time

* softlockup threshold to be reduced without increasing hung tasks
  poll frequency (hung task check is expensive relative to softlock watchdog)

* hung task check to be zero over-head when disabled at run-time

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:06:04 +01:00