d4414bc0e93d8da170fd0fc9fef65fe84015677d
98 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a452d64169 |
Merge android-4.19-stable (4.19.239) into android-msm-pixel-4.19-lts
Merge 4.19.239 into android-4.19-stable
Linux 4.19.239
i2c: pasemi: Wait for write xfers to finish
* smp: Fix offline cpu check in flush_smp_call_function_queue()
kernel/smp.c
ARM: davinci: da850-evm: Avoid NULL pointer dereference
* ipv6: fix panic when forwarding a pkt with no in6 dev
net/ipv6/ip6_output.c
* ALSA: pcm: Test for "silence" field in struct "pcm_format_data"
sound/core/pcm_misc.c
ALSA: hda/realtek: Add quirk for Clevo PD50PNT
gcc-plugins: latent_entropy: use /dev/urandom
mm: kmemleak: take a full lowmem check in kmemleak_*_phys()
* mm, page_alloc: fix build_zonerefs_node()
mm/page_alloc.c
drivers: net: slip: fix NPD bug in sl_tx_timeout()
scsi: mvsas: Add PCI ID of RocketRaid 2640
drm/amd/display: Fix allocate_mst_payload assert on resume
* arm64: alternatives: mark patch_alternative() as `noinstr`
arch/arm64/kernel/alternative.c
gpu: ipu-v3: Fix dev_dbg frequency output
ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
* net: micrel: fix KS8851_MLL Kconfig
drivers/net/ethernet/micrel/Kconfig
scsi: ibmvscsis: Increase INITIAL_SRP_LIMIT to 1024
scsi: target: tcmu: Fix possible page UAF
Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
drm/amdkfd: Check for potential null return of kmalloc_array()
drm/amd: Add USBC connector ID
cifs: potential buffer overflow in handling symlinks
nfc: nci: add flush_workqueue to prevent uaf
testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set
* sctp: Initialize daddr on peeled off socket
net/sctp/socket.c
net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link
mlxsw: i2c: Fix initialization error flow
gpiolib: acpi: use correct format characters
* veth: Ensure eth header is in skb's linear part
drivers/net/veth.c
* net/sched: flower: fix parsing of ethertype following VLAN header
include/net/flow_dissector.h
net/core/flow_dissector.c
memory: atmel-ebi: Fix missing of_node_put in atmel_ebi_probe
* ANDROID: GKI: fix crc issue with commit
|
||
|
|
6157d4ede6 |
sched/debug: Remove mpol_get/put and task_lock/unlock from sched_show_numa
[ Upstream commit 28c988c3ec29db74a1dda631b18785958d57df4f ]
The older format of /proc/pid/sched printed home node info which
required the mempolicy and task lock around mpol_get(). However
the format has changed since then and there is no need for
sched_show_numa() any more to have mempolicy argument,
asssociated mpol_get/put and task_lock/unlock. Remove them.
Fixes:
|
||
|
|
7ec4bc1d47 |
Merge android-4.19-stable (4.19.191) into android-msm-pixel-4.19-lts
Merge 4.19.191 into android-4.19-stable
Linux 4.19.191
scripts: switch explicitly to Python 3
tweewide: Fix most Shebang lines
* KVM: arm64: Initialize VCPU mdcr_el2 before loading it
arch/arm64/include/asm/kvm_host.h
* iomap: fix sub-page uptodate handling
fs/iomap.c
include/linux/iomap.h
* ipv6: remove extra dev_hold() for fallback tunnels
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.c
net/ipv6/sit.c
* ip6_tunnel: sit: proper dev_{hold|put} in ndo_[un]init methods
net/ipv6/ip6_tunnel.c
* sit: proper dev_{hold|put} in ndo_[un]init methods
net/ipv6/sit.c
ip6_gre: proper dev_{hold|put} in ndo_[un]init methods
net: stmmac: Do not enable RX FIFO overflow interrupts
lib: stackdepot: turn depot_lock spinlock to raw_spinlock
* block: reexpand iov_iter after read/write
fs/block_dev.c
ALSA: hda: generic: change the DAC ctl name for LO+SPK or LO+HP
gpiolib: acpi: Add quirk to ignore EC wakeups on Dell Venue 10 Pro 5055
scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found
ceph: fix fscache invalidation
riscv: Workaround mcount name prior to clang-13
scripts/recordmcount.pl: Fix RISC-V regex for clang
ARM: 9075/1: kernel: Fix interrupted SMC calls
um: Mark all kernel symbols as local
Input: silead - add workaround for x86 BIOS-es which bring the chip up in a stuck state
Input: elants_i2c - do not bind to i2c-hid compatible ACPI instantiated devices
ACPI / hotplug / PCI: Fix reference count leak in enable_slot()
ARM: 9066/1: ftrace: pause/unpause function graph tracer in cpu_suspend()
* PCI: thunder: Fix compile testing
drivers/pci/controller/pci-thunder-ecam.c
drivers/pci/controller/pci-thunder-pem.c
drivers/pci/pci.h
xsk: Simplify detection of empty and full rings
pinctrl: ingenic: Improve unreachable code generation
isdn: capi: fix mismatched prototypes
cxgb4: Fix the -Wmisleading-indentation warning
usb: sl811-hcd: improve misleading indentation
kgdb: fix gcc-11 warning on indentation
x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes
nvme: do not try to reconfigure APST when the controller is not live
clk: exynos7: Mark aclk_fsys1_200 as critical
* netfilter: conntrack: Make global sysctls readonly in non-init netns
net/netfilter/nf_conntrack_standalone.c
* kobject_uevent: remove warning in init_uevent_argv()
lib/kobject_uevent.c
thermal/core/fair share: Lock the thermal zone while looping over instances
MIPS: Avoid handcoded DIVU in `__div64_32' altogether
MIPS: Avoid DIVU in `__div64_32' is result would be zero
MIPS: Reinstate platform `__div64_32' handler
* FDDI: defxx: Make MMIO the configuration default except for EISA
drivers/net/fddi/Kconfig
KVM: x86: Cancel pvclock_gtod_work on module removal
cdc-wdm: untangle a circular dependency between callback and softint
iio: tsl2583: Fix division by a zero lux_val
iio: gyro: mpu3050: Fix reported temperature value
* xhci: Add reset resume quirk for AMD xhci controller.
drivers/usb/host/xhci-pci.c
* xhci: Do not use GFP_KERNEL in (potentially) atomic context
drivers/usb/host/xhci.c
* usb: dwc3: gadget: Return success always for kick transfer in ep queue
drivers/usb/dwc3/gadget.c
* usb: core: hub: fix race condition about TRSMRCY of resume
drivers/usb/core/hub.c
usb: dwc2: Fix gadget DMA unmap direction
* usb: xhci: Increase timeout for HC halt
drivers/usb/host/xhci-ext-caps.h
usb: dwc3: pci: Enable usb2-gadget-lpm-disable for Intel Merrifield
usb: dwc3: omap: improve extcon initialization
* blk-mq: Swap two calls in blk_mq_exit_queue()
block/blk-mq.c
ACPI: scan: Fix a memory leak in an error handling path
usb: fotg210-hcd: Fix an error message
iio: proximity: pulsedlight: Fix rumtime PM imbalance on error
drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected
* userfaultfd: release page in error path to avoid BUG_ON
mm/shmem.c
squashfs: fix divide error in calculate_skip()
hfsplus: prevent corruption in shrinking truncate
powerpc/64s: Fix crashes when toggling entry flush barrier
powerpc/64s: Fix crashes when toggling stf barrier
ARC: entry: fix off-by-one error in syscall number validation
i40e: Fix use-after-free in i40e_client_subtask()
netfilter: nftables: avoid overflows in nft_hash_buckets()
kernel: kexec_file: fix error return code of kexec_calculate_store_digests()
* sched/fair: Fix unfairness caused by missing load decay
kernel/sched/fair.c
netfilter: nfnetlink_osf: Fix a missing skb_header_pointer() NULL check
smc: disallow TCP_ULP in smc_setsockopt()
* net: fix nla_strcmp to handle more then one trailing null character
lib/nlattr.c
ksm: fix potential missing rmap_item for stable_node
mm/hugeltb: handle the error case in hugetlb_fix_reserve_counts()
khugepaged: fix wrong result value for trace_mm_collapse_huge_page_isolate()
drm/radeon: Avoid power table parsing memory leaks
drm/radeon: Fix off-by-one power_state index heap overwrite
* netfilter: xt_SECMARK: add new revision to fix structure layout
include/uapi/linux/netfilter/xt_SECMARK.h
net/netfilter/xt_SECMARK.c
* sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b
net/sctp/sm_statefuns.c
ethernet:enic: Fix a use after free bug in enic_hard_start_xmit
* sctp: do asoc update earlier in sctp_sf_do_dupcook_a
net/sctp/sm_statefuns.c
net: hns3: disable phy loopback setting in hclge_mac_start_phy
rtc: ds1307: Fix wday settings for rx8130
NFSv4.2 fix handling of sr_eof in SEEK's reply
pNFS/flexfiles: fix incorrect size check in decode_nfs_fh()
PCI: endpoint: Fix missing destroy_workqueue()
NFS: Deal correctly with attribute generation counter overflow
NFSv4.2: Always flush out writes in nfs42_proc_fallocate()
* rpmsg: qcom_glink_native: fix error return code of qcom_glink_rx_data()
drivers/rpmsg/qcom_glink_native.c
ARM: 9064/1: hw_breakpoint: Do not directly check the event's overflow_handler hook
* PCI: Release OF node in pci_scan_device()'s error path
drivers/pci/probe.c
PCI: iproc: Fix return value of iproc_msi_irq_domain_alloc()
* f2fs: fix a redundant call to f2fs_balance_fs if an error occurs
fs/f2fs/inline.c
ASoC: rt286: Make RT286_SET_GPIO_* readable and writable
ia64: module: fix symbolizer crash on fdescr
net: ethernet: mtk_eth_soc: fix RX VLAN offload
powerpc/iommu: Annotate nested lock for lockdep
wl3501_cs: Fix out-of-bounds warnings in wl3501_mgmt_join
wl3501_cs: Fix out-of-bounds warnings in wl3501_send_pkt
powerpc/pseries: Stop calling printk in rtas_stop_self()
samples/bpf: Fix broken tracex1 due to kprobe argument change
* ethtool: ioctl: Fix out-of-bounds warning in store_link_ksettings_for_user()
net/core/ethtool.c
ASoC: rt286: Generalize support for ALC3263 codec
powerpc/smp: Set numa node before updating mask
* sctp: Fix out-of-bounds warning in sctp_process_asconf_param()
net/sctp/sm_make_chunk.c
kconfig: nconf: stop endless search loops
selftests: Set CC to clang in lib.mk if LLVM is set
cuse: prevent clone
pinctrl: samsung: use 'int' for register masks in Exynos
mac80211: clear the beacon's CRC after channel switch
* i2c: Add I2C_AQ_NO_REP_START adapter quirk
include/linux/i2c.h
ASoC: Intel: bytcr_rt5640: Add quirk for the Chuwi Hi8 tablet
* ip6_vti: proper dev_{hold|put} in ndo_[un]init methods
net/ipv6/ip6_vti.c
* Bluetooth: check for zapped sk before connecting
net/bluetooth/l2cap_sock.c
* net: bridge: when suppression is enabled exclude RARP packets
net/bridge/br_arp_nd_proxy.c
* Bluetooth: initialize skb_queue_head at l2cap_chan_create()
net/bluetooth/l2cap_core.c
* Bluetooth: Set CONF_NOT_COMPLETE as l2cap_chan default
net/bluetooth/l2cap_core.c
ALSA: rme9652: don't disable if not enabled
ALSA: hdspm: don't disable if not enabled
ALSA: hdsp: don't disable if not enabled
* i2c: bail out early when RDWR parameters are wrong
drivers/i2c/i2c-dev.c
net: stmmac: Set FIFO sizes for ipq806x
ASoC: Intel: bytcr_rt5640: Enable jack-detect support on Asus T100TAF
* tipc: convert dest node's address to network order
net/tipc/netlink_compat.c
fs: dlm: fix debugfs dump
tpm: fix error return code in tpm2_get_cc_attrs_tbl()
* Revert "fdt: Properly handle "no-map" field in the memory region"
drivers/of/fdt.c
* Revert "of/fdt: Make sure no-map does not remove already reserved regions"
drivers/of/fdt.c
* sctp: delay auto_asconf init until binding the first addr
net/sctp/socket.c
* Revert "net/sctp: fix race condition in sctp_destroy_sock"
net/sctp/socket.c
* smp: Fix smp_call_function_single_async prototype
include/linux/smp.h
kernel/smp.c
* net: Only allow init netns to set default tcp cong to a restricted algo
net/ipv4/tcp_cong.c
mm/memory-failure: unnecessary amount of unmapping
* mm/sparse: add the missing sparse_buffer_fini() in error branch
mm/sparse.c
kfifo: fix ternary sign extension bugs
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb
arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
bnxt_en: fix ternary sign extension bug in bnxt_show_temp()
powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')
ath10k: Fix ath10k_wmi_tlv_op_pull_peer_stats_info() unlock without lock
ath9k: Fix error check in ath9k_hw_read_revisions() for PCI devices
net: davinci_emac: Fix incorrect masking of tx and rx error channel
* ALSA: usb: midi: don't return -ENOMEM when usb_urb_ep_type_check fails
sound/usb/midi.c
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
vsock/vmci: log once the failed queue pair allocation
mwl8k: Fix a double Free in mwl8k_probe_hw
i2c: sh7760: fix IRQ error path
rtlwifi: 8821ae: upgrade PHY and RF parameters
powerpc/pseries: extract host bridge from pci_bus prior to bus removal
MIPS: pci-legacy: stop using of_pci_range_to_resource
drm/i915/gvt: Fix error code in intel_gvt_init_device()
ASoC: ak5558: correct reset polarity
i2c: sh7760: add IRQ check
i2c: jz4780: add IRQ check
i2c: emev2: add IRQ check
i2c: cadence: add IRQ check
RDMA/srpt: Fix error return code in srpt_cm_req_recv()
net: thunderx: Fix unintentional sign extension issue
IB/hfi1: Fix error return code in parse_platform_config()
mt7601u: fix always true expression
mac80211: bail out if cipher schemes are invalid
powerpc: iommu: fix build when neither PCI or IBMVIO is set
powerpc/perf: Fix PMU constraint check for EBB events
powerpc/64s: Fix pte update for kernel memory on radix
liquidio: Fix unintented sign extension of a left shift of a u16
* ALSA: usb-audio: Add error checks for usb_driver_claim_interface() calls
sound/usb/card.c
sound/usb/quirks.c
sound/usb/usbaudio.h
net: hns3: Limiting the scope of vector_ring_chain variable
nfc: pn533: prevent potential memory corruption
* bug: Remove redundant condition check in report_bug
lib/bug.c
* ALSA: core: remove redundant spin_lock pair in snd_card_disconnect
sound/core/init.c
powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration
powerpc/prom: Mark identical_pvr_fixup as __init
net: lapbether: Prevent racing when checking whether the netif is running
perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of printed chars
* HID: plantronics: Workaround for double volume key presses
drivers/hid/hid-ids.h
drivers/hid/hid-plantronics.c
include/linux/hid.h
drivers/block/null_blk/main: Fix a double free in null_init.
* sched/debug: Fix cgroup_path[] serialization
kernel/sched/debug.c
x86/events/amd/iommu: Fix sysfs type mismatch
HSI: core: fix resource leaks in hsi_add_client_from_dt()
mfd: stm32-timers: Avoid clearing auto reload register
scsi: ibmvfc: Fix invalid state machine BUG_ON()
scsi: sni_53c710: Add IRQ check
scsi: sun3x_esp: Add IRQ check
scsi: jazz_esp: Add IRQ check
clk: uniphier: Fix potential infinite loop
clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
nvme: retrigger ANA log update if group descriptor isn't found
ata: libahci_platform: fix IRQ check
sata_mv: add IRQ checks
pata_ipx4xx_cf: fix IRQ check
pata_arasan_cf: fix IRQ check
x86/kprobes: Fix to check non boostable prefixes correctly
drm/amdkfd: fix build error with AMD_IOMMU_V2=m
media: m88rs6000t: avoid potential out-of-bounds reads on arrays
media: omap4iss: return error code when omap4iss_get() failed
media: vivid: fix assignment of dev->fbuf_out_flags
soc: aspeed: fix a ternary sign expansion bug
* ttyprintk: Add TTY hangup callback.
drivers/char/ttyprintk.c
usb: dwc2: Fix hibernation between host and device modes.
usb: dwc2: Fix host mode hibernation exit with remote wakeup flow.
Drivers: hv: vmbus: Increase wait time for VMbus unload
x86/platform/uv: Fix !KEXEC build failure
platform/x86: pmc_atom: Match all Beckhoff Automation baytrail boards with critclk_systems DMI table
usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
* firmware: qcom-scm: Fix QCOM_SCM configuration
drivers/firmware/Kconfig
* tty: fix return value for unsupported ioctls
drivers/tty/tty_io.c
include/linux/tty_driver.h
* tty: actually undefine superseded ASYNC flags
include/uapi/linux/tty_flags.h
USB: cdc-acm: fix unprivileged TIOCCSERIAL
usb: gadget: r8a66597: Add missing null check on return from platform_get_resource
cpufreq: armada-37xx: Fix determining base CPU frequency
cpufreq: armada-37xx: Fix driver cleanup when registration failed
clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
cpufreq: armada-37xx: Fix the AVS value for load L1
clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
cpufreq: armada-37xx: Fix setting TBG parent for load levels
crypto: qat - Fix a double free in adf_create_ring
ACPI: CPPC: Replace cppc_attr with kobj_attribute
* soc: qcom: mdt_loader: Detect truncated read of segments
drivers/soc/qcom/mdt_loader.c
* soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
drivers/soc/qcom/mdt_loader.c
* spi: Fix use-after-free with devm_spi_alloc_*
drivers/spi/spi.c
include/linux/spi/spi.h
staging: greybus: uart: fix unprivileged TIOCCSERIAL
staging: rtl8192u: Fix potential infinite loop
* irqchip/gic-v3: Fix OF_BAD_ADDR error handling
drivers/irqchip/irq-gic-v3-mbi.c
mtd: rawnand: gpmi: Fix a double free in gpmi_nand_init
soundwire: stream: fix memory leak in stream config error path
USB: gadget: udc: fix wrong pointer passed to IS_ERR() and PTR_ERR()
usb: gadget: aspeed: fix dma map failure
crypto: qat - fix error path in adf_isr_resource_alloc()
* phy: marvell: ARMADA375_USBCLUSTER_PHY should not default to y, unconditionally
drivers/phy/marvell/Kconfig
soundwire: bus: Fix device found flag correctly
* bus: qcom: Put child node before return
drivers/bus/qcom-ebi2.c
mtd: require write permissions for locking and badblock ioctls
fotg210-udc: Complete OUT requests on short packets
fotg210-udc: Don't DMA more than the buffer can take
fotg210-udc: Mask GRP2 interrupts we don't handle
fotg210-udc: Remove a dubious condition leading to fotg210_done
fotg210-udc: Fix EP0 IN requests bigger than two packets
fotg210-udc: Fix DMA on EP0 for length > max packet size
crypto: qat - ADF_STATUS_PF_RUNNING should be set after adf_dev_init
crypto: qat - don't release uninitialized resources
usb: gadget: pch_udc: Check for DMA mapping error
usb: gadget: pch_udc: Check if driver is present before calling ->setup()
usb: gadget: pch_udc: Replace cpu_to_le32() by lower_32_bits()
x86/microcode: Check for offline CPUs before requesting new microcode
mtd: rawnand: qcom: Return actual error code instead of -ENODEV
mtd: Handle possible -EPROBE_DEFER from parse_mtd_partitions()
mtd: rawnand: brcmnand: fix OOB R/W with Hamming ECC
mtd: rawnand: fsmc: Fix error code in fsmc_nand_probe()
* regmap: set debugfs_name to NULL after it is freed
drivers/base/regmap/regmap-debugfs.c
usb: typec: tcpci: Check ROLE_CONTROL while interpreting CC_STATUS
serial: stm32: fix tx_empty condition
serial: stm32: fix incorrect characters on console
ARM: dts: exynos: correct PMIC interrupt trigger level on Snow
ARM: dts: exynos: correct PMIC interrupt trigger level on SMDK5250
ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid X/U3 family
ARM: dts: exynos: correct PMIC interrupt trigger level on Midas family
ARM: dts: exynos: correct MUIC interrupt trigger level on Midas family
ARM: dts: exynos: correct fuel gauge interrupt trigger level on Midas family
memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]
usb: gadget: pch_udc: Revert
|
||
|
|
302d674cfa |
sched/debug: Fix cgroup_path[] serialization
[ Upstream commit ad789f84c9a145f8a18744c0387cec22ec51651e ]
The handling of sysrq key can be activated by echoing the key to
/proc/sysrq-trigger or via the magic key sequence typed into a terminal
that is connected to the system in some way (serial, USB or other mean).
In the former case, the handling is done in a user context. In the
latter case, it is likely to be in an interrupt context.
Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is
taken with interrupt disabled for the whole duration of the calls to
print_*_stats() and print_rq() which could last for the quite some time
if the information dump happens on the serial console.
If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic
depending on the actually serial console implementation of the
system.
The purpose of sched_debug_lock is to serialize the use of the global
cgroup_path[] buffer in print_cpu(). The rests of the printk calls don't
need serialization from sched_debug_lock.
Calling printk() with interrupt disabled can still be problematic if
multiple instances are running. Allocating a stack buffer of PATH_MAX
bytes is not feasible because of the limited size of the kernel stack.
The solution implemented in this patch is to allow only one caller at a
time to use the full size group_path[], while other simultaneous callers
will have to use shorter stack buffers with the possibility of path
name truncation. A "..." suffix will be printed if truncation may have
happened. The cgroup path name is provided for informational purpose
only, so occasional path name truncation should not be a big problem.
Fixes:
|
||
|
|
d7864ac281 |
Merge android-4.19.48 (01f5de3) into msm-4.19
* refs/heads/tmp-01f5de3: Linux 4.19.48 tipc: fix modprobe tipc failed after switch order of device registration Revert "tipc: fix modprobe tipc failed after switch order of device registration" xen/pciback: Don't disable PCI_COMMAND on PCI device reset. jump_label: move 'asm goto' support test to Kconfig compiler.h: give up __compiletime_assert_fallback() include/linux/compiler*.h: define asm_volatile_goto crypto: vmx - ghash: do nosimd fallback manually net/tls: don't ignore netdev notifications if no TLS features net/tls: fix state removal with feature flags off bnxt_en: Fix aggregation buffer leak under OOM condition. net: stmmac: dma channel control register need to be init first net/mlx5e: Disable rxhash when CQE compress is enabled net/mlx5: Allocate root ns memory using kzalloc to match kfree tipc: Avoid copying bytes beyond the supplied data net/mlx5: Avoid double free in fs init error unwinding path usbnet: fix kernel crash after disconnect net: stmmac: fix reset gpio free missing net: sched: don't use tc_action->order during action dump net: phy: marvell10g: report if the PHY fails to boot firmware net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value net: mvneta: Fix err code path of probe net-gro: fix use-after-free read in napi_gro_frags() net: fec: fix the clk mismatch in failed_reset path net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT llc: fix skb leak in llc_build_and_send_ui_pkt() ipv6: Fix redirect with VRF ipv6: Consider sk_bound_dev_if when binding a raw socket to an address ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST ipv4/igmp: fix another memory leak in igmpv3_del_delrec() inet: switch IP ID generator to siphash cxgb4: offload VLAN flows regardless of VLAN ethtype bonding/802.3ad: fix slave link initialization transition states Conflicts: include/linux/compiler.h Change-Id: I43dd2908aa00a247ac985c36d210e83370361315 Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org> |
||
|
|
a6f81a0b33 |
sched/walt: Improve the scheduler
This change is for general scheduler improvement. Change-Id: I2735e374c0c78b89d0f722aa6346deb7022ed9b3 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> |
||
|
|
f7de9ee575 |
sched/walt: Improve the scheduler
This change is for general scheduler improvement. Change-Id: I3af4fccd6db844729863f0422a910e2906c8ff9a Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> |
||
|
|
0276ebf166 |
jump_label: move 'asm goto' support test to Kconfig
commit e9666d10a5677a494260d60d1fa0b73cc7646eb3 upstream. Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label". The jump label is controlled by HAVE_JUMP_LABEL, which is defined like this: #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) # define HAVE_JUMP_LABEL #endif We can improve this by testing 'asm goto' support in Kconfig, then make JUMP_LABEL depend on CC_HAS_ASM_GOTO. Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will match to the real kernel capability. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Sedat Dilek <sedat.dilek@gmail.com> [nc: Fix trivial conflicts in 4.19 arch/xtensa/kernel/jump_label.c doesn't exist yet Ensured CC_HAVE_ASM_GOTO and HAVE_JUMP_LABEL were sufficiently eliminated] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
35c3faa19f |
Merge android-4.19.34 (d885da6) into msm-4.19
* refs/heads/tmp-d885da6: Revert "coresight: etm4x: Add support to enable ETMv4.2" Revert "usb: dwc3: gadget: Fix OTG events when gadget driver isn't loaded" Linux 4.19.34 kprobes/x86: Blacklist non-attachable interrupt functions bcache: fix potential div-zero error of writeback_rate_p_term_inverse ACPI / video: Extend chassis-type detection with a "Lunch Box" check net: stmmac: Avoid one more sometimes uninitialized Clang warning drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers Input: soc_button_array - fix mapping of the 5th GPIO in a PNP0C40 device dmaengine: tegra: avoid overflow of byte tracking clk: rockchip: fix frac settings of GPLL clock for rk3328 clk: meson: clean-up clock registration drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup x86/build: Mark per-CPU symbols as absolute explicitly for LLD wlcore: Fix memory leak in case wl12xx_fetch_firmware failure brcmfmac: Use firmware_request_nowarn for the clm_blob selinux: do not override context on context mounts x86/build: Specify elf_i386 linker emulation explicitly for i386 objects drm/nouveau: Stop using drm_crtc_force_disable drm: Auto-set allow_fb_modifiers when given modifiers at plane init pinctrl: meson: meson8b: add the eth_rxd2 and eth_rxd3 pins regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration media: rcar-vin: Allow independent VIN link enablement netfilter: physdev: relax br_netfilter dependency dmaengine: qcom_hidma: initialize tx flags in hidma_prep_dma_* dmaengine: qcom_hidma: assign channel cookie correctly dmaengine: imx-dma: fix warning comparison of distinct pointer types cpu/hotplug: Mute hotplug lockdep during init hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable f2fs: UBSAN: set boolean value iostat_enable correctly HID: intel-ish: ipc: handle PIMR before ish_wakeup also clear PISR busy_clear bit soc/tegra: fuse: Fix illegal free of IO base address hwrng: virtio - Avoid repeated init of completion media: mt9m111: set initial frame size other than 0x0 perf script python: Add trace_context extension module to sys.modules perf script python: Use PyBytes for attr in trace-event-python platform/x86: intel-hid: Missing power button release on some Dell models usb: dwc3: gadget: Fix OTG events when gadget driver isn't loaded ALSA: dice: add support for Solid State Logic Duende Classic/Mini drm/amd/display: Enable vblank interrupt during CRC capture powerpc/pseries: Perform full re-add of CPU for topology update post-migration tty: increase the default flip buffer limit to 2*640K backlight: pwm_bl: Use gpiod_get_value_cansleep() to get initial state cgroup/pids: turn cgroup_subsys->free() into cgroup_subsys->release() to fix the accounting powerpc/64s: Clear on-stack exception marker upon exception return selftests/bpf: skip verifier tests for unsupported program types bpf: fix missing prototype warnings block, bfq: fix in-service-queue check for queue merging ARM: avoid Cortex-A9 livelock on tight dmb loops ARM: 8830/1: NOMMU: Toggle only bits in EXC_RETURN we are really care of mt7601u: bump supported EEPROM version soc: qcom: gsbi: Fix error handling in gsbi_probe() efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted ARM: dts: lpc32xx: Remove leading 0x and 0s from bindings notation drm/vkms: Bugfix extra vblank frame sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock() efi/memattr: Don't bail on zero VA if it equals the region's PA sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe iwlwifi: mvm: fix RFH config command with >=10 CPUs staging: spi: mt7621: Add return code check on device_reset() i2c: of: Try to find an I2C adapter matching the parent platform/x86: intel_pmc_core: Fix PCH IP sts reading e1000e: Exclude device from suspend direct complete optimization e1000e: fix cyclic resets at link up with active tx perf/aux: Make perf_event accessible to setup_aux() drm/amd/display: Disconnect mpcc when changing tg drm/amd/display: Don't re-program planes for DPMS changes drm: rcar-du: add missing of_node_put cdrom: Fix race condition in cdrom_sysctl_register fbdev: fbmem: fix memory access if logo is bigger than the screen net: phy: consider latched link-down status in polling mode iw_cxgb4: fix srqidx leak during connection abort net: marvell: mvpp2: fix stuck in-band SGMII negotiation genirq: Avoid summation loops for /proc/stat bcache: improve sysfs_strtoul_clamp() bcache: fix potential div-zero error of writeback_rate_i_term_inverse bcache: fix input overflow to sequential_cutoff bcache: fix input overflow to cache set sysfs file io_error_halflife sched/topology: Fix percpu data types in struct sd_data & struct s_data usb: f_fs: Avoid crash due to out-of-scope stack ptr access ath10k: fix shadow register implementation for WCN3990 ALSA: PCM: check if ops are defined before suspending PCM ARM: dts: meson8b: fix the Ethernet data line signals in eth_rgmii_pins ARM: 8833/1: Ensure that NEON code always compiles with Clang netfilter: conntrack: fix cloned unconfirmed skb->_nfct race in __nf_conntrack_confirm kprobes: Prohibit probing on RCU debug routine kprobes: Prohibit probing on bsearch() selftests: skip seccomp get_metadata test if not real root ACPI / video: Refactor and fix dmi_is_desktop() iwlwifi: pcie: fix emergency path perf report: Add s390 diagnosic sampling descriptor size leds: lp55xx: fix null deref on firmware load failure jbd2: fix race when writing superblock cgroup, rstat: Don't flush subtree root unless necessary HID: intel-ish-hid: avoid binding wrong ishtp_cl_device vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1 xen/gntdev: Do not destroy context while dma-bufs are in use mt76: usb: do not run mt76u_queues_deinit twice media: mtk-jpeg: Correct return type for mem2mem buffer helpers media: mx2_emmaprp: Correct return type for mem2mem buffer helpers media: s5p-g2d: Correct return type for mem2mem buffer helpers media: rockchip/rga: Correct return type for mem2mem buffer helpers media: s5p-jpeg: Correct return type for mem2mem buffer helpers media: sh_veu: Correct return type for mem2mem buffer helpers media: ov7740: fix runtime pm initialization SoC: imx-sgtl5000: add missing put_device() perf report: Don't shadow inlined symbol with different addr range mwifiex: don't advertise IBSS features without FW support perf test: Fix failure of 'evsel-tp-sched' test on s390 drm/amd/display: Clear stream->mode_changed after commit scsi: fcoe: make use of fip_mode enum complete scsi: megaraid_sas: return error when create DMA pool failed s390/ism: ignore some errors during deregistration efi: cper: Fix possible out-of-bounds access cpufreq: acpi-cpufreq: Report if CPU doesn't support boost technologies ASoC: qcom: Fix of-node refcount unbalance in qcom_snd_parse_of() perf annotate: Fix getting source line failure clk: fractional-divider: check parent rate only if flag is set IB/mlx4: Increase the timeout for CM cache loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part() platform/mellanox: mlxreg-hotplug: Fix KASAN warning platform/x86: ideapad-laptop: Fix no_hw_rfkill_list for Lenovo RESCUER R720-15IKBN mlxsw: spectrum: Avoid -Wformat-truncation warnings e1000e: Fix -Wformat-truncation warnings net: dsa: mv88e6xxx: Add lockdep classes to fix false positive splat mmc: omap: fix the maximum timeout setting btrfs: qgroup: Make qgroup async transaction commit more aggressive powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tables ARM: 8840/1: use a raw_spinlock_t in unwind serial: 8250_pxa: honor the port number from devicetree coresight: etm4x: Add support to enable ETMv4.2 powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables usb: chipidea: Grab the (legacy) USB PHY by phandle first crypto: cavium/zip - fix collision with generic cra_driver_name crypto: crypto4xx - add missing of_node_put after of_device_is_available mt76: fix a leaked reference by adding a missing of_node_put wil6210: check null pointer in _wil_cfg80211_merge_extra_ies PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove() tools lib traceevent: Fix buffer overflow in arg_eval fs: fix guard_bio_eod to check for real EOD errors jbd2: fix invalid descriptor block checksum netfilter: conntrack: tcp: only close if RST matches exact sequence netfilter: nf_tables: check the result of dereferencing base_chain->stats cifs: Fix NULL pointer dereference of devname cifs: Accept validate negotiate if server return NT_STATUS_NOT_SUPPORTED f2fs: fix to check inline_xattr_size boundary correctly dm thin: add sanity checks to thin-pool and external snapshot creation cifs: use correct format characters page_poison: play nicely with KASAN fs/file.c: initialize init_files.resize_wait f2fs: do not use mutex lock in atomic context ocfs2: fix a panic problem caused by o2cb_ctl mm/slab.c: kmemleak no scan alien caches mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512! mm, mempolicy: fix uninit memory access memcg: killed threads should not invoke memcg OOM killer mm,oom: don't kill global init via memory.oom.group mm, swap: bounds check swap_info array accesses to avoid NULL derefs mm/page_ext.c: fix an imbalance with kmemleak mm/cma.c: cma_declare_contiguous: correct err handling mm/sparse: fix a bad comparison perf c2c: Fix c2c report for empty numa node x86/hyperv: Fix kernel panic when kexec on HyperV iio: adc: fix warning in Qualcomm PM8xxx HK/XOADC driver scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO scsi: hisi_sas: Set PHY linkrate when disconnected libbpf: force fixdep compilation at the start of the build enic: fix build warning without CONFIG_CPUMASK_OFFSTACK net: stmmac: Avoid sometimes uninitialized Clang warnings sysctl: handle overflow for file-max include/linux/relay.h: fix percpu annotation in struct rchan gpio: gpio-omap: fix level interrupt idling net/mlx5: Avoid panic when setting vport mac, getting vport config net/mlx5: Avoid panic when setting vport rate tracing: kdb: Fix ftdump to not sleep f2fs: fix to avoid deadlock in f2fs_read_inline_dir() f2fs: fix to adapt small inline xattr space in __find_inline_xattr() h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- CIFS: fix POSIX lock leak and invalid ptr deref tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped tty/serial: atmel: Add is_half_duplex helper ext4: cleanup bh release code in ext4_ind_remove_space() arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals ANDROID: cuttlefish_defconfig: Enable CONFIG_OVERLAY_FS ANDROID: cuttlefish: enable CONFIG_NET_SCH_INGRESS=y Conflicts: drivers/usb/gadget/function/f_fs.c mm/page_alloc.c Change-Id: Ia2a8e99bfdae84d3933749f45ba86f33c5acd713 Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org> |
||
|
|
f056c90f07 |
sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK
[ Upstream commit 1ca4fa3ab604734e38e2a3000c9abf788512ffa7 ] register_sched_domain_sysctl() copies the cpu_possible_mask into sd_sysctl_cpus, but only if sd_sysctl_cpus hasn't already been allocated (ie, CONFIG_CPUMASK_OFFSTACK is set). However, when CONFIG_CPUMASK_OFFSTACK is not set, sd_sysctl_cpus is left uninitialized (all zeroes) and the kernel may fail to initialize sched_domain sysctl entries for all possible CPUs. This is visible to the user if the kernel is booted with maxcpus=n, or if ACPI tables have been modified to leave CPUs offline, and then checking for missing /proc/sys/kernel/sched_domain/cpu* entries. Fix this by separating the allocation and initialization, and adding a flag to initialize the possible CPU entries while system booting only. Tested-by: Syuuichirou Ishii <ishii.shuuichir@jp.fujitsu.com> Tested-by: Tarumizu, Kohei <tarumizu.kohei@jp.fujitsu.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masayoshi Mizuma <msys.mizuma@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190129151245.5073-1-msys.mizuma@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
|
|
07b9bee53d |
sched/debug: Show WALT specific details in /proc/sched_debug
WALT specific details can be very handy while debugging any issue on live target. Change-Id: I26f323393fd0f8d7c193f38dcea1531976d28272 Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org> |
||
|
|
e73e81975f |
sched/debug: Fix potential deadlock when writing to sched_features
The following lockdep report can be triggered by writing to /sys/kernel/debug/sched_features: ====================================================== WARNING: possible circular locking dependency detected 4.18.0-rc6-00152-gcd3f77d74ac3-dirty #18 Not tainted ------------------------------------------------------ sh/3358 is trying to acquire lock: 000000004ad3989d (cpu_hotplug_lock.rw_sem){++++}, at: static_key_enable+0x14/0x30 but task is already holding lock: 00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&sb->s_type->i_mutex_key#3){+.+.}: lock_acquire+0xb8/0x148 down_write+0xac/0x140 start_creating+0x5c/0x168 debugfs_create_dir+0x18/0x220 opp_debug_register+0x8c/0x120 _add_opp_dev+0x104/0x1f8 dev_pm_opp_get_opp_table+0x174/0x340 _of_add_opp_table_v2+0x110/0x760 dev_pm_opp_of_add_table+0x5c/0x240 dev_pm_opp_of_cpumask_add_table+0x5c/0x100 cpufreq_init+0x160/0x430 cpufreq_online+0x1cc/0xe30 cpufreq_add_dev+0x78/0x198 subsys_interface_register+0x168/0x270 cpufreq_register_driver+0x1c8/0x278 dt_cpufreq_probe+0xdc/0x1b8 platform_drv_probe+0xb4/0x168 driver_probe_device+0x318/0x4b0 __device_attach_driver+0xfc/0x1f0 bus_for_each_drv+0xf8/0x180 __device_attach+0x164/0x200 device_initial_probe+0x10/0x18 bus_probe_device+0x110/0x178 device_add+0x6d8/0x908 platform_device_add+0x138/0x3d8 platform_device_register_full+0x1cc/0x1f8 cpufreq_dt_platdev_init+0x174/0x1bc do_one_initcall+0xb8/0x310 kernel_init_freeable+0x4b8/0x56c kernel_init+0x10/0x138 ret_from_fork+0x10/0x18 -> #2 (opp_table_lock){+.+.}: lock_acquire+0xb8/0x148 __mutex_lock+0x104/0xf50 mutex_lock_nested+0x1c/0x28 _of_add_opp_table_v2+0xb4/0x760 dev_pm_opp_of_add_table+0x5c/0x240 dev_pm_opp_of_cpumask_add_table+0x5c/0x100 cpufreq_init+0x160/0x430 cpufreq_online+0x1cc/0xe30 cpufreq_add_dev+0x78/0x198 subsys_interface_register+0x168/0x270 cpufreq_register_driver+0x1c8/0x278 dt_cpufreq_probe+0xdc/0x1b8 platform_drv_probe+0xb4/0x168 driver_probe_device+0x318/0x4b0 __device_attach_driver+0xfc/0x1f0 bus_for_each_drv+0xf8/0x180 __device_attach+0x164/0x200 device_initial_probe+0x10/0x18 bus_probe_device+0x110/0x178 device_add+0x6d8/0x908 platform_device_add+0x138/0x3d8 platform_device_register_full+0x1cc/0x1f8 cpufreq_dt_platdev_init+0x174/0x1bc do_one_initcall+0xb8/0x310 kernel_init_freeable+0x4b8/0x56c kernel_init+0x10/0x138 ret_from_fork+0x10/0x18 -> #1 (subsys mutex#6){+.+.}: lock_acquire+0xb8/0x148 __mutex_lock+0x104/0xf50 mutex_lock_nested+0x1c/0x28 subsys_interface_register+0xd8/0x270 cpufreq_register_driver+0x1c8/0x278 dt_cpufreq_probe+0xdc/0x1b8 platform_drv_probe+0xb4/0x168 driver_probe_device+0x318/0x4b0 __device_attach_driver+0xfc/0x1f0 bus_for_each_drv+0xf8/0x180 __device_attach+0x164/0x200 device_initial_probe+0x10/0x18 bus_probe_device+0x110/0x178 device_add+0x6d8/0x908 platform_device_add+0x138/0x3d8 platform_device_register_full+0x1cc/0x1f8 cpufreq_dt_platdev_init+0x174/0x1bc do_one_initcall+0xb8/0x310 kernel_init_freeable+0x4b8/0x56c kernel_init+0x10/0x138 ret_from_fork+0x10/0x18 -> #0 (cpu_hotplug_lock.rw_sem){++++}: __lock_acquire+0x203c/0x21d0 lock_acquire+0xb8/0x148 cpus_read_lock+0x58/0x1c8 static_key_enable+0x14/0x30 sched_feat_write+0x314/0x428 full_proxy_write+0xa0/0x138 __vfs_write+0xd8/0x388 vfs_write+0xdc/0x318 ksys_write+0xb4/0x138 sys_write+0xc/0x18 __sys_trace_return+0x0/0x4 other info that might help us debug this: Chain exists of: cpu_hotplug_lock.rw_sem --> opp_table_lock --> &sb->s_type->i_mutex_key#3 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#3); lock(opp_table_lock); lock(&sb->s_type->i_mutex_key#3); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** 2 locks held by sh/3358: #0: 00000000a8c4b363 (sb_writers#10){.+.+}, at: vfs_write+0x238/0x318 #1: 00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428 stack backtrace: CPU: 5 PID: 3358 Comm: sh Not tainted 4.18.0-rc6-00152-gcd3f77d74ac3-dirty #18 Hardware name: Renesas H3ULCB Kingfisher board based on r8a7795 ES2.0+ (DT) Call trace: dump_backtrace+0x0/0x288 show_stack+0x14/0x20 dump_stack+0x13c/0x1ac print_circular_bug.isra.10+0x270/0x438 check_prev_add.constprop.16+0x4dc/0xb98 __lock_acquire+0x203c/0x21d0 lock_acquire+0xb8/0x148 cpus_read_lock+0x58/0x1c8 static_key_enable+0x14/0x30 sched_feat_write+0x314/0x428 full_proxy_write+0xa0/0x138 __vfs_write+0xd8/0x388 vfs_write+0xdc/0x318 ksys_write+0xb4/0x138 sys_write+0xc/0x18 __sys_trace_return+0x0/0x4 This is because when loading the cpufreq_dt module we first acquire cpu_hotplug_lock.rw_sem lock, then in cpufreq_init(), we are taking the &sb->s_type->i_mutex_key lock. But when writing to /sys/kernel/debug/sched_features, the cpu_hotplug_lock.rw_sem lock depends on the &sb->s_type->i_mutex_key lock. To fix this bug, reverse the lock acquisition order when writing to sched_features, this way cpu_hotplug_lock.rw_sem no longer depends on &sb->s_type->i_mutex_key. Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Eugeniu Rosca <erosca@de.adit-jv.com> Cc: George G. Davis <george_davis@mentor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180731121222.26195-1-jiada_wang@mentor.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
13e091b6dd |
Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer updates from Thomas Gleixner: "Early TSC based time stamping to allow better boot time analysis. This comes with a general cleanup of the TSC calibration code which grew warts and duct taping over the years and removes 250 lines of code. Initiated and mostly implemented by Pavel with help from various folks" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) x86/kvmclock: Mark kvm_get_preset_lpj() as __init x86/tsc: Consolidate init code sched/clock: Disable interrupts when calling generic_sched_clock_init() timekeeping: Prevent false warning when persistent clock is not available sched/clock: Close a hole in sched_clock_init() x86/tsc: Make use of tsc_calibrate_cpu_early() x86/tsc: Split native_calibrate_cpu() into early and late parts sched/clock: Use static key for sched_clock_running sched/clock: Enable sched clock early sched/clock: Move sched clock initialization and merge with generic clock x86/tsc: Use TSC as sched clock early x86/tsc: Initialize cyc2ns when tsc frequency is determined x86/tsc: Calibrate tsc only once ARM/time: Remove read_boot_clock64() s390/time: Remove read_boot_clock64() timekeeping: Default boot time offset to local_clock() timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset() s390/time: Add read_persistent_wall_and_boot_offset() x86/xen/time: Output xen sched_clock time from 0 x86/xen/time: Initialize pv xen time in init_hypervisor_platform() ... |
||
|
|
67d9f6c256 |
sched/debug: Reverse the order of printing faults
Fix the order in which the private and shared numa faults are getting printed. No functional changes. Running SPECjbb2005 on a 4 node machine and comparing bops/JVM JVMS LAST_PATCH WITH_PATCH %CHANGE 16 25215.7 25375.3 0.63 1 72107 72617 0.70 Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1529514181-9842-7-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
46457ea464 |
sched/clock: Use static key for sched_clock_running
sched_clock_running may be read every time sched_clock_cpu() is called. Yet, this variable is updated only twice during boot, and never changes again, therefore it is better to make it a static key. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-25-pasha.tatashin@oracle.com |
||
|
|
8f894bf47d |
sched/debug: Use match_string() helper instead of open-coded logic
match_string() returns the index of an array for a matching string, which can be used instead of the open coded variant. Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/1527765086-19873-15-git-send-email-xieyisheng1@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
fddda2b7b5 |
proc: introduce proc_create_seq{,_data}
Variants of proc_create{,_data} that directly take a struct seq_operations
argument and drastically reduces the boilerplate code in the callers.
All trivial callers converted over.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
||
|
|
46e0d28bdb |
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"The main scheduler changes in this cycle were:
- NUMA balancing improvements (Mel Gorman)
- Further load tracking improvements (Patrick Bellasi)
- Various NOHZ balancing cleanups and optimizations (Peter Zijlstra)
- Improve blocked load handling, in particular we can now reduce and
eventually stop periodic load updates on 'very idle' CPUs. (Vincent
Guittot)
- On isolated CPUs offload the final 1Hz scheduler tick as well, plus
related cleanups and reorganization. (Frederic Weisbecker)
- Core scheduler code cleanups (Ingo Molnar)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
sched/core: Update preempt_notifier_key to modern API
sched/cpufreq: Rate limits for SCHED_DEADLINE
sched/fair: Update util_est only on util_avg updates
sched/cpufreq/schedutil: Use util_est for OPP selection
sched/fair: Use util_est in LB and WU paths
sched/fair: Add util_est on top of PELT
sched/core: Remove TASK_ALL
sched/completions: Use bool in try_wait_for_completion()
sched/fair: Update blocked load when newly idle
sched/fair: Move idle_balance()
sched/nohz: Merge CONFIG_NO_HZ_COMMON blocks
sched/fair: Move rebalance_domains()
sched/nohz: Optimize nohz_idle_balance()
sched/fair: Reduce the periodic update duration
sched/nohz: Stop NOHZ stats when decayed
sched/cpufreq: Provide migration hint
sched/nohz: Clean up nohz enter/exit
sched/fair: Update blocked load from NEWIDLE
sched/fair: Add NOHZ stats balancing
sched/fair: Restructure nohz_balance_kick()
...
|
||
|
|
e9ca267096 |
sched/debug: Adjust newlines for better alignment
Scheduler debug stats include newlines that display out of alignment
when prefixed by timestamps. For example, the dmesg utility:
% echo t > /proc/sysrq-trigger
% dmesg
...
[ 83.124251]
runnable tasks:
S task PID tree-key switches prio wait-time
sum-exec sum-sleep
-----------------------------------------------------------------------------------------------------------
At the same time, some syslog utilities (like rsyslog by default) don't
like the additional newlines control characters, saving lines like this
to /var/log/messages:
Mar 16 16:02:29 localhost kernel: #012runnable tasks:#012 S task PID tree-key ...
^^^^ ^^^^
Clean these up by moving newline characters to their own SEQ_printf
invocation. This leaves the /proc/sched_debug unchanged, but brings the
entire output into alignment when prefixed:
% echo t > /proc/sysrq-trigger
% dmesg
...
[ 62.410368] runnable tasks:
[ 62.410368] S task PID tree-key switches prio wait-time sum-exec sum-sleep
[ 62.410369] -----------------------------------------------------------------------------------------------------------
[ 62.410369] I kworker/u12:0 5 1932.215593 332 120 0.000000 3.621252 0.000000 0 0 /
and no escaped control characters from rsyslog in /var/log/messages:
Mar 16 16:15:06 localhost kernel: runnable tasks:
Mar 16 16:15:06 localhost kernel: S task PID tree-key ...
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1521484555-8620-3-git-send-email-joe.lawrence@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
a8c024cd9b |
sched/debug: Fix per-task line continuation for console output
When the SEQ_printf() macro prints to the console, it runs a simple printk() without KERN_CONT "continued" line printing. The result of this is oddly wrapped task info, for example: % echo t > /proc/sysrq-trigger % dmesg ... runnable tasks: ... [ 29.608611] I [ 29.608613] rcu_sched 8 3252.013846 4087 120 [ 29.608614] 0.000000 29.090111 0.000000 [ 29.608615] 0 0 [ 29.608616] / Modify SEQ_printf to use pr_cont() for expected one-line results: % echo t > /proc/sysrq-trigger % dmesg ... runnable tasks: ... [ 106.716329] S cpuhp/5 37 2006.315026 14 120 0.000000 0.496893 0.000000 0 0 / Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1521484555-8620-2-git-send-email-joe.lawrence@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
7f65ea42eb |
sched/fair: Add util_est on top of PELT
The util_avg signal computed by PELT is too variable for some use-cases.
For example, a big task waking up after a long sleep period will have its
utilization almost completely decayed. This introduces some latency before
schedutil will be able to pick the best frequency to run a task.
The same issue can affect task placement. Indeed, since the task
utilization is already decayed at wakeup, when the task is enqueued in a
CPU, this can result in a CPU running a big task as being temporarily
represented as being almost empty. This leads to a race condition where
other tasks can be potentially allocated on a CPU which just started to run
a big task which slept for a relatively long period.
Moreover, the PELT utilization of a task can be updated every [ms], thus
making it a continuously changing value for certain longer running
tasks. This means that the instantaneous PELT utilization of a RUNNING
task is not really meaningful to properly support scheduler decisions.
For all these reasons, a more stable signal can do a better job of
representing the expected/estimated utilization of a task/cfs_rq.
Such a signal can be easily created on top of PELT by still using it as
an estimator which produces values to be aggregated on meaningful
events.
This patch adds a simple implementation of util_est, a new signal built on
top of PELT's util_avg where:
util_est(task) = max(task::util_avg, f(task::util_avg@dequeue))
This allows to remember how big a task has been reported by PELT in its
previous activations via f(task::util_avg@dequeue), which is the new
_task_util_est(struct task_struct*) function added by this patch.
If a task should change its behavior and it runs longer in a new
activation, after a certain time its util_est will just track the
original PELT signal (i.e. task::util_avg).
The estimated utilization of cfs_rq is defined only for root ones.
That's because the only sensible consumer of this signal are the
scheduler and schedutil when looking for the overall CPU utilization
due to FAIR tasks.
For this reason, the estimated utilization of a root cfs_rq is simply
defined as:
util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued)
where:
cfs_rq::util_est::enqueued = sum(_task_util_est(task))
for each RUNNABLE task on that root cfs_rq
It's worth noting that the estimated utilization is tracked only for
objects of interests, specifically:
- Tasks: to better support tasks placement decisions
- root cfs_rqs: to better support both tasks placement decisions as
well as frequencies selection
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
325ea10c08 |
sched/headers: Simplify and clean up header usage in the scheduler
Do the following cleanups and simplifications: - sched/sched.h already includes <asm/paravirt.h>, so no need to include it in sched/core.c again. - order the <linux/sched/*.h> headers alphabetically - add all <linux/sched/*.h> headers to kernel/sched/sched.h - remove all unnecessary includes from the .c files that are already included in kernel/sched/sched.h. Finally, make all scheduler .c files use a single common header: #include "sched.h" ... which now contains a union of the relied upon headers. This makes the various .c files easier to read and easier to handle. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
97fb7a0a89 |
sched: Clean up and harmonize the coding style of the scheduler code base
A good number of small style inconsistencies have accumulated
in the scheduler core, so do a pass over them to harmonize
all these details:
- fix speling in comments,
- use curly braces for multi-line statements,
- remove unnecessary parentheses from integer literals,
- capitalize consistently,
- remove stray newlines,
- add comments where necessary,
- remove invalid/unnecessary comments,
- align structure definitions and other data types vertically,
- add missing newlines for increased readability,
- fix vertical tabulation where it's misaligned,
- harmonize preprocessor conditional block labeling
and vertical alignment,
- remove line-breaks where they uglify the code,
- add newline after local variable definitions,
No change in functionality:
md5:
1191fa0a890cfa8132156d2959d7e9e2 built-in.o.before.asm
1191fa0a890cfa8132156d2959d7e9e2 built-in.o.after.asm
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
1ea6c46a23 |
sched/fair: Propagate an effective runnable_load_avg
The load balancer uses runnable_load_avg as load indicator. For !cgroup this is: runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq That is, a direct sum of all runnable tasks on that runqueue. As opposed to load_avg, which is a sum of all tasks on the runqueue, which includes a blocked component. However, in the cgroup case, this comes apart since the group entities are always runnable, even if most of their constituent entities are blocked. Therefore introduce a runnable_weight which for task entities is the same as the regular weight, but for group entities is a fraction of the entity weight and represents the runnable part of the group runqueue. Then propagate this load through the PELT hierarchy to arrive at an effective runnable load avgerage -- which we should not confuse with the canonical runnable load average. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
0e2d2aaaae |
sched/fair: Rewrite PELT migration propagation
When an entity migrates in (or out) of a runqueue, we need to add (or
remove) its contribution from the entire PELT hierarchy, because even
non-runnable entities are included in the load average sums.
In order to do this we have some propagation logic that updates the
PELT tree, however the way it 'propagates' the runnable (or load)
change is (more or less):
tg->weight * grq->avg.load_avg
ge->avg.load_avg = ------------------------------
tg->load_avg
But that is the expression for ge->weight, and per the definition of
load_avg:
ge->avg.load_avg := ge->weight * ge->avg.runnable_avg
That destroys the runnable_avg (by setting it to 1) we wanted to
propagate.
Instead directly propagate runnable_sum.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
2a2f5d4e44 |
sched/fair: Rewrite cfs_rq->removed_*avg
Since on wakeup migration we don't hold the rq->lock for the old CPU we cannot update its state. Instead we add the removed 'load' to an atomic variable and have the next update on that CPU collect and process it. Currently we have 2 atomic variables; which already have the issue that they can be read out-of-sync. Also, two atomic ops on a single cacheline is already more expensive than an uncontended lock. Since we want to add more, convert the thing over to an explicit cacheline with a lock in. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
65d5dc47fe |
sched/debug: Remove unused variable
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
ec846ecd63 |
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar: "Three CPU hotplug related fixes and a debugging improvement" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Add debugfs knob for "sched_debug" sched/core: WARN() when migrating to an offline CPU sched/fair: Plug hole between hotplug and active_load_balance() sched/fair: Avoid newidle balance for !active CPUs |
||
|
|
9469eb01db |
sched/debug: Add debugfs knob for "sched_debug"
I'm forever late for editing my kernel cmdline, add a runtime knob to disable the "sched_debug" thing. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170907150614.142924283@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
bfb068892d |
sched/fair: replace cfs_rq->rb_leftmost
... with the generic rbtree flavor instead. No changes in semantics whatsoever. Link: http://lkml.kernel.org/r/20170719014603.19029-8-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
bbdacdfed2 |
sched/debug: Optimize sched_domain sysctl generation
Currently we unconditionally destroy all sysctl bits and regenerate them after we've rebuild the domains (even if that rebuild is a no-op). And since we unconditionally (re)build the sysctl for all possible CPUs, onlining all CPUs gets us O(n^2) time. Instead change this to only rebuild the bits for CPUs we've actually installed new domains on. Reported-by: Ofer Levi(SW) <oferle@mellanox.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
20435d84e5 |
sched/debug: Intruduce task_state_to_char() helper function
Now that we have more than one place to get the task state, intruduce the task_state_to_char() helper function to save some code. No functionality changed. Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <cj.chengjian@huawei.com> Cc: <huawei.libin@huawei.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1502095463-160172-3-git-send-email-xiexiuqi@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
e8c164954b |
sched/debug: Show task state in /proc/sched_debug
Currently we print the runnable task in /proc/sched_debug, but there is no task state information. We don't know which task is in the runqueue and which task is sleeping. Add task state in the runnable task list, like this: runnable tasks: S task PID tree-key switches prio wait-time sum-exec sum-sleep ----------------------------------------------------------------------------------------------------------- S watchdog/239 1452 -11.917445 2811 0 0.000000 8.949306 0.000000 7 0 / S migration/239 1453 20686.367740 8 0 0.000000 16215.720897 0.000000 7 0 / S ksoftirqd/239 1454 115383.841071 12 120 0.000000 0.200683 0.000000 7 0 / >R test 21287 4872.190970 407 120 0.000000 4874.911790 0.000000 7 0 /autogroup-150 R test 21288 4868.385454 401 120 0.000000 3672.341489 0.000000 7 0 /autogroup-150 R test 21289 4868.326776 384 120 0.000000 3424.934159 0.000000 7 0 /autogroup-150 Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <cj.chengjian@huawei.com> Cc: <huawei.libin@huawei.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1502095463-160172-2-git-send-email-xiexiuqi@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
74dc3384fc |
sched/debug: Use task_pid_nr_ns in /proc/$pid/sched
It appears as though the addition of the PID namespace did not update the output code for /proc/*/sched, which resulted in it providing PIDs that were not self-consistent with the /proc mount. This additionally made it trivial to detect whether a process was inside &init_pid_ns from userspace, making container detection trivial: https://github.com/jessfraz/amicontained This leads to situations such as: % unshare -pmf % mount -t proc proc /proc % head -n1 /proc/1/sched head (10047, #threads: 1) Fix this by just using task_pid_nr_ns for the output of /proc/*/sched. All of the other uses of task_pid_nr in kernel/sched/debug.c are from a sysctl context and thus don't need to be namespaced. Signed-off-by: Aleksa Sarai <asarai@suse.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Jess Frazelle <acidburn@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: cyphar@cyphar.com Link: http://lkml.kernel.org/r/20170806044141.5093-1-asarai@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
48365b3884 |
sched/debug: Expose the number of RT/DL tasks that can migrate
Add the value of the rt_rq.rt_nr_migratory and dl_rq.dl_nr_migratory to the sched_debug output, for instance: rt_rq[0]: .rt_nr_running : 2 .rt_nr_migratory : 1 <--- Like this .rt_throttled : 0 .rt_time : 828.645877 .rt_runtime : 1000.000000 This is useful to debug problems related to the RT/DL schedulers. This also fixes the format of some variables, that were unsigned, rather than signed. Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Link: http://lkml.kernel.org/r/7896f71cada54ee7dd8507bb666063a2e051c3d4.1498482127.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
f719ff9bce |
sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h>
But first update the code that uses these facilities with the new header. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
589ee62844 |
sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h>
Update code that relied on sched.h including various MM types for them. This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
59f8c29892 |
sched/deadline: Show leftover runtime and abs deadline in /proc/*/sched
This patch allows for reading the current (leftover) runtime and absolute deadline of a SCHED_DEADLINE task through /proc/*/sched (entries dl.runtime and dl.deadline), while debugging/testing. Signed-off-by: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@arm.com> Reviewed-by: Luca Abeni <luca.abeni@unitn.it> Acked-by: Daniel Bistrot de Oliveira <danielbristot@gmail.com> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1477473437-10346-2-git-send-email-tommaso.cucinotta@sssup.it Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
f34d3606f7 |
Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: - tracepoints for basic cgroup management operations added - kernfs and cgroup path formatting functions updated to behave in the style of strlcpy() - non-critical bug fixes * 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL cgroup: fix error handling regressions in proc_cgroup_show() and cgroup_release_agent() cpuset: fix error handling regression in proc_cpuset_show() cgroup: add tracepoints for basic operations cgroup: make cgroup_path() and friends behave in the style of strlcpy() kernfs: remove kernfs_path_len() kernfs: make kernfs_path*() behave in the style of strlcpy() kernfs: add dummy implementation of kernfs_path_from_node() |
||
|
|
4fa8d299b4 |
sched/debug: Remove several CONFIG_SCHEDSTATS guards
Clean up the sched code by removing several of the CONFIG_SCHEDSTATS
guards, using schedstat_*() macros where needed.
Code size:
!CONFIG_SCHEDSTATS defconfig:
text data bss dec hex filename
10209818 4368184 1105920 15683922 ef5152 vmlinux.before.nostats
10209818 4368184 1105920 15683922 ef5152 vmlinux.after.nostats
CONFIG_SCHEDSTATS defconfig:
text data bss dec hex filename
10214210 4370040 1105920 15690170 ef69ba vmlinux.before.stats
10214210 4370680 1105920 15690810 ef6c3a vmlinux.after.stats
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e51e0ebe5af95ac295de720dd252e7c0d2142e4a.1466184592.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
20e1d4863b |
sched/debug: Rename 'schedstat_val()' -> 'schedstat_val_or_zero()'
The schedstat_val() macro's behavior is kind of surprising: when schedstat is runtime disabled, it returns zero. Rename it to schedstat_val_or_zero(). There's also a need for a similar macro which doesn't have the 'if (schedstat_enable())' check, to avoid doing the check twice. Create a new 'schedstat_val()' macro for that. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/3bb1d2367d041fee333b0dde17171e709395b675.1466184592.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
ae92882e56 |
sched/debug: Clean up schedstat macros
The schedstat_*() macros are inconsistent: most of them take a pointer and a field which the macro combines, whereas schedstat_set() takes the already combined ptr->field. The already combined ptr->field argument is actually more intuitive and easier to use, and there's no reason to require the user to split the variable up, so convert the macros to use the combined argument. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/54953ca25bb579f3a5946432dee409b0e05222c6.1466184592.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
4c737b41de |
cgroup: make cgroup_path() and friends behave in the style of strlcpy()
cgroup_path() and friends used to format the path from the end and thus the resulting path usually didn't start at the start of the passed in buffer. Also, when the buffer was too small, the partial result was truncated from the head rather than tail and there was no way to tell how long the full path would be. These make the functions less robust and more awkward to use. With recent updates to kernfs_path(), cgroup_path() and friends can be made to behave in strlcpy() style. * cgroup_path(), cgroup_path_ns[_locked]() and task_cgroup_path() now always return the length of the full path. If buffer is too small, it contains nul terminated truncated output. * All users updated accordingly. v2: cgroup_path() usage in kernel/sched/debug.c converted. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Peter Zijlstra <peterz@infradead.org> |
||
|
|
03c041c5bf |
sched/debug: Always show 'nr_migrations'
The nr_migrations field is updated independently of CONFIG_SCHEDSTATS, so it can be displayed regardless. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5b1b04057ae2b14d73c2d03f56582c1d38cfe066.1464994423.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
9c57259117 |
sched/debug: Fix /proc/sched_debug regression
Commit: |
||
|
|
db6ea2fb09 |
sched/debug: Print out idle balance values even on !CONFIG_SCHEDSTATS kernels
The max_idle_balance_cost and avg_idle values which are tracked and ar used to capture short idle incidents, are not associated with schedstats, however the information of these two values isn't printed out on !CONFIG_SCHEDSTATS kernels. Fix this by moving the value printout out of the CONFIG_SCHEDSTATS section. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1462250305-4523-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
ef477183d0 |
sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug
Playing with SCHED_DEADLINE and cpusets, I found that I was unable to create
new SCHED_DEADLINE tasks, with the error of EBUSY as if the bandwidth was
already used up. I then realized there wa no way to see what bandwidth is
used by the runqueues to debug the issue.
By adding the dl_bw->bw and dl_bw->total_bw to the output of the deadline
info in /proc/sched_debug, this allows us to see what bandwidth has been
reserved and where a problem may exist.
For example, before the issue we see the ratio of the bandwidth:
# cat /proc/sys/kernel/sched_rt_runtime_us
950000
# cat /proc/sys/kernel/sched_rt_period_us
1000000
# grep dl /proc/sched_debug
dl_rq[0]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[1]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[2]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[3]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[4]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[5]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[6]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
dl_rq[7]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 0
Note: (950000 / 1000000) << 20 == 996147
After I played with cpusets and hit the issue, the result is now:
# grep dl /proc/sched_debug
dl_rq[0]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[1]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[2]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[3]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : 104857
dl_rq[4]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[5]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[6]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
dl_rq[7]:
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -104857
This shows that there is definitely a problem as we should never have a
negative total bandwidth.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160222212825.756849091@goodmis.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
3866e845ed |
sched/debug: Move sched_domain_sysctl to debug.c
The sched_domain_sysctl setup is only enabled when SCHED_DEBUG is configured. As debug.c is only compiled when SCHED_DEBUG is configured as well, move the setup of sched_domain_sysctl into that file. Note, the (un)register_sched_domain_sysctl() functions had to be changed from static to allow access to them from core.c. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Clark Williams <williams@redhat.com> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160222212825.599278093@goodmis.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
d6ca41d792 |
sched/debug: Move the /sys/kernel/debug/sched_features file setup into debug.c
As /sys/kernel/debug/sched_features is only created when SCHED_DEBUG is enabled, and the file debug.c is only compiled when SCHED_DEBUG is enabled, it makes sense to move sched_feature setup into that file and get rid of the #ifdef. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Clark Williams <williams@redhat.com> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160222212825.464193063@goodmis.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
cb2517653f |
sched/debug: Make schedstats a runtime tunable that is disabled by default
schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, even though it can be
disabled at build time, it is often enabled as the information is useful.
This patch adds a kernel command-line and sysctl tunable to enable or
disable schedstats on demand (when it's built in). It is disabled
by default as someone who knows they need it can also learn to enable
it when necessary.
The benefits are dependent on how scheduler-intensive the workload is.
If it is then the patch reduces the number of cycles spent calculating
the stats with a small benefit from reducing the cache footprint of the
scheduler.
These measurements were taken from a 48-core 2-socket
machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
single socket machine 8-core machine with Intel i7-3770 processors.
netperf-tcp
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v3r1
Hmean 64 560.45 ( 0.00%) 575.98 ( 2.77%)
Hmean 128 766.66 ( 0.00%) 795.79 ( 3.80%)
Hmean 256 950.51 ( 0.00%) 981.50 ( 3.26%)
Hmean 1024 1433.25 ( 0.00%) 1466.51 ( 2.32%)
Hmean 2048 2810.54 ( 0.00%) 2879.75 ( 2.46%)
Hmean 3312 4618.18 ( 0.00%) 4682.09 ( 1.38%)
Hmean 4096 5306.42 ( 0.00%) 5346.39 ( 0.75%)
Hmean 8192 10581.44 ( 0.00%) 10698.15 ( 1.10%)
Hmean 16384 18857.70 ( 0.00%) 18937.61 ( 0.42%)
Small gains here, UDP_STREAM showed nothing intresting and neither did
the TCP_RR tests. The gains on the 8-core machine were very similar.
tbench4
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v3r1
Hmean mb/sec-1 500.85 ( 0.00%) 522.43 ( 4.31%)
Hmean mb/sec-2 984.66 ( 0.00%) 1018.19 ( 3.41%)
Hmean mb/sec-4 1827.91 ( 0.00%) 1847.78 ( 1.09%)
Hmean mb/sec-8 3561.36 ( 0.00%) 3611.28 ( 1.40%)
Hmean mb/sec-16 5824.52 ( 0.00%) 5929.03 ( 1.79%)
Hmean mb/sec-32 10943.10 ( 0.00%) 10802.83 ( -1.28%)
Hmean mb/sec-64 15950.81 ( 0.00%) 16211.31 ( 1.63%)
Hmean mb/sec-128 15302.17 ( 0.00%) 15445.11 ( 0.93%)
Hmean mb/sec-256 14866.18 ( 0.00%) 15088.73 ( 1.50%)
Hmean mb/sec-512 15223.31 ( 0.00%) 15373.69 ( 0.99%)
Hmean mb/sec-1024 14574.25 ( 0.00%) 14598.02 ( 0.16%)
Hmean mb/sec-2048 13569.02 ( 0.00%) 13733.86 ( 1.21%)
Hmean mb/sec-3072 12865.98 ( 0.00%) 13209.23 ( 2.67%)
Small gains of 2-4% at low thread counts and otherwise flat. The
gains on the 8-core machine were slightly different
tbench4 on 8-core i7-3770 single socket machine
Hmean mb/sec-1 442.59 ( 0.00%) 448.73 ( 1.39%)
Hmean mb/sec-2 796.68 ( 0.00%) 794.39 ( -0.29%)
Hmean mb/sec-4 1322.52 ( 0.00%) 1343.66 ( 1.60%)
Hmean mb/sec-8 2611.65 ( 0.00%) 2694.86 ( 3.19%)
Hmean mb/sec-16 2537.07 ( 0.00%) 2609.34 ( 2.85%)
Hmean mb/sec-32 2506.02 ( 0.00%) 2578.18 ( 2.88%)
Hmean mb/sec-64 2511.06 ( 0.00%) 2569.16 ( 2.31%)
Hmean mb/sec-128 2313.38 ( 0.00%) 2395.50 ( 3.55%)
Hmean mb/sec-256 2110.04 ( 0.00%) 2177.45 ( 3.19%)
Hmean mb/sec-512 2072.51 ( 0.00%) 2053.97 ( -0.89%)
In constract, this shows a relatively steady 2-3% gain at higher thread
counts. Due to the nature of the patch and the type of workload, it's
not a surprise that the result will depend on the CPU used.
hackbench-pipes
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v3r1
Amean 1 0.0637 ( 0.00%) 0.0660 ( -3.59%)
Amean 4 0.1229 ( 0.00%) 0.1181 ( 3.84%)
Amean 7 0.1921 ( 0.00%) 0.1911 ( 0.52%)
Amean 12 0.3117 ( 0.00%) 0.2923 ( 6.23%)
Amean 21 0.4050 ( 0.00%) 0.3899 ( 3.74%)
Amean 30 0.4586 ( 0.00%) 0.4433 ( 3.33%)
Amean 48 0.5910 ( 0.00%) 0.5694 ( 3.65%)
Amean 79 0.8663 ( 0.00%) 0.8626 ( 0.43%)
Amean 110 1.1543 ( 0.00%) 1.1517 ( 0.22%)
Amean 141 1.4457 ( 0.00%) 1.4290 ( 1.16%)
Amean 172 1.7090 ( 0.00%) 1.6924 ( 0.97%)
Amean 192 1.9126 ( 0.00%) 1.9089 ( 0.19%)
Some small gains and losses and while the variance data is not included,
it's close to the noise. The UMA machine did not show anything particularly
different
pipetest
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v2r2
Min Time 4.13 ( 0.00%) 3.99 ( 3.39%)
1st-qrtle Time 4.38 ( 0.00%) 4.27 ( 2.51%)
2nd-qrtle Time 4.46 ( 0.00%) 4.39 ( 1.57%)
3rd-qrtle Time 4.56 ( 0.00%) 4.51 ( 1.10%)
Max-90% Time 4.67 ( 0.00%) 4.60 ( 1.50%)
Max-93% Time 4.71 ( 0.00%) 4.65 ( 1.27%)
Max-95% Time 4.74 ( 0.00%) 4.71 ( 0.63%)
Max-99% Time 4.88 ( 0.00%) 4.79 ( 1.84%)
Max Time 4.93 ( 0.00%) 4.83 ( 2.03%)
Mean Time 4.48 ( 0.00%) 4.39 ( 1.91%)
Best99%Mean Time 4.47 ( 0.00%) 4.39 ( 1.91%)
Best95%Mean Time 4.46 ( 0.00%) 4.38 ( 1.93%)
Best90%Mean Time 4.45 ( 0.00%) 4.36 ( 1.98%)
Best50%Mean Time 4.36 ( 0.00%) 4.25 ( 2.49%)
Best10%Mean Time 4.23 ( 0.00%) 4.10 ( 3.13%)
Best5%Mean Time 4.19 ( 0.00%) 4.06 ( 3.20%)
Best1%Mean Time 4.13 ( 0.00%) 4.00 ( 3.39%)
Small improvement and similar gains were seen on the UMA machine.
The gain is small but it stands to reason that doing less work in the
scheduler is a good thing. The downside is that the lack of schedstats and
tracepoints may be surprising to experts doing performance analysis until
they find the existence of the schedstats= parameter or schedstats sysctl.
It will be automatically activated for latencytop and sleep profiling to
alleviate the problem. For tracepoints, there is a simple warning as it's
not safe to activate schedstats in the context when it's known the tracepoint
may be wanted but is unavailable.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|