Commit Graph

2443 Commits

Author SHA1 Message Date
Pavankumar Kondeti
256be6b2c2 sched: Remove unnecessary calls to cpufreq_update_util()
The cpufreq update util calls after update_task_ravg() are
no longer needed. This is taken care of in WALT when a window is
rolled over. Besides, these calls were ineffectual since
SCHED_CPUFREQ_WALT wasn't passed in the flag parameter.

Change-Id: I28ac40b33662584ec9f8fff116e66a6f33a8d010
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-05-09 13:00:39 +05:30
Pavankumar Kondeti
7fa1540f0b sched/walt: Fix stale window start marker passed to the schedutil
With commit d8c5bfcc07 ("sched: Make sure window start passed to
schedutil is consistent"), the rq->load_reported_window is presented
to the governor as the window_start marker. The rq->load_reported_window
is updated when load is reported to governor only during a window rollover.
So it should be consistent with the current window start mark. But for a
just hotplugged in CPU, the rq->load_reported_window is not updated
until the next window rollover.

If the load is reported for any other reason before the next window
rollover, the window start marker passed to the schedutil would be
stale, which leads to a BUG_ON() in schedutil. The most recent window start marker
is cached in WALT in walt_irq_work_lastq_ws. Use this instead of
load_reported_window to fix this problem.

The rq->window_start is cached in rq->load_reported_window to filter
the utilization updates in the same window. This is not needed since
utilization updates are not sent when SCHED_CPUFREQ_WALT flag is not
set. So kill the load_reported_window maintenance.

Change-Id: Idaefcb0b9cecb15ea436ac7a66cb6da81e3852a1
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-05-09 11:32:35 +05:30
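
The stale-marker problem described in 7fa1540f0b can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `rq_model`, `window_rollover()`, `marker_old()`, `marker_new()` and `marker_is_consistent()` are hypothetical names, `last_window_start` merely stands in for walt_irq_work_lastq_ws, and the consistency check is an assumed stand-in for the governor-side BUG_ON(), not the actual schedutil code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the kernel's struct rq. */
struct rq_model {
    uint64_t load_reported_window; /* updated only during a window rollover */
};

/* Stands in for walt_irq_work_lastq_ws: the most recent window start. */
static uint64_t last_window_start;

/* Window rollover: advance the cached marker and report load for the
 * CPUs that are online at that moment. */
static void window_rollover(struct rq_model *online[], int n, uint64_t ws)
{
    last_window_start = ws;
    for (int i = 0; i < n; i++)
        online[i]->load_reported_window = ws;
}

/* Old behaviour: per-rq marker, stale on a just hotplugged-in CPU. */
static uint64_t marker_old(const struct rq_model *rq)
{
    return rq->load_reported_window;
}

/* New behaviour: always use the cached most recent window start. */
static uint64_t marker_new(const struct rq_model *rq)
{
    (void)rq; /* the per-rq field is no longer consulted */
    return last_window_start;
}

/* Assumed governor-side check: the marker must never go backwards. */
static bool marker_is_consistent(uint64_t marker, uint64_t governor_ws)
{
    return marker >= governor_ws;
}
```

In this model, a CPU that was offline during the last rollover still carries its old load_reported_window, so only `marker_new()` passes the check.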
Linux Build Service Account
763878b6f2 Merge "core_ctl: Fix an issue where CPUs are left un-isolated for long time" 2018-05-03 19:59:27 -07:00
Linux Build Service Account
78667dedc6 Merge "Merge android-4.9.90 (dd1e37e) into msm-4.9" 2018-05-03 09:02:10 -07:00
Pavankumar Kondeti
f7ed52d26c core_ctl: Fix an issue where CPUs are left un-isolated for long time
When SCHED_CORE_ROTATE config is enabled, the CPUs that are
eligible for isolation are kept rotated for every system
suspend and resume cycle. cluster->set_cur holds this eligible
mask. It is also reconfigured when min_cpus tunable is changed.

The CPUs that are part of this eligible mask are only isolated
in try_to_isolate(). A CPU that is part of this mask but is busy
at that time is left un-isolated. Since the new need is the same as
the last need, eval_need() does not kick the core_ctl thread the next
time the CPU becomes idle. To fix this issue, kick the core_ctl
thread when there are more active CPUs than currently needed. The kicks
are rate limited by an existing tunable called offline_delay_ms.

Change-Id: I9d3815c6c6bede4b93a708ae6edb15f94d296399
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-05-02 13:52:52 +05:30
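
The kick condition described in f7ed52d26c can be sketched as follows. This is a simplified model, not the core_ctl implementation: `cluster_model`, `should_kick()` and the `last_kick_ms` field are hypothetical, and the assumption that a changed need always kicks immediately is made only for illustration. Only offline_delay_ms is a name taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a core_ctl cluster. */
struct cluster_model {
    unsigned int active_cpus;   /* CPUs currently not isolated */
    unsigned int need_cpus;     /* need computed by the last evaluation */
    uint64_t last_kick_ms;      /* hypothetical: time of the last kick */
    uint64_t offline_delay_ms;  /* existing tunable used as the rate limit */
};

/* Returns true when the core_ctl thread should be woken up. */
static bool should_kick(struct cluster_model *c, unsigned int new_need,
                        uint64_t now_ms)
{
    bool need_changed = new_need != c->need_cpus;
    bool surplus = c->active_cpus > new_need;

    c->need_cpus = new_need;

    /* Assumed pre-existing behaviour: a changed need always kicks. */
    if (need_changed) {
        c->last_kick_ms = now_ms;
        return true;
    }

    /* Behaviour added by the fix: kick even when the need is unchanged,
     * as long as more CPUs are active than needed, rate limited. */
    if (surplus && now_ms - c->last_kick_ms >= c->offline_delay_ms) {
        c->last_kick_ms = now_ms;
        return true;
    }
    return false;
}
```

With an unchanged need and surplus active CPUs, the model kicks at most once per offline_delay_ms, which gives a previously busy CPU another chance to be isolated.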
Linux Build Service Account
47401c1e27 Merge "sched: Allow __sched_setscheduler call from interrupt context if not pi" 2018-04-24 11:05:51 -07:00
Linux Build Service Account
38696730d6 Merge "sched: Add trace point to track preemption disable callers" 2018-04-24 11:05:46 -07:00
Linux Build Service Account
c068727629 Merge "sched/fair: Add bias towards previous CPU for high wakeup rate tasks" 2018-04-23 14:46:09 -07:00
Pavankumar Kondeti
97f08d47f4 sched: Add trace point to track preemption disable callers
Add trace point to track preemption disable callers to
isolate issues unrelated to scheduler and improve debug
turn around time.

Change-Id: If9303b7165167e8f79cd339929daf4afc31a61c4
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
2018-04-21 15:47:14 +05:30
Maria Yu
c1979b8f99 sched: Allow __sched_setscheduler call from interrupt context if not pi
BUG() is only needed for the pi case, which needs to call
an rtmutex function and grab a spin_lock.
The non-pi case is a simplified path used to normalize
tasks, so it is safe to call from interrupt
context.

CRs-Fixed: 2101157
Change-Id: Icdcb9356692fc30d7b5af40f816381df5b0f0d58
Signed-off-by: Maria Yu <aiquny@codeaurora.org>
2018-04-20 10:17:14 +08:00
Blagovest Kolenichev
39b8bb4d84 Merge android-4.9.89 (960923f) into msm-4.9
* refs/heads/tmp-960923f:
  Linux 4.9.89
  usb: gadget: bdc: 64-bit pointer capability check
  usb: dwc3: Fix GDBGFIFOSPACE_TYPE values
  USB: gadget: udc: Add missing platform_device_put() on error in bdc_pci_probe()
  scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
  btrfs: Fix use-after-free when cleaning up fs_devs with a single stale device
  btrfs: alloc_chunk: fix DUP stripe size handling
  scsi: sg: only check for dxfer_len greater than 256M
  scsi: sg: fix static checker warning in sg_is_valid_dxfer
  scsi: sg: fix SG_DXFER_FROM_DEV transfers
  irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis
  fs/aio: Use RCU accessors for kioctx_table->table[]
  fs/aio: Add explicit RCU grace period when freeing kioctx
  lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
  fs: Teach path_connected to handle nfs filesystems with multiple roots.
  drm/amdgpu/dce: Don't turn off DP sink when disconnected
  drm/amdgpu: fix prime teardown order
  ALSA: seq: Clear client entry before deleting else at closing
  ALSA: seq: Fix possible UAF in snd_seq_check_queue()
  ALSA: hda - Revert power_save option default value
  ALSA: pcm: Fix UAF in snd_pcm_oss_get_formats()
  parisc: Handle case where flush_cache_range is called with no context
  x86/mm: Fix vmalloc_fault to use pXd_large
  x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
  x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels
  x86/vm86/32: Fix POPF emulation
  selftests/x86/entry_from_vm86: Add test cases for POPF
  selftests/x86: Add tests for the STR and SLDT instructions
  selftests/x86: Add tests for User-Mode Instruction Prevention
  selftests/x86/entry_from_vm86: Exit with 1 if we fail
  x86/cpufeatures: Add Intel PCONFIG cpufeature
  x86/boot/32: Fix UP boot on Quark and possibly other platforms
  net: hns: Some checkpatch.pl script & warning fixes
  ima: relax requiring a file signature for new files with zero length
  locking/locktorture: Fix num reader/writer corner cases
  rcutorture/configinit: Fix build directory error message
  ipvlan: add L2 check for packets arriving via virtual devices
  ASoC: nuc900: Fix a loop timeout test
  mac80211: remove BUG() when interface type is invalid
  mac80211_hwsim: enforce PS_MANUAL_POLL to be set after PS_ENABLED
  agp/intel: Flush all chipset writes after updating the GGTT
  powerpc/modules: Don't try to restore r2 after a sibling call
  drm/amdkfd: Fix memory leaks in kfd topology
  veth: set peer GSO values
  media: cpia2: Fix a couple off by one bugs
  media: vsp1: Prevent suspending and resuming DRM pipelines
  scsi: dh: add new rdac devices
  scsi: devinfo: apply to HP XP the same flags as Hitachi VSP
  scsi: core: scsi_get_device_flags_keyed(): Always return device flags
  bnxt_en: Don't print "Link speed -1 no longer supported" messages.
  spi: sun6i: disable/unprepare clocks on remove
  tools/usbip: fixes build with musl libc toolchain
  ath10k: fix invalid STS_CAP_OFFSET_MASK
  mwifiex: cfg80211: do not change virtual interface during scan processing
  clk: qcom: msm8916: fix mnd_width for codec_digcodec
  pwm: stmpe: Fix wrong register offset for hwpwm=2 case
  scsi: ses: don't ask for diagnostic pages repeatedly during probe
  ath10k: update tdls teardown state to target
  power: supply: ab8500_charger: Bail out in case of error in 'ab8500_charger_init_hw_registers()'
  power: supply: ab8500_charger: Fix an error handling path
  leds: pm8058: Silence pointer to integer size warning
  userns: Don't fail follow_automount based on s_user_ns
  mtd: nand: ifc: update bufnum mask for ver >= 2.0.0
  ARM: dts: omap3-n900: Fix the audio CODEC's reset pin
  ARM: dts: am335x-pepper: Fix the audio CODEC's reset pin
  net: thunderx: Set max queue count taking XDP_TX into account
  mtd: nand: fix interpretation of NAND_CMD_NONE in nand_command[_lp]()
  net: xfrm: allow clearing socket xfrm policies.
  net: ieee802154: adf7242: Fix bug if defined DEBUG
  test_firmware: fix setting old custom fw path back on exit
  sched: Stop resched_cpu() from sending IPIs to offline CPUs
  sched: Stop switched_to_rt() from sending IPIs to offline CPUs
  ARM: dts: exynos: Correct Trats2 panel reset line
  clk: meson: gxbb: fix wrong clock for SARADC/SANA
  iwlwifi: mvm: rs: don't override the rate history in the search cycle
  HID: elo: clear BTN_LEFT mapping
  video/hdmi: Allow "empty" HDMI infoframes
  drm/edid: set ELD connector type in drm_edid_to_eld()
  mwifiex: Fix invalid port issue
  perf stat: Fix bug in handling events in error state
  wil6210: fix memory access violation in wil_memcpy_from/toio_32
  wil6210: fix protection against connections during reset
  ath10k: fix compile time sanity check for CE4 buffer size
  mac80211_hwsim: use per-interface power level
  Bluetooth: 6lowpan: fix delay work init in add_peer_chan()
  Bluetooth: Avoid bt_accept_unlink() double unlinking
  clk: qcom: msm8996: Fix the vfe1 powerdomain name
  pwm: tegra: Increase precision in PWM rate calculation
  kprobes/x86: Set kprobes pages read-only
  kprobes/x86: Fix kprobe-booster not to boost far call instructions
  ALSA: hda: Add Geminilake id to SKL_PLUS
  scsi: sg: close race condition in sg_remove_sfp_usercontext()
  scsi: sg: check for valid direction before starting the request
  vfio/spapr_tce: Check kzalloc() return when preregistering memory
  vfio/powerpc/spapr_tce: Enforce IOMMU type compatibility check
  perf session: Don't rely on evlist in pipe mode
  net: fec: add phy-reset-gpios PROBE_DEFER check
  perf inject: Copy events when reordering events in pipe mode
  drivers/perf: arm_pmu: handle no platform_device
  iwlwifi: mvm: fix RX SKB header size and align it properly
  perf evsel: Return exact sub event which failed with EPERM for wildcards
  usb: gadget: dummy_hcd: Fix wrong power status bit clear/reset in dummy_hub_control()
  usb: dwc2: Make sure we disconnect the gadget state
  powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit()
  md.c:didn't unlock the mddev before return EINVAL in array_size_store
  md/raid6: Fix anomily when recovering a single device in RAID6.
  regulator: isl9305: fix array size
  v4l: vsp1: Register pipe with output WPF
  v4l: vsp1: Prevent multiple streamon race commencing pipeline early
  MIPS: r2-on-r6-emu: Clear BLTZALL and BGEZALL debugfs counters
  MIPS: r2-on-r6-emu: Fix BLEZL and BGTZL identification
  MIPS: BPF: Fix multiple problems in JIT skb access helpers.
  MIPS: BPF: Quit clobbering callee saved registers in JIT code.
  serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off
  tty: amba-pl011: Fix spurious TX interrupts
  lkdtm: turn off kcov for lkdtm_rodata_do_nothing:
  coresight: Fixes coresight DT parse to get correct output port ID.
  i40e: only register client on iWarp-capable devices
  drm/rockchip: vop: Enable pm domain before vop_initial
  drm/amdgpu: Fail fb creation from imported dma-bufs. (v2)
  drm/radeon: Fail fb creation from imported dma-bufs.
  video: ARM CLCD: fix dma allocation size
  kvm: nVMX: Disallow userspace-injected exceptions in guest mode
  kvm/svm: Setup MCG_CAP on AMD properly
  iommu/iova: Fix underflow bug in __alloc_and_insert_iova_range
  apparmor: Make path_max parameter readonly
  qed: Correct MSI-x for storage
  scsi: ses: don't get power status of SES device slot on probe
  EDAC, altera: Fix peripheral warnings for Cyclone5
  fm10k: correctly check if interface is removed
  ALSA: firewire-digi00x: handle all MIDI messages on streaming packets
  ALSA: firewire-digi00x: add support for console models of Digi00x series
  IB/hfi1: Check for QSFP presence before attempting reads
  ASoC: rt5677: Add OF device ID table
  reiserfs: Make cancel_old_flush() reliable
  ARM: dts: koelsch: Correct clock frequency of X2 DU clock input
  drm: rcar-du: Handle event when disabling CRTCs
  printk: Correctly handle preemption in console_unlock()
  rtmutex: Fix PI chain order integrity
  qed: Fix TM block ILT allocation
  net/faraday: Add missing include of of.h
  net: hns: Correct HNS RSS key set function
  powerpc: Avoid taking a data miss on every userspace instruction miss
  ARM: dts: r8a7793: Correct parent of SSI[0-9] clocks
  ARM: dts: r8a7791: Correct parent of SSI[0-9] clocks
  ARM: dts: r8a7790: Correct parent of SSI[0-9] clocks
  ARM: dts: r7s72100: fix ethernet clock parent
  NFC: pn533: change order of free_irq and dev unregistration
  NFC: nfcmrvl: double free on error path
  NFC: nfcmrvl: Include unaligned.h instead of access_ok.h
  vxlan: vxlan dev should inherit lowerdev's gso_max_size
  drm/vmwgfx: Fixes to vmwgfx_fb
  braille-console: Fix value returned by _braille_console_setup
  powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout
  PCI: Apply Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
  bonding: refine bond_fold_stats() wrap detection
  drm/ttm: never add BO that failed to validate to the LRU list
  f2fs: relax node version check for victim data in gc
  perf trace: Handle unpaired raw_syscalls:sys_exit event
  regulator: core: Limit propagation of parent voltage count and list
  blk-throttle: make sure expire time isn't too big
  ARM: dts: silk: Correct clock of DU1
  ARM: dts: r8a7794: Correct clock of DU1
  ARM: dts: r8a7794: Add DU1 clock to device tree
  ALSA: firewire-lib: add a quirk of packet without valid EOH in CIP format
  mm: Fix false-positive VM_BUG_ON() in page_cache_{get,add}_speculative()
  bonding: make speed, duplex setting consistent with link state
  driver: (adm1275) set the m,b and R coefficients correctly for power
  scsi: be2iscsi: Check tag in beiscsi_mccq_compl_wait
  i40e/i40evf: Fix use after free in Rx cleanup path
  perf buildid: Do not assume that readlink() returns a null terminated string
  perf annotate: Fix a bug following symbolic link of a build-id file
  ARM: dts: bcm2835: add index to the ethernet alias
  usb: dwc3: make sure UX_EXIT_PX is cleared
  dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped
  tcp: sysctl: Fix a race to avoid unexpected 0 window from space
  spi: omap2-mcspi: poll OMAP2_MCSPI_CHSTAT_RXS for PIO transfer
  ASoC: rcar: ssi: don't set SSICR.CKDV = 000 with SSIWSR.CONT
  PCI: hv: Lock PCI bus on device eject
  PCI: hv: Properly handle PCI bus remove
  sched: act_csum: don't mangle TCP and UDP GSO packets
  Input: qt1070 - add OF device ID table
  sysrq: Reset the watchdog timers while displaying high-resolution timers
  timers, sched_clock: Update timeout for clock wrap
  media: i2c/soc_camera: fix ov6650 sensor getting wrong clock
  scsi: ipr: Fix missed EH wakeup
  scsi: fnic: Fix for "Number of Active IOs" in fnicstats becoming negative
  x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
  solo6x10: release vb2 buffers in solo_stop_streaming()
  of: fix of_device_get_modalias returned length when truncating buffers
  batman-adv: handle race condition for claims between gateways
  zd1211rw: fix NULL-deref at probe
  s390/topology: fix typo in early topology code
  qed: Always publish VF link from leading hwfn
  ARM: dts: Adjust moxart IRQ controller and flags
  net/8021q: create device with all possible features in wanted_features
  HID: clamp input to logical range if no null state
  perf probe: Return errno when not hitting any event
  perf probe: Fix concat_probe_trace_events
  omapfb: dss: Handle return errors in dss_init_ports()
  x86/mce: Init some CPU features early
  netem: apply correct delay when rate throttling
  net: ethernet: bgmac: Allow MAC address to be specified in DTB
  ARM: bcm2835: Enable missing CMA settings for VC4 driver
  usb: misc: lvs: fix race condition in disconnect handling
  ath10k: fix fetching channel during potential radar detection
  ath10k: disallow DFS simulation if DFS channel is not enabled
  drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)
  drivers: net: xgene: Fix Rx checksum validation logic
  drivers: net: xgene: Fix wrong logical operation
  drivers: net: phy: xgene: Fix mdio write
  drivers: net: xgene: Fix hardware checksum setting
  ARM: brcmstb: Enable ZONE_DMA for non 64-bit capable peripherals
  perf tools: Make perf_event__synthesize_mmap_events() scale
  i40e: fix ethtool to get EEPROM data from X722 interface
  i40e: Acquire NVM lock before reads on all devices
  eventpoll.h: fix epoll event masks
  x86/mce: Handle broadcasted MCE gracefully with kexec
  perf sort: Fix segfault with basic block 'cycles' sort dimension
  x86/mm: Make mmap(MAP_32BIT) work correctly
  selinux: check for address length in selinux_socket_bind()
  PCI/MSI: Stop disabling MSI/MSI-X in pci_device_shutdown()
  drm/sun4i: Fix TCON clock and regmap initialization sequence
  ath10k: fix a warning during channel switch with multiple vaps
  drm/sun4i: Set drm_crtc.port to the underlying TCON's output port node
  drm/sun4i: Fix up error path cleanup for master bind function
  arm64: dts: r8a7796: Remove unit-address and reg from integrated cache
  ARM: dts: r8a7794: Remove unit-address and reg from integrated cache
  ARM: dts: r8a7793: Remove unit-address and reg from integrated cache
  ARM: dts: r8a7792: Remove unit-address and reg from integrated cache
  ARM: dts: r8a7791: Remove unit-address and reg from integrated cache
  drm: qxl: Don't alloc fbdev if emulation is not supported
  HID: reject input outside logical range only if null state is set
  staging: wilc1000: add check for kmalloc allocation failure.
  staging: speakup: Replace BUG_ON() with WARN_ON().
  perf stat: Issue a HW watchdog disable hint
  Input: tsc2007 - check for presence and power down tsc2007 during probe
  blkcg: fix double free of new_blkg in blkcg_init_queue
  staging: android: ashmem: Fix possible deadlock in ashmem_ioctl

Conflicts:
	drivers/net/wireless/ath/wil6210/main.c
	drivers/usb/dwc3/core.h

Change-Id: I2d77962cfc3dbc8b051ba51bf10b577d327ffaa9
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2018-04-17 10:45:20 -07:00
Pavankumar Kondeti
7cc02920f7 sched/fair: Add bias towards previous CPU for high wakeup rate tasks
Skip the CPU selection algorithm for high wakeup rate tasks and select
the previous CPU if it is idle.

Change-Id: I99493c8c37e091c79723ee22a5230fbb6f4e33b9
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-17 22:41:40 +05:30
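
The shortcut described in 7cc02920f7 amounts to an early return ahead of the regular CPU search. A minimal sketch, assuming hypothetical names throughout (`select_cpu_model()`, `full_cpu_search_model()`, and a plain `cpu_idle` array in place of the scheduler's idle checks); how a task is classified as having a high wakeup rate is not covered here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the full (expensive) CPU selection algorithm. */
static int full_cpu_search_model(void)
{
    return 0; /* pretend the search always settles on CPU 0 */
}

/* Pick a CPU for a waking task. */
static int select_cpu_model(bool high_wakeup_rate, int prev_cpu,
                            const bool *cpu_idle)
{
    /* Bias: a frequently waking task goes back to its previous CPU
     * when that CPU is idle, skipping the search entirely. */
    if (high_wakeup_rate && cpu_idle[prev_cpu])
        return prev_cpu;

    return full_cpu_search_model();
}
```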
Vikram Mulukutla
77ecebbcf0 sched: walt: Optimize cycle counter reads
The cycle counter read is a bit of an expensive operation and requires
locking across all CPUs in a frequency domain. Optimize this by
returning the same value if the delta between two reads is zero, i.e.
the two reads are done in the same sched context for the same CPU.

Change-Id: I99da5a704d3652f53c8564ba7532783d3288f227
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
[pkondeti@codeaurora.org: limit the optimization to the
same CPU for the sched context]
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-17 22:31:38 +05:30
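
The caching idea in 77ecebbcf0 can be sketched in user space as below. All names are hypothetical (`cc_cache`, `read_hw_cycles()`, `get_cycles()`), and the "same sched context" condition is approximated here by comparing the timestamp of the two reads, which is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical cache of the last cycle counter read. */
struct cc_cache {
    int cpu;            /* CPU the cached value belongs to */
    uint64_t wallclock; /* timestamp of the cached read */
    uint64_t cycles;    /* cached counter value */
    int valid;          /* non-zero once the cache holds a value */
};

/* Counts how often the (expensive) hardware read is performed. */
static unsigned int hw_reads;

/* Stand-in for the expensive, lock-protected hardware read. */
static uint64_t read_hw_cycles(int cpu)
{
    hw_reads++;
    return 1000u * (uint64_t)(cpu + 1) + hw_reads;
}

/* Return the cached value when the delta between the two reads is zero
 * and both reads are for the same CPU; otherwise read the hardware. */
static uint64_t get_cycles(struct cc_cache *c, int cpu, uint64_t wallclock)
{
    if (c->valid && c->cpu == cpu && c->wallclock == wallclock)
        return c->cycles;

    c->cycles = read_hw_cycles(cpu);
    c->cpu = cpu;
    c->wallclock = wallclock;
    c->valid = 1;
    return c->cycles;
}
```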
Pavankumar Kondeti
edd112fb17 sched/fair: Force CPU capacity update from higher level sched domain
A CPU capacity is updated in update_cpu_capacity(). This is called
when the CPU is doing load balance in the lowest level sched domain.
The load balance is disabled for the lowest level sched domain, when
there is only 1 CPU in the higher level domain. When there is only
1 CPU in the higher capacity cluster, the CPU capacity and the
root domain's max_cpu_capacity are not getting updated correctly.

Due to the above mentioned problem, a task running on the lower
capacity CPU never upmigrates to the higher capacity CPU thinking
it is running on the max capacity CPU. Fix this by forcing the
CPU capacity update from a higher level sched domain if it has
only 1 CPU.

Change-Id: Iddb900962ab988ffbd08bc92348bd491fa2002ad
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-17 10:58:35 +05:30
Linux Build Service Account
65d8f35b3d Merge "sched/fair: Consider only idle CPUs for active migration" 2018-04-15 08:16:32 -07:00
Linux Build Service Account
72ff2188dd Merge "sched: Fix incorrect usage of SCHED_CPUFREQ_INTERCLUSTER_MIG flag" 2018-04-12 13:55:31 -07:00
Pavankumar Kondeti
b1bd5e380a sched: Fix compilation issues with schedutil for !SCHED_WALT
sched_ravg_window and sysctl_sched_use_walt_cpu_util are WALT specific
tunables. These are used outside SCHED_WALT code in schedutil. Add stubs
for these to fix compilation with WALT disabled.

Change-Id: I3f61cf06a857d52da5eda1d51701ad7cc21598d3
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-06 06:20:03 +05:30
Pavankumar Kondeti
f9b1af1cd6 sched/fair: Bring sched_smp_overlap_capacity out of WALT
sched_smp_overlap_capacity decides the packing on the primary
cluster on an SMP system with more than 1 cluster. Currently this
threshold is defined in WALT but it is accessed outside WALT code.
Bring it out of WALT to fix compilation issue when WALT is disabled.

Change-Id: I731e736f7a3717ff9998a1d2cdbc6709f5ce1c4e
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-06 06:19:24 +05:30
Pavankumar Kondeti
4d5dd1ca73 sched: Fix incorrect usage of SCHED_CPUFREQ_INTERCLUSTER_MIG flag
Mark the source/destination CPUs correctly for inter cluster
migration.

Change-Id: I771b9357d20cb0270465abd594fb94bb3669c936
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-05 08:26:43 +05:30
Pavankumar Kondeti
398bb7dba1 sched/fair: Fix is_packing_eligible() for !SCHED_WALT
The find_best_target() checks if packing is eligible or not on
an active CPU before selecting it as the primary candidate CPU.
The is_packing_eligible() function checks if the currently
waking task can fit on the active CPU without increasing the
OPP or not. The current implementation only works for WALT. Make
it work even when WALT is disabled.

Change-Id: Iba7a2b6aca2dd2feca6f2e910888276113969186
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:53:25 +05:30
Pavankumar Kondeti
3521322b09 sched: Fix compilation issue in task_tick_fair() for !SCHED_WALT
The misfit task count tracking is currently implemented only for
WALT. Refactor the misfit task update in task_tick_fair() to
avoid accessing misfit method in task_struct when WALT is disabled.

Change-Id: I5ce1a7ed7edfe1f119e8a13b8b1fbe06ee0b5ab8
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:21 +05:30
Pavankumar Kondeti
39f8279794 sched: Move sched_boost defines out of SCHED_WALT
Currently the sched_boost is compiled only for WALT. But sched_boost
defines are accessed outside WALT in RT code. Move these definitions
outside WALT to fix compilation issues.

Change-Id: I38d006c9edb8c859aa10d0638aa271eaaa4db40d
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:21 +05:30
Pavankumar Kondeti
55beff90f4 sched: Fix a compilation issue in find_best_target() for !SCHED_WALT
A CPU is marked as reserved when a task is under active migration to
that CPU. The reserved CPUs are excluded from CPU selection in
find_best_target(). This reservation scheme is currently tied to WALT.
So add a stub function for is_reserved() to avoid compilation issue
when WALT is disabled.

Change-Id: I77c6a8fd3f24f1763fd97fce18574613e9e947ff
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:21 +05:30
Pavankumar Kondeti
5eaea3749e sched: Fix a compilation issue when WALT is disabled
find_rtg_target() is a WALT specific function. Accessing it outside
WALT code is resulting in a compilation error. Fix this by adding
a stub for find_rtg_target().

Change-Id: Iccc2c2ef1490e66f7c33b8d6d4afda18c398d438
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:20 +05:30
Pavankumar Kondeti
3a7f9a4dce sched: Add a stub for walt_cpu_high_irqload() for !SCHED_WALT
walt_cpu_high_irqload() is a WALT specific function that indicates
whether a CPU is busy with high irq load or not. This function
is called from CPU selection code which is not tied to WALT which
results in a compilation error. Fix this by adding a stub for
walt_cpu_high_irqload().

Change-Id: Ib0da7c178f90f5b48c168f2143281745297bb593
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:20 +05:30
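
The stub pattern used by 3a7f9a4dce and the neighbouring !SCHED_WALT fixes can be sketched like this. The macro `CONFIG_SCHED_WALT_MODEL`, the function `cpu_high_irqload_model()`, the threshold of 10 and the `pick_cpu()` caller are all hypothetical; the sketch also assumes the stub reports "no high irq load", which seems the natural default but is not stated in the commit message.

```c
#include <stdbool.h>

#ifdef CONFIG_SCHED_WALT_MODEL
/* "Real" variant: only built when the feature is configured in. */
static unsigned long irqload_model[4];

static inline bool cpu_high_irqload_model(int cpu)
{
    return irqload_model[cpu] >= 10; /* hypothetical threshold */
}
#else
/* Stub variant: keeps callers compiling when the feature is disabled. */
static inline bool cpu_high_irqload_model(int cpu)
{
    (void)cpu;
    return false;
}
#endif

/* Caller that is not tied to the feature: it compiles in both
 * configurations because the symbol always exists. */
static int pick_cpu(const int *candidates, int n)
{
    for (int i = 0; i < n; i++) {
        if (!cpu_high_irqload_model(candidates[i]))
            return candidates[i];
    }
    return -1; /* every candidate was rejected */
}
```

Placing the stub next to the real declaration keeps the #ifdef out of the calling code, which is the effect these commits aim for.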
Pavankumar Kondeti
758a8ccd6b sched/tune: Fix compilation issue when WALT is disabled
schedtune_attach() functionality is only needed when WALT is enabled.
Define a stub function for it when WALT is disabled.

Change-Id: I57a9543f53cdee1e5930756f3f06f47b2f88152e
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:20 +05:30
Pavankumar Kondeti
a98dc2aeac sched: Define a stub function for sched_irqload when WALT is disabled
sched_irqload() is a WALT specific function which returns the average
IRQ load on a given CPU. It is accessed in sched_cpu_util trace point
for printing the IRQ load on different CPUs evaluated in CPU selection
algorithm. This trace point is not tied to WALT, so define a stub
function for sched_irqload().

Change-Id: Ie8f5365ddd5b7a4a0dea3a190b18c54d4d3c2d4b
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:20 +05:30
Pavankumar Kondeti
152adabb8c sched: Get sched_task_util trace point working for !SCHED_WALT
sched_task_util trace point depends on task's mark_start for
calculating the time taken for CPU selection algorithm in
select_energy_cpu_brute(). This results in a compilation error,
since the mark_start is not available when SCHED_WALT is disabled.

sched_task_util trace point is not tied to WALT. Fix this issue by
using sched_clock() instead of mark_start. The sched_clock() is
accessed only when the trace point is enabled, so there will not be
any additional overhead.

Change-Id: Ide67741a188da13911929422c0bb1b5af2d29826
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 22:51:20 +05:30
Pavankumar Kondeti
448f9ac778 sched/fair: Consider only idle CPUs for active migration
A task that does not fit on its current CPU is actively migrated
to a higher capacity CPU from the scheduler tick path. Since
active migration involves stopping the currently running task
and migrating to a different CPU, it is better to do this only
if the target CPU is idle. The task does not have to necessarily
run on the lower capacity cluster until the next tick. When a
higher capacity CPU becomes idle, it can pull the misfit task
running on the lower capacity CPU.

Change-Id: Idda0bd6ac8cc4bc22f5eddc69236014d04708ecd
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-04-03 10:26:10 +05:30
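
The restriction introduced by 448f9ac778 can be sketched as a filter on the migration target. This is a simplified model with hypothetical names (`cpu_model`, `find_upmigrate_target()`); the tie-breaking rule of preferring the highest capacity idle CPU is an assumption for illustration, not something the commit message specifies.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-CPU state. */
struct cpu_model {
    unsigned long capacity; /* relative compute capacity */
    bool idle;              /* true when nothing is running on the CPU */
};

/* Return the index of an idle CPU with more capacity than src,
 * or -1 when no such CPU exists and the misfit task should stay put. */
static int find_upmigrate_target(const struct cpu_model *cpus, int n, int src)
{
    int best = -1;

    for (int i = 0; i < n; i++) {
        if (i == src || !cpus[i].idle)
            continue; /* busy CPUs are never active migration targets */
        if (cpus[i].capacity <= cpus[src].capacity)
            continue; /* only higher capacity CPUs help a misfit task */
        if (best < 0 || cpus[i].capacity > cpus[best].capacity)
            best = i;
    }
    return best;
}
```

When the function returns -1, the task keeps running where it is; per the commit message, a higher capacity CPU can still pull it later once that CPU becomes idle.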
Linux Build Service Account
35a3763fac Merge "cpufreq: schedutil: update warn_on with bug_on" 2018-03-30 15:08:58 -07:00
Pavankumar Kondeti
646fe8f4a7 sched/fair: Consider an idle CPU outside c-state as an active CPU
The find_best_target() selects an active CPU and an idle CPU as
two candidate CPUs. Whichever CPU saves the most energy compared
to the previous CPU is selected finally. An idle CPU, i.e. one with no
runnable tasks, that is also outside c-state is a good candidate to run the
waking task since the task can run immediately and there is no
idle exit latency. Hence consider such CPU as an active CPU which
helps both power and performance.

Change-Id: I34f40c2dbca70995a8e6b4a8d5876f802bc000bc
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-28 10:34:59 +05:30
Linux Build Service Account
e3765a4b43 Merge "Merge remote-tracking branch 'dev/msm-4.9-sched' into msm-4.9" 2018-03-23 05:51:08 -07:00
Greg Kroah-Hartman
960923fdc2 Merge 4.9.89 into android-4.9
Changes in 4.9.89
	blkcg: fix double free of new_blkg in blkcg_init_queue
	Input: tsc2007 - check for presence and power down tsc2007 during probe
	perf stat: Issue a HW watchdog disable hint
	staging: speakup: Replace BUG_ON() with WARN_ON().
	staging: wilc1000: add check for kmalloc allocation failure.
	HID: reject input outside logical range only if null state is set
	drm: qxl: Don't alloc fbdev if emulation is not supported
	ARM: dts: r8a7791: Remove unit-address and reg from integrated cache
	ARM: dts: r8a7792: Remove unit-address and reg from integrated cache
	ARM: dts: r8a7793: Remove unit-address and reg from integrated cache
	ARM: dts: r8a7794: Remove unit-address and reg from integrated cache
	arm64: dts: r8a7796: Remove unit-address and reg from integrated cache
	drm/sun4i: Fix up error path cleanup for master bind function
	drm/sun4i: Set drm_crtc.port to the underlying TCON's output port node
	ath10k: fix a warning during channel switch with multiple vaps
	drm/sun4i: Fix TCON clock and regmap initialization sequence
	PCI/MSI: Stop disabling MSI/MSI-X in pci_device_shutdown()
	selinux: check for address length in selinux_socket_bind()
	x86/mm: Make mmap(MAP_32BIT) work correctly
	perf sort: Fix segfault with basic block 'cycles' sort dimension
	x86/mce: Handle broadcasted MCE gracefully with kexec
	eventpoll.h: fix epoll event masks
	i40e: Acquire NVM lock before reads on all devices
	i40e: fix ethtool to get EEPROM data from X722 interface
	perf tools: Make perf_event__synthesize_mmap_events() scale
	ARM: brcmstb: Enable ZONE_DMA for non 64-bit capable peripherals
	drivers: net: xgene: Fix hardware checksum setting
	drivers: net: phy: xgene: Fix mdio write
	drivers: net: xgene: Fix wrong logical operation
	drivers: net: xgene: Fix Rx checksum validation logic
	drm: Defer disabling the vblank IRQ until the next interrupt (for instant-off)
	ath10k: disallow DFS simulation if DFS channel is not enabled
	ath10k: fix fetching channel during potential radar detection
	usb: misc: lvs: fix race condition in disconnect handling
	ARM: bcm2835: Enable missing CMA settings for VC4 driver
	net: ethernet: bgmac: Allow MAC address to be specified in DTB
	netem: apply correct delay when rate throttling
	x86/mce: Init some CPU features early
	omapfb: dss: Handle return errors in dss_init_ports()
	perf probe: Fix concat_probe_trace_events
	perf probe: Return errno when not hitting any event
	HID: clamp input to logical range if no null state
	net/8021q: create device with all possible features in wanted_features
	ARM: dts: Adjust moxart IRQ controller and flags
	qed: Always publish VF link from leading hwfn
	s390/topology: fix typo in early topology code
	zd1211rw: fix NULL-deref at probe
	batman-adv: handle race condition for claims between gateways
	of: fix of_device_get_modalias returned length when truncating buffers
	solo6x10: release vb2 buffers in solo_stop_streaming()
	x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
	scsi: fnic: Fix for "Number of Active IOs" in fnicstats becoming negative
	scsi: ipr: Fix missed EH wakeup
	media: i2c/soc_camera: fix ov6650 sensor getting wrong clock
	timers, sched_clock: Update timeout for clock wrap
	sysrq: Reset the watchdog timers while displaying high-resolution timers
	Input: qt1070 - add OF device ID table
	sched: act_csum: don't mangle TCP and UDP GSO packets
	PCI: hv: Properly handle PCI bus remove
	PCI: hv: Lock PCI bus on device eject
	ASoC: rcar: ssi: don't set SSICR.CKDV = 000 with SSIWSR.CONT
	spi: omap2-mcspi: poll OMAP2_MCSPI_CHSTAT_RXS for PIO transfer
	tcp: sysctl: Fix a race to avoid unexpected 0 window from space
	dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped
	usb: dwc3: make sure UX_EXIT_PX is cleared
	ARM: dts: bcm2835: add index to the ethernet alias
	perf annotate: Fix a bug following symbolic link of a build-id file
	perf buildid: Do not assume that readlink() returns a null terminated string
	i40e/i40evf: Fix use after free in Rx cleanup path
	scsi: be2iscsi: Check tag in beiscsi_mccq_compl_wait
	driver: (adm1275) set the m,b and R coefficients correctly for power
	bonding: make speed, duplex setting consistent with link state
	mm: Fix false-positive VM_BUG_ON() in page_cache_{get,add}_speculative()
	ALSA: firewire-lib: add a quirk of packet without valid EOH in CIP format
	ARM: dts: r8a7794: Add DU1 clock to device tree
	ARM: dts: r8a7794: Correct clock of DU1
	ARM: dts: silk: Correct clock of DU1
	blk-throttle: make sure expire time isn't too big
	regulator: core: Limit propagation of parent voltage count and list
	perf trace: Handle unpaired raw_syscalls:sys_exit event
	f2fs: relax node version check for victim data in gc
	drm/ttm: never add BO that failed to validate to the LRU list
	bonding: refine bond_fold_stats() wrap detection
	PCI: Apply Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
	powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout
	braille-console: Fix value returned by _braille_console_setup
	drm/vmwgfx: Fixes to vmwgfx_fb
	vxlan: vxlan dev should inherit lowerdev's gso_max_size
	NFC: nfcmrvl: Include unaligned.h instead of access_ok.h
	NFC: nfcmrvl: double free on error path
	NFC: pn533: change order of free_irq and dev unregistration
	ARM: dts: r7s72100: fix ethernet clock parent
	ARM: dts: r8a7790: Correct parent of SSI[0-9] clocks
	ARM: dts: r8a7791: Correct parent of SSI[0-9] clocks
	ARM: dts: r8a7793: Correct parent of SSI[0-9] clocks
	powerpc: Avoid taking a data miss on every userspace instruction miss
	net: hns: Correct HNS RSS key set function
	net/faraday: Add missing include of of.h
	qed: Fix TM block ILT allocation
	rtmutex: Fix PI chain order integrity
	printk: Correctly handle preemption in console_unlock()
	drm: rcar-du: Handle event when disabling CRTCs
	ARM: dts: koelsch: Correct clock frequency of X2 DU clock input
	reiserfs: Make cancel_old_flush() reliable
	ASoC: rt5677: Add OF device ID table
	IB/hfi1: Check for QSFP presence before attempting reads
	ALSA: firewire-digi00x: add support for console models of Digi00x series
	ALSA: firewire-digi00x: handle all MIDI messages on streaming packets
	fm10k: correctly check if interface is removed
	EDAC, altera: Fix peripheral warnings for Cyclone5
	scsi: ses: don't get power status of SES device slot on probe
	qed: Correct MSI-x for storage
	apparmor: Make path_max parameter readonly
	iommu/iova: Fix underflow bug in __alloc_and_insert_iova_range
	kvm/svm: Setup MCG_CAP on AMD properly
	kvm: nVMX: Disallow userspace-injected exceptions in guest mode
	video: ARM CLCD: fix dma allocation size
	drm/radeon: Fail fb creation from imported dma-bufs.
	drm/amdgpu: Fail fb creation from imported dma-bufs. (v2)
	drm/rockchip: vop: Enable pm domain before vop_initial
	i40e: only register client on iWarp-capable devices
	coresight: Fixes coresight DT parse to get correct output port ID.
	lkdtm: turn off kcov for lkdtm_rodata_do_nothing:
	tty: amba-pl011: Fix spurious TX interrupts
	serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off
	MIPS: BPF: Quit clobbering callee saved registers in JIT code.
	MIPS: BPF: Fix multiple problems in JIT skb access helpers.
	MIPS: r2-on-r6-emu: Fix BLEZL and BGTZL identification
	MIPS: r2-on-r6-emu: Clear BLTZALL and BGEZALL debugfs counters
	v4l: vsp1: Prevent multiple streamon race commencing pipeline early
	v4l: vsp1: Register pipe with output WPF
	regulator: isl9305: fix array size
	md/raid6: Fix anomily when recovering a single device in RAID6.
	md.c:didn't unlock the mddev before return EINVAL in array_size_store
	powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit()
	usb: dwc2: Make sure we disconnect the gadget state
	usb: gadget: dummy_hcd: Fix wrong power status bit clear/reset in dummy_hub_control()
	perf evsel: Return exact sub event which failed with EPERM for wildcards
	iwlwifi: mvm: fix RX SKB header size and align it properly
	drivers/perf: arm_pmu: handle no platform_device
	perf inject: Copy events when reordering events in pipe mode
	net: fec: add phy-reset-gpios PROBE_DEFER check
	perf session: Don't rely on evlist in pipe mode
	vfio/powerpc/spapr_tce: Enforce IOMMU type compatibility check
	vfio/spapr_tce: Check kzalloc() return when preregistering memory
	scsi: sg: check for valid direction before starting the request
	scsi: sg: close race condition in sg_remove_sfp_usercontext()
	ALSA: hda: Add Geminilake id to SKL_PLUS
	kprobes/x86: Fix kprobe-booster not to boost far call instructions
	kprobes/x86: Set kprobes pages read-only
	pwm: tegra: Increase precision in PWM rate calculation
	clk: qcom: msm8996: Fix the vfe1 powerdomain name
	Bluetooth: Avoid bt_accept_unlink() double unlinking
	Bluetooth: 6lowpan: fix delay work init in add_peer_chan()
	mac80211_hwsim: use per-interface power level
	ath10k: fix compile time sanity check for CE4 buffer size
	wil6210: fix protection against connections during reset
	wil6210: fix memory access violation in wil_memcpy_from/toio_32
	perf stat: Fix bug in handling events in error state
	mwifiex: Fix invalid port issue
	drm/edid: set ELD connector type in drm_edid_to_eld()
	video/hdmi: Allow "empty" HDMI infoframes
	HID: elo: clear BTN_LEFT mapping
	iwlwifi: mvm: rs: don't override the rate history in the search cycle
	clk: meson: gxbb: fix wrong clock for SARADC/SANA
	ARM: dts: exynos: Correct Trats2 panel reset line
	sched: Stop switched_to_rt() from sending IPIs to offline CPUs
	sched: Stop resched_cpu() from sending IPIs to offline CPUs
	test_firmware: fix setting old custom fw path back on exit
	net: ieee802154: adf7242: Fix bug if defined DEBUG
	net: xfrm: allow clearing socket xfrm policies.
	mtd: nand: fix interpretation of NAND_CMD_NONE in nand_command[_lp]()
	net: thunderx: Set max queue count taking XDP_TX into account
	ARM: dts: am335x-pepper: Fix the audio CODEC's reset pin
	ARM: dts: omap3-n900: Fix the audio CODEC's reset pin
	mtd: nand: ifc: update bufnum mask for ver >= 2.0.0
	userns: Don't fail follow_automount based on s_user_ns
	leds: pm8058: Silence pointer to integer size warning
	power: supply: ab8500_charger: Fix an error handling path
	power: supply: ab8500_charger: Bail out in case of error in 'ab8500_charger_init_hw_registers()'
	ath10k: update tdls teardown state to target
	scsi: ses: don't ask for diagnostic pages repeatedly during probe
	pwm: stmpe: Fix wrong register offset for hwpwm=2 case
	clk: qcom: msm8916: fix mnd_width for codec_digcodec
	mwifiex: cfg80211: do not change virtual interface during scan processing
	ath10k: fix invalid STS_CAP_OFFSET_MASK
	tools/usbip: fixes build with musl libc toolchain
	spi: sun6i: disable/unprepare clocks on remove
	bnxt_en: Don't print "Link speed -1 no longer supported" messages.
	scsi: core: scsi_get_device_flags_keyed(): Always return device flags
	scsi: devinfo: apply to HP XP the same flags as Hitachi VSP
	scsi: dh: add new rdac devices
	media: vsp1: Prevent suspending and resuming DRM pipelines
	media: cpia2: Fix a couple off by one bugs
	veth: set peer GSO values
	drm/amdkfd: Fix memory leaks in kfd topology
	powerpc/modules: Don't try to restore r2 after a sibling call
	agp/intel: Flush all chipset writes after updating the GGTT
	mac80211_hwsim: enforce PS_MANUAL_POLL to be set after PS_ENABLED
	mac80211: remove BUG() when interface type is invalid
	ASoC: nuc900: Fix a loop timeout test
	ipvlan: add L2 check for packets arriving via virtual devices
	rcutorture/configinit: Fix build directory error message
	locking/locktorture: Fix num reader/writer corner cases
	ima: relax requiring a file signature for new files with zero length
	net: hns: Some checkpatch.pl script & warning fixes
	x86/boot/32: Fix UP boot on Quark and possibly other platforms
	x86/cpufeatures: Add Intel PCONFIG cpufeature
	selftests/x86/entry_from_vm86: Exit with 1 if we fail
	selftests/x86: Add tests for User-Mode Instruction Prevention
	selftests/x86: Add tests for the STR and SLDT instructions
	selftests/x86/entry_from_vm86: Add test cases for POPF
	x86/vm86/32: Fix POPF emulation
	x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels
	x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
	x86/mm: Fix vmalloc_fault to use pXd_large
	parisc: Handle case where flush_cache_range is called with no context
	ALSA: pcm: Fix UAF in snd_pcm_oss_get_formats()
	ALSA: hda - Revert power_save option default value
	ALSA: seq: Fix possible UAF in snd_seq_check_queue()
	ALSA: seq: Clear client entry before deleting else at closing
	drm/amdgpu: fix prime teardown order
	drm/amdgpu/dce: Don't turn off DP sink when disconnected
	fs: Teach path_connected to handle nfs filesystems with multiple roots.
	lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
	fs/aio: Add explicit RCU grace period when freeing kioctx
	fs/aio: Use RCU accessors for kioctx_table->table[]
	irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis
	scsi: sg: fix SG_DXFER_FROM_DEV transfers
	scsi: sg: fix static checker warning in sg_is_valid_dxfer
	scsi: sg: only check for dxfer_len greater than 256M
	btrfs: alloc_chunk: fix DUP stripe size handling
	btrfs: Fix use-after-free when cleaning up fs_devs with a single stale device
	scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
	USB: gadget: udc: Add missing platform_device_put() on error in bdc_pci_probe()
	usb: dwc3: Fix GDBGFIFOSPACE_TYPE values
	usb: gadget: bdc: 64-bit pointer capability check
	Linux 4.9.89

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-03-22 09:54:47 +01:00
Paul E. McKenney
cce2b93fd3 sched: Stop resched_cpu() from sending IPIs to offline CPUs
[ Upstream commit a0982dfa03efca6c239c52cabebcea4afb93ea6b ]

The rcutorture test suite occasionally provokes a splat due to invoking
resched_cpu() on an offline CPU:

WARNING: CPU: 2 PID: 8 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
Modules linked in:
CPU: 2 PID: 8 Comm: rcu_preempt Not tainted 4.14.0-rc4+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff902ede9daf00 task.stack: ffff96c50010c000
RIP: 0010:native_smp_send_reschedule+0x37/0x40
RSP: 0018:ffff96c50010fdb8 EFLAGS: 00010096
RAX: 000000000000002e RBX: ffff902edaab4680 RCX: 0000000000000003
RDX: 0000000080000003 RSI: 0000000000000000 RDI: 00000000ffffffff
RBP: ffff96c50010fdb8 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000299f36ae R12: 0000000000000001
R13: ffffffff9de64240 R14: 0000000000000001 R15: ffffffff9de64240
FS:  0000000000000000(0000) GS:ffff902edfc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000f7d4c642 CR3: 000000001e0e2000 CR4: 00000000000006e0
Call Trace:
 resched_curr+0x8f/0x1c0
 resched_cpu+0x2c/0x40
 rcu_implicit_dynticks_qs+0x152/0x220
 force_qs_rnp+0x147/0x1d0
 ? sync_rcu_exp_select_cpus+0x450/0x450
 rcu_gp_kthread+0x5a9/0x950
 kthread+0x142/0x180
 ? force_qs_rnp+0x1d0/0x1d0
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x27/0x40
Code: 14 01 0f 92 c0 84 c0 74 14 48 8b 05 14 4f f4 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 38 89 ca 9d e8 e5 56 08 00 <0f> ff 5d c3 0f 1f 44 00 00 8b 05 52 9e 37 02 85 c0 75 38 55 48
---[ end trace 26df9e5df4bba4ac ]---

This splat cannot be generated by expedited grace periods because they
always invoke resched_cpu() on the current CPU, which is good because
expedited grace periods require that resched_cpu() unconditionally
succeed.  However, other parts of RCU can tolerate resched_cpu() acting
as a no-op, at least as long as it doesn't happen too often.

This commit therefore makes resched_cpu() invoke resched_curr() only if
the CPU is either online or is the current CPU.
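
The guard described above can be sketched in userspace C. This is a toy model, not the kernel patch: the `model_*` helpers, the `cpu_online_map` array and the `current_cpu` variable are hypothetical stand-ins for the real scheduler state.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Toy stand-ins for kernel state; not the real scheduler structures. */
static bool cpu_online_map[MODEL_NR_CPUS] = { true, true, false, true };
static int current_cpu = 0;                  /* CPU this code "runs" on */
static int resched_requests[MODEL_NR_CPUS];  /* calls that reached resched_curr() */

static bool model_cpu_online(int cpu)
{
    return cpu_online_map[cpu];
}

/* Stand-in for resched_curr(): only record that a reschedule was requested. */
static void model_resched_curr(int cpu)
{
    resched_requests[cpu]++;
}

/*
 * Model of the guarded resched_cpu(): request a reschedule only when the
 * target CPU is online or is the CPU we are running on. Returns true if
 * the request was issued and false if the call acted as a no-op.
 */
static bool model_resched_cpu(int cpu)
{
    if (model_cpu_online(cpu) || cpu == current_cpu) {
        model_resched_curr(cpu);
        return true;
    }
    return false;
}
```

In this model, a call for offline CPU 2 becomes a no-op, while calls for an online CPU or for the current CPU still go through.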

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-22 09:17:54 +01:00
Paul E. McKenney
bac7bb1849 sched: Stop switched_to_rt() from sending IPIs to offline CPUs
[ Upstream commit 2fe2582649aa2355f79acddb86bd4d6c5363eb63 ]

The rcutorture test suite occasionally provokes a splat due to invoking
rt_mutex_lock() which needs to boost the priority of a task currently
sitting on a runqueue that belongs to an offline CPU:

WARNING: CPU: 0 PID: 12 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
Modules linked in:
CPU: 0 PID: 12 Comm: rcub/7 Not tainted 4.14.0-rc4+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff9ed3de5f8cc0 task.stack: ffffbbf80012c000
RIP: 0010:native_smp_send_reschedule+0x37/0x40
RSP: 0018:ffffbbf80012fd10 EFLAGS: 00010082
RAX: 000000000000002f RBX: ffff9ed3dd9cb300 RCX: 0000000000000004
RDX: 0000000080000004 RSI: 0000000000000086 RDI: 00000000ffffffff
RBP: ffffbbf80012fd10 R08: 000000000009da7a R09: 0000000000007b9d
R10: 0000000000000001 R11: ffffffffbb57c2cd R12: 000000000000000d
R13: ffff9ed3de5f8cc0 R14: 0000000000000061 R15: ffff9ed3ded59200
FS:  0000000000000000(0000) GS:ffff9ed3dea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000080686f0 CR3: 000000001b9e0000 CR4: 00000000000006f0
Call Trace:
 resched_curr+0x61/0xd0
 switched_to_rt+0x8f/0xa0
 rt_mutex_setprio+0x25c/0x410
 task_blocks_on_rt_mutex+0x1b3/0x1f0
 rt_mutex_slowlock+0xa9/0x1e0
 rt_mutex_lock+0x29/0x30
 rcu_boost_kthread+0x127/0x3c0
 kthread+0x104/0x140
 ? rcu_report_unblock_qs_rnp+0x90/0x90
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x22/0x30
Code: f0 00 0f 92 c0 84 c0 74 14 48 8b 05 34 74 c5 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 a0 c6 fc b9 e8 d5 b5 06 00 <0f> ff 5d c3 0f 1f 44 00 00 8b 05 a2 d1 13 02 85 c0 75 38 55 48

But the target task's priority has already been adjusted, so the only
purpose of switched_to_rt() invoking resched_curr() is to wake up the
CPU running some task that needs to be preempted by the boosted task.
But the CPU is offline, which presumably means that the task must be
migrated to some other CPU, and that this other CPU will undertake any
needed preemption at the time of migration.  Because the runqueue lock
is held when resched_curr() is invoked, we know that the boosted task
cannot go anywhere, so it is not necessary to invoke resched_curr()
in this particular case.

This commit therefore makes switched_to_rt() refrain from invoking
resched_curr() when the target CPU is offline.
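
A minimal userspace sketch of this check, assuming a simplified runqueue. `struct model_rq` and `model_switched_to_rt()` are hypothetical names used for illustration only and do not mirror the kernel's data structures.

```c
#include <stdbool.h>

/* Simplified runqueue: just enough state to model the decision. */
struct model_rq {
    bool online;        /* is the CPU owning this runqueue online? */
    int  curr_prio;     /* priority of the running task; lower value wins */
    int  resched_count; /* number of reschedule requests issued */
};

/*
 * Model of the guarded path: the boosted task only triggers a reschedule
 * when it would preempt the running task AND the runqueue's CPU is online.
 * Returns true if a reschedule was requested.
 */
static bool model_switched_to_rt(struct model_rq *rq, int task_prio)
{
    if (task_prio < rq->curr_prio && rq->online) {
        rq->resched_count++;
        return true;
    }
    return false;
}
```

For an offline runqueue the function returns false without counting a request, which corresponds to skipping the IPI in the description above.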

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-22 09:17:54 +01:00
Santosh Mardi
0ce24ae64f cpufreq: schedutil: update warn_on with bug_on
A WARN_ON is used while calculating the schedutil average capacity.
This code runs with rq->lock held. When the WARN_ON triggers, it
tries to wake up a process on the same CPU to emit the debug prints,
which in turn waits for rq->lock and results in a deadlock.

Replace the WARN_ON with a BUG_ON to avoid the debug prints.

Change-Id: I8db35a2165e68765b4ab2f132a571ad00311ba25
Signed-off-by: Santosh Mardi <gsantosh@codeaurora.org>
2018-03-21 12:06:07 +05:30
Pavankumar Kondeti
d8c5bfcc07 sched: Make sure window start passed to schedutil is consistent
The scheduler sends some notifications for individual CPUs. If these
events happen close to the window boundary, the window_start passed
to schedutil may go out of order.

Change-Id: Ib6338bd4bc3b1c41c424ec959056c57c0ed7507e
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-21 11:33:46 +05:30
Pavankumar Kondeti
93b3f0f674 Merge remote-tracking branch 'dev/msm-4.9-sched' into msm-4.9
* origin/dev/msm-4.9-sched:
  ARM: dts: msm: Update silver CPUs's idle cost data for SDM845v2
  sched/fair: prevent possible infinite loop in sched_group_energy
  sched: Update WALT stats also in boosted_cpu_util()
  sched/fair: Add provision to control the spreading on SMP
  sched/fair: Add EAS_USE_NEED_IDLE feature
  sched/fair: Turn off FBT_STRICT_ORDER feature
  sched/fair: fix incorrect CPU selection for non latency sensitive tasks
  sched: Add task placement snapshot
  sched/fair: select the most energy-efficient CPU candidate on wake-up
  sched/fair: fix array out of bounds access in select_energy_cpu_idx()
  sched/fair: use min capacity when evaluating active cpus
  sched/fair: use min capacity when evaluating idle backup cpus
  sched/fair: use min capacity when evaluating placement energy costs
  sched/fair: introduce minimum capacity capping sched feature
  arm/topology: link arch_scale_min_freq_capacity to cpufreq
  arm64/topology: link arch_scale_min_freq_capacity to cpufreq
  sched: add arch_scale_min_freq_capacity to track minimum capacity caps
  cpufreq: add scaled minimum capacity tracking for policy changes
  arm64: enable max frequency capping
  arm: enable max frequency capping
  cpufreq: implement max frequency capping
  sched/fair: introduce an arch scaling function for max frequency capping
  cpufreq: remove max frequency capping from scale_freq_capacity()
  Revert "ANDROID: cpufreq: Max freq invariant scheduler load-tracking and cpu capacity support"
  Revert "ANDROID: arm: Enable max freq invariant scheduler load-tracking and capacity support"
  Revert "ANDROID: arm64: Enable max freq invariant scheduler load-tracking and capacity support"
  sched/fair: reduce rounding errors in energy computations
  sched/fair: re-factor energy_diff to use a single (extensible) energy_env
  sched/fair: cleanup select_energy_cpu_brute to be more consistent
  sched/fair: remove capacity tracking from energy_diff
  sched/fair: remove energy_diff tracepoint in preparation to re-factoring
  sched/fair: use *p to reference task_structs
  sched: EAS: Fix the calculation of group util in group_idle_state()
  cpufreq: Drop schedfreq governor
  sched: Sync EAS codebase to android-4.9
  sched: Move core_ctl callback from tick to WALT IRQ work
  sched: fair: Always use energy aware wakeups
  sched: Introduce new workload differentiation
  sched: Introduce a different version of freq aggregation
  sched: report group load to the cpufreq
  sched: Start reporting top task load to cpufreq
  sched: Introduce scheduler boost related placement changes
  sched: Port boost setting mechanisms to EAS
  sched: integrate core_ctl with EAS
  sched: EAS: add infrastructure for core_ctl
  sched: EAS: add core isolation support
  sched: Introduce an irq_work to report WALT load
  sched: introduce small wakee task on waker
  sched: fair: Ignore energy-diff calculations for colocated tasks
  sched: change default group upmigrate and downmigrate values
  sched: EAS: colocate related threads
  sched: cpufreq: Use per_cpu_ptr instead of this_cpu_ptr when reporting load
  sched: cpufreq: Use sched_clock instead of rq_clock when updating schedutil
  sched: cpufreq: Update cpufreq once in a WALT window
  ARM: dts: msm: Add an energy model for the SDM845 CPUs

Conflicts:
	kernel/sched/fair.c

Change-Id: Ieaeecb28e57955db3b13d6d9c1d81b204caf0fcf
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-21 10:32:45 +05:30
Chris Redpath
6a6dad243c sched/fair: prevent possible infinite loop in sched_group_energy
There is a race between hotplug and energy_diff which might result
in an endless loop in sched_group_energy. When this happens, the end
condition cannot be detected.

We can store how many CPUs we need to visit at the beginning, and
bail out of the energy calculation if we visit more CPUs than expected.
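
The bail-out idea can be illustrated with a small userspace sketch. It assumes a circular list of groups, loosely modelled on a sched_group ring. `struct model_group` and `model_group_energy()` are hypothetical names, and the real patch may count CPUs differently.

```c
#include <stddef.h>

struct model_group {
    int cpus;                 /* number of CPUs in this group */
    int energy;               /* toy per-group energy cost */
    struct model_group *next; /* next group in the circular list */
};

/*
 * Sum the energy over the circular list starting at 'start', expecting to
 * visit 'expected_cpus' CPUs in total. Returns 0 on success, or -1 if more
 * CPUs than expected were visited, which suggests the list changed under us.
 */
static int model_group_energy(struct model_group *start, int expected_cpus,
                              int *total)
{
    struct model_group *sg = start;
    int visited = 0;

    *total = 0;
    do {
        visited += sg->cpus;
        if (visited > expected_cpus)
            return -1;        /* bail out instead of looping forever */
        *total += sg->energy;
        sg = sg->next;
    } while (sg != start);

    return 0;
}
```

If the ring is rewired so that the walk never returns to its starting group, the visit counter exceeds the expected CPU count and the walk stops with an error rather than spinning.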

Bug: 72311797 72202633
Change-Id: I8dda75468ee1570da4071cd8165ef5131a8205d8
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Git-commit: 8a174b4749
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:20 +05:30
Pavankumar Kondeti
12912ba1db sched: Update WALT stats also in boosted_cpu_util()
schedutil calls into the scheduler twice while querying the load
of each CPU. cpu_util_freq() is called first to get the un-boosted
CPU load and the WALT stats. Then boosted_cpu_util() is called to
get the boosted CPU load. This results in doing the same
calculations twice in the scheduler and emitting the
sched_load_to_gov trace point twice. Fix this inefficiency by
changing boosted_cpu_util() to update the WALT stats along with
the boosted CPU load.

Change-Id: Ia825cafca6a25c56b0edb1ae8c55e7c7277f2968
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:19 +05:30
Pavankumar Kondeti
e1856dadfd sched/fair: Add provision to control the spreading on SMP
Spreading the tasks across CPUs in different groups/clusters
may hurt power on some SMP systems. Start spreading the tasks
only if all the CPUs in the primary group/cluster are utilized
above sched_smp_overlap_capacity threshold.

When placement boost is active, the task is placed on the
least loaded CPU across all groups/clusters.
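
A rough userspace sketch of the spreading decision, under stated assumptions: `model_should_spread()` is a hypothetical helper, utilization is a plain array of per-CPU values for the primary cluster, and "utilized above" is taken to mean strictly greater than the threshold.

```c
#include <stdbool.h>

/*
 * Decide whether tasks may spread beyond the primary cluster.
 * 'util' holds the utilization of each CPU in the primary cluster and
 * 'overlap_capacity' plays the role of sched_smp_overlap_capacity.
 */
static bool model_should_spread(const unsigned int *util, int nr_cpus,
                                unsigned int overlap_capacity,
                                bool placement_boost)
{
    int i;

    /* With placement boost, all clusters are considered. */
    if (placement_boost)
        return true;

    /* Otherwise spread only if every primary CPU is above the threshold. */
    for (i = 0; i < nr_cpus; i++) {
        if (util[i] <= overlap_capacity)
            return false;
    }
    return true;
}
```

With a threshold of 800, a primary cluster at {900, 950} would allow spreading, while {900, 100} would keep tasks in the primary cluster unless placement boost is active.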

Change-Id: I5da67ed9230f651921bd901cbcc4e0cc55064155
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:19 +05:30
Pavankumar Kondeti
f07ab01332 sched/fair: Add EAS_USE_NEED_IDLE feature
The schedtune.prefer_idle flag makes the scheduler look for an idle
CPU across all clusters, whereas the need_idle flag limits the
search to the lower capacity cluster, which strikes a better
balance between power and performance compared to prefer_idle.

Add the EAS_USE_NEED_IDLE feature. When it is enabled, the need_idle
path is enforced for tasks that have schedtune.prefer_idle set.

Change-Id: Ic83cd9f8bae1112efa4e62de3c60a86ba9c777d7
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:18 +05:30
Pavankumar Kondeti
ec7e154d07 sched/fair: Turn off FBT_STRICT_ORDER feature
Turning off the FBT_STRICT_ORDER feature selects whichever of the
target CPU or the backup CPU saves the most energy. The ftrace
analysis indicates that excessive packing can be avoided by placing
the task on the backup CPU when it is idle.

Change-Id: I281c599f6236aa8c9b41c56509a533048a01da91
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:18 +05:30
Pavankumar Kondeti
602c51ea9d sched/fair: fix incorrect CPU selection for non latency sensitive tasks
CPU selection for non latency sensitive tasks targets an active
CPU in the little cluster. The CPU in the shallowest c-state is stored
as a backup. However, if all CPUs in the little cluster are idle, we
pick an active CPU in the BIG cluster as the target CPU. This incorrect
choice of the target CPU may not get corrected by
select_energy_cpu_idx(), depending on the energy difference between
the previous CPU and the target CPU.

This can be fixed easily by using the same variable to track the
maximum capacity of the traversed CPUs for both idle and active CPUs.

Change-Id: I3efb8bc82ff005383163921ef2bd39fcac4589ad
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:17 +05:30
Pavankumar Kondeti
4fa21cd791 sched: Add task placement snapshot
This snapshot is taken as of msm-4.9 'commit f85a9dec59
("sched: walt: Fix cpu_capacity_orig stuck issue")'.

Change-Id: I5bc0f0648bbab48da0f13600ea2fcbd9c1b7f0e8
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-20 07:22:12 +05:30
Linux Build Service Account
6535a91fb1 Merge "sched/walt: Fix use after free in trace_sched_update_task_ravg()" 2018-03-16 03:15:41 -07:00
Linux Build Service Account
afdf2495c7 Merge "sched: Improve the scheduler" 2018-03-15 06:10:40 -07:00
Linux Build Service Account
210c997335 Merge "sched: Use initial_task_util load for new tasks" 2018-03-15 06:09:58 -07:00
Pavankumar Kondeti
e729cba822 sched/walt: Fix use after free in trace_sched_update_task_ravg()
Commit 4d09122c18 ("sched: Fix spinlock recursion in sched_exit()")
moved the freeing of a task's current and previous window arrays outside
the rq->lock. These arrays can be accessed from another CPU in parallel,
which then ends up using freed memory. For example,

CPU#0                                 CPU#1
----------------------------------    -------------------------------
sched_exit()                          try_to_wake_up()--> The task wakes
                                                          up on CPU#0
 task_rq_lock()                        set_task_cpu()
                                        fixup_busy_time() --> waiting for
                                                          CPU#0's rq->lock

 task_rq_unlock()                       fixup_busy_time()-->lock acquired
 free_task_load_ptrs()
  kfree(p->ravg.curr_window_cpu)         update_task_ravg()-->called on
                                                          current of CPU#0
                                          trace_sched_update_task_ravg()
                                                  --> access freed memory
  p->ravg.curr_window_cpu = NULL;

To fix this issue, the window array pointers must be set to NULL before
freeing the memory. Since this happens outside the lock, memory barriers
are needed on the write and read paths. A much simpler alternative is to
skip the update_task_ravg() trace point for tasks that are marked as dead.
The window stats of dead tasks are not updated anyway. While at it, also
skip this trace point for newly created tasks, whose window stats are
not updated either.
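
The simpler alternative can be sketched in userspace C as follows. This is only an illustration: `struct model_task`, its `exiting` and `is_new` flags, and `model_trace_task_ravg()` are hypothetical, and the kernel identifies dead and new tasks through its own fields.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_task {
    bool exiting;                  /* hypothetical: task is marked as dead */
    bool is_new;                   /* hypothetical: window stats not tracked yet */
    unsigned int *curr_window_cpu; /* per-CPU window array; may already be freed */
};

static int model_traces_emitted;   /* counts trace points that actually fired */

/*
 * Model of the guarded trace point: never touch the window arrays of a
 * task that is dead or newly created. Returns true if the trace fired.
 */
static bool model_trace_task_ravg(const struct model_task *p)
{
    if (p->exiting || p->is_new)
        return false;              /* skip: arrays may be freed or unused */

    /* Only here would the real trace point read p->curr_window_cpu. */
    model_traces_emitted++;
    return true;
}
```

Because the check happens before any access to the window arrays, a task whose arrays were freed after it was marked dead is never dereferenced by the trace path in this model.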

Change-Id: I4d7cb8a3cf7cf84270b09721140d35205643b7ab
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-03-15 13:43:41 +05:30
Lingutla Chandrasekhar
57eb071cd3 sched: Use initial_task_util load for new tasks
Currently, we expose the initial_task_util tunable but do not
honor it. Fix it by using the tunable as the initial load of newly
created tasks.

Since this tunable is used in end user builds, move the tunable
out of SCHED_DEBUG.

Change-Id: I17a89b7a99d43c9cc230536ad7d9238de9833473
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
2018-03-12 10:57:05 +05:30