Commit Graph

98 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
10f1d14718 Merge 4.19.86 into android-4.19-q
Changes in 4.19.86
	spi: mediatek: use correct mata->xfer_len when in fifo transfer
	i2c: mediatek: modify threshold passed to i2c_get_dma_safe_msg_buf()
	tee: optee: add missing of_node_put after of_device_is_available
	Revert "OPP: Protect dev_list with opp_table lock"
	net: cdc_ncm: Signedness bug in cdc_ncm_set_dgram_size()
	idr: Fix idr_get_next race with idr_remove
	mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()
	mm/memory_hotplug: fix updating the node span
	arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
	fbdev: Ditch fb_edid_add_monspecs
	bpf, x32: Fix bug for BPF_ALU64 | BPF_NEG
	bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_X shift by 0
	bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_K shift by 0
	bpf, x32: Fix bug for BPF_JMP | {BPF_JSGT, BPF_JSLE, BPF_JSLT, BPF_JSGE}
	net: ovs: fix return type of ndo_start_xmit function
	net: xen-netback: fix return type of ndo_start_xmit function
	ARM: dts: dra7: Enable workaround for errata i870 in PCIe host mode
	ARM: dts: omap5: enable OTG role for DWC3 controller
	net: hns3: Fix for netdev not up problem when setting mtu
	net: hns3: Fix loss of coal configuration while doing reset
	f2fs: return correct errno in f2fs_gc
	ARM: dts: sun8i: h3-h5: ir register size should be the whole memory block
	ARM: dts: sun8i: h3: bpi-m2-plus: Fix address for external RGMII Ethernet PHY
	tcp: up initial rmem to 128KB and SYN rwin to around 64KB
	SUNRPC: Fix priority queue fairness
	ACPI / LPSS: Make acpi_lpss_find_device() also find PCI devices
	ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq
	f2fs: keep lazytime on remount
	IB/hfi1: Error path MAD response size is incorrect
	IB/hfi1: Ensure ucast_dlid access doesnt exceed bounds
	mt76x2: fix tx power configuration for VHT mcs 9
	mt76x2: disable WLAN core before probe
	mt76: fix handling ps-poll frames
	iommu/io-pgtable-arm: Fix race handling in split_blk_unmap()
	iommu/arm-smmu-v3: Fix unexpected CMD_SYNC timeout
	kvm: arm/arm64: Fix stage2_flush_memslot for 4 level page table
	arm64/numa: Report correct memblock range for the dummy node
	ath10k: fix vdev-start timeout on error
	rtlwifi: btcoex: Use proper enumerated types for Wi-Fi only interface
	ata: ahci_brcm: Allow using driver or DSL SoCs
	PM / devfreq: Fix devfreq_add_device() when drivers are built as modules.
	PM / devfreq: Fix handling of min/max_freq == 0
	PM / devfreq: stopping the governor before device_unregister()
	ath9k: fix reporting calculated new FFT upper max
	selftests/tls: Fix recv(MSG_PEEK) & splice() test cases
	usb: gadget: udc: fotg210-udc: Fix a sleep-in-atomic-context bug in fotg210_get_status()
	usb: dwc3: gadget: Check ENBLSLPM before sending ep command
	nl80211: Fix a GET_KEY reply attribute
	irqchip/irq-mvebu-icu: Fix wrong private data retrieval
	watchdog: core: fix null pointer dereference when releasing cdev
	watchdog: renesas_wdt: stop when unregistering
	watchdog: sama5d4: fix timeout-sec usage
	watchdog: w83627hf_wdt: Support NCT6796D, NCT6797D, NCT6798D
	KVM: PPC: Inform the userspace about TCE update failures
	printk: Do not miss new messages when replaying the log
	printk: CON_PRINTBUFFER console registration is a bit racy
	dmaengine: ep93xx: Return proper enum in ep93xx_dma_chan_direction
	dmaengine: timb_dma: Use proper enum in td_prep_slave_sg
	ALSA: hda: Fix mismatch for register mask and value in ext controller.
	ext4: fix build error when DX_DEBUG is defined
	clk: keystone: Enable TISCI clocks if K3_ARCH
	sunrpc: Fix connect metrics
	x86/PCI: Apply VMD's AERSID fixup generically
	mei: samples: fix a signedness bug in amt_host_if_call()
	cxgb4: Use proper enum in cxgb4_dcb_handle_fw_update
	cxgb4: Use proper enum in IEEE_FAUX_SYNC
	powerpc/pseries: Fix DTL buffer registration
	powerpc/pseries: Fix how we iterate over the DTL entries
	powerpc/xive: Move a dereference below a NULL test
	ARM: dts: at91: sama5d4_xplained: fix addressable nand flash size
	ARM: dts: at91: at91sam9x5cm: fix addressable nand flash size
	ARM: dts: at91: sama5d2_ptc_ek: fix bootloader env offsets
	mtd: rawnand: sh_flctl: Use proper enum for flctl_dma_fifo0_transfer
	PM / hibernate: Check the success of generating md5 digest before hibernation
	tools: PCI: Fix compilation warnings
	clocksource/drivers/sh_cmt: Fixup for 64-bit machines
	clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines
	ice: Fix forward to queue group logic
	md: allow metadata updates while suspending an array - fix
	ixgbe: Fix ixgbe TX hangs with XDP_TX beyond queue limit
	i40e: Use proper enum in i40e_ndo_set_vf_link_state
	ixgbe: Fix crash with VFs and flow director on interface flap
	IB/mthca: Fix error return code in __mthca_init_one()
	IB/rxe: avoid srq memory leak
	RDMA/hns: Bugfix for reserved qp number
	RDMA/hns: Submit bad wr when post send wr exception
	RDMA/hns: Bugfix for CM test
	RDMA/hns: Limit the size of extend sge of sq
	IB/mlx4: Avoid implicit enumerated type conversion
	rpmsg: glink: smem: Support rx peak for size less than 4 bytes
	msm/gpu/a6xx: Force of_dma_configure to setup DMA for GMU
	OPP: Return error on error from dev_pm_opp_get_opp_count()
	ACPICA: Never run _REG on system_memory and system_IO
	cpuidle: menu: Fix wakeup statistics updates for polling state
	ASoC: qdsp6: q6asm-dai: checking NULL vs IS_ERR()
	powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer
	powerpc/64s/radix: Explicitly flush ERAT with local LPID invalidation
	ata: ep93xx: Use proper enums for directions
	qed: Avoid implicit enum conversion in qed_ooo_submit_tx_buffers
	media: rc: ir-rc6-decoder: enable toggle bit for Kathrein RCU-676 remote
	media: pxa_camera: Fix check for pdev->dev.of_node
	media: rcar-vin: fix redeclaration of symbol
	media: i2c: adv748x: Support probing a single output
	ALSA: hda/sigmatel - Disable automute for Elo VuPoint
	bnxt_en: return proper error when FW returns HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED
	KVM: PPC: Book3S PR: Exiting split hack mode needs to fixup both PC and LR
	USB: serial: cypress_m8: fix interrupt-out transfer length
	usb: dwc2: disable power_down on rockchip devices
	mtd: physmap_of: Release resources on error
	cpu/SMT: State SMT is disabled even with nosmt and without "=force"
	brcmfmac: reduce timeout for action frame scan
	brcmfmac: fix full timeout waiting for action frame on-channel tx
	qtnfmac: request userspace to do OBSS scanning if FW can not
	qtnfmac: pass sgi rate info flag to wireless core
	qtnfmac: inform wireless core about supported extended capabilities
	qtnfmac: drop error reports for out-of-bounds key indexes
	clk: samsung: Use NOIRQ stage for Exynos5433 clocks suspend/resume
	clk: samsung: exynos5420: Define CLK_SECKEY gate clock only or Exynos5420
	clk: samsung: Use clk_hw API for calling clk framework from clk notifiers
	i2c: brcmstb: Allow enabling the driver on DSL SoCs
	printk: Correct wrong casting
	NFSv4.x: fix lock recovery during delegation recall
	dmaengine: ioat: fix prototype of ioat_enumerate_channels
	media: ov5640: fix framerate update
	media: cec-gpio: select correct Signal Free Time
	gfs2: slow the deluge of io error messages
	i2c: omap: use core to detect 'no zero length' quirk
	i2c: qup: use core to detect 'no zero length' quirk
	i2c: tegra: use core to detect 'no zero length' quirk
	i2c: zx2967: use core to detect 'no zero length' quirk
	Input: st1232 - set INPUT_PROP_DIRECT property
	Input: silead - try firmware reload after unsuccessful resume
	soc: fsl: bman_portals: defer probe after bman's probe
	net: hns3: Fix for rx vlan id handle to support Rev 0x21 hardware
	tc-testing: fix build of eBPF programs
	remoteproc: Check for NULL firmwares in sysfs interface
	remoteproc: qcom: q6v5: Fix a race condition on fatal crash
	kexec: Allocate decrypted control pages for kdump if SME is enabled
	x86/olpc: Fix build error with CONFIG_MFD_CS5535=m
	dmaengine: rcar-dmac: set scatter/gather max segment size
	crypto: mxs-dcp - Fix SHA null hashes and output length
	crypto: mxs-dcp - Fix AES issues
	xfrm: use correct size to initialise sp->ovec
	ACPI / SBS: Fix rare oops when removing modules
	iwlwifi: mvm: don't send keys when entering D3
	xsk: proper AF_XDP socket teardown ordering
	x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately
	mmc: renesas_sdhi_internal_dmac: Whitelist r8a774a1
	mmc: tmio: Fix SCC error detection
	mmc: renesas_sdhi_internal_dmac: set scatter/gather max segment size
	atmel_lcdfb: support native-mode display-timings
	fbdev: sbuslib: use checked version of put_user()
	fbdev: sbuslib: integer overflow in sbusfb_ioctl_helper()
	fbdev: fix broken menu dependencies
	reset: Fix potential use-after-free in __of_reset_control_get()
	bcache: account size of buckets used in uuid write to ca->meta_sectors_written
	bcache: recal cached_dev_sectors on detach
	platform/x86: mlx-platform: Properly use mlxplat_mlxcpld_msn201x_items
	media: dw9714: Fix error handling in probe function
	media: dw9807-vcm: Fix probe error handling
	media: cx18: Don't check for address of video_dev
	mtd: spi-nor: cadence-quadspi: Use proper enum for dma_[un]map_single
	mtd: devices: m25p80: Make sure WRITE_EN is issued before each write
	x86/intel_rdt: Introduce utility to obtain CDP peer
	x86/intel_rdt: CBM overlap should also check for overlap with CDP peer
	mmc: mmci: expand startbiterr to irqmask and error check
	s390/kasan: avoid vdso instrumentation
	s390/kasan: avoid instrumentation of early C code
	s390/kasan: avoid user access code instrumentation
	proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
	backlight: lm3639: Unconditionally call led_classdev_unregister
	mfd: ti_am335x_tscadc: Keep ADC interface on if child is wakeup capable
	printk: Give error on attempt to set log buffer length to over 2G
	media: isif: fix a NULL pointer dereference bug
	GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads
	media: cx231xx: fix potential sign-extension overflow on large shift
	media: venus: vdec: fix decoded data size
	ALSA: hda/ca0132 - Fix input effect controls for desktop cards
	lightnvm: pblk: fix rqd.error return value in pblk_blk_erase_sync
	lightnvm: pblk: fix incorrect min_write_pgs
	lightnvm: pblk: guarantee emeta on line close
	lightnvm: pblk: fix write amplificiation calculation
	lightnvm: pblk: guarantee mw_cunits on read buffer
	lightnvm: do no update csecs and sos on 1.2
	lightnvm: pblk: fix error handling of pblk_lines_init()
	lightnvm: pblk: consider max hw sectors supported for max_write_pgs
	x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
	bpf: btf: Fix a missing check bug
	net: fix generic XDP to handle if eth header was mangled
	gpio: syscon: Fix possible NULL ptr usage
	spi: fsl-lpspi: Prevent FIFO under/overrun by default
	pinctrl: gemini: Mask and set properly
	spi: spidev: Fix OF tree warning logic
	ARM: 8802/1: Call syscall_trace_exit even when system call skipped
	x86/mm: Do not warn about PCI BIOS W+X mappings
	orangefs: rate limit the client not running info message
	pinctrl: gemini: Fix up TVC clock group
	scsi: arcmsr: clean up clang warning on extraneous parentheses
	hwmon: (k10temp) Support all Family 15h Model 6xh and Model 7xh processors
	hwmon: (nct6775) Fix names of DIMM temperature sources
	hwmon: (pwm-fan) Silence error on probe deferral
	hwmon: (ina3221) Fix INA3221_CONFIG_MODE macros
	hwmon: (npcm-750-pwm-fan) Change initial pwm target to 255
	selftests: forwarding: Have lldpad_app_wait_set() wait for unknown, too
	net: sched: avoid writing on noop_qdisc
	netfilter: nft_compat: do not dump private area
	misc: cxl: Fix possible null pointer dereference
	mac80211: minstrel: fix using short preamble CCK rates on HT clients
	mac80211: minstrel: fix CCK rate group streams value
	mac80211: minstrel: fix sampling/reporting of CCK rates in HT mode
	spi: rockchip: initialize dma_slave_config properly
	mlxsw: spectrum_switchdev: Check notification relevance based on upper device
	ARM: dts: omap5: Fix dual-role mode on Super-Speed port
	tcp: start receiver buffer autotuning sooner
	ACPI / LPSS: Use acpi_lpss_* instead of acpi_subsys_* functions for hibernate
	PM / devfreq: Fix static checker warning in try_then_request_governor
	tools: PCI: Fix broken pcitest compilation
	powerpc/time: Fix clockevent_decrementer initalisation for PR KVM
	mmc: tmio: fix SCC error handling to avoid false positive CRC error
	x86/resctrl: Fix rdt_find_domain() return value and checks
	Linux 4.19.86

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I40e00f0b53336aba2edc8ed6a696fdf5206fdba7
2019-11-25 10:00:42 +01:00
Rafael J. Wysocki
27ab8f1648 cpuidle: menu: Fix wakeup statistics updates for polling state
[ Upstream commit 5f26bdceb9c0a5e6c696aa2899d077cd3ae93413 ]

If the CPU exits the "polling" state due to the time limit in the
loop in poll_idle(), this is not a real wakeup and it just means
that the "polling" state selection was not adequate.  The governor
mispredicted short idle duration, but had a more suitable state been
selected, the CPU might have spent more time in it.  In fact, there
is no reason to expect that there would have been a wakeup event
earlier than the next timer in that case.

Handling such cases as regular wakeups in menu_update() may cause the
menu governor to make suboptimal decisions going forward, but ignoring
them altogether would not be correct either, because every time
menu_select() is invoked, it makes a separate new attempt to predict
the idle duration taking distinct time to the closest timer event as
input and the outcomes of all those attempts should be recorded.

For this reason, make menu_update() always assume that if the
"polling" state was exited due to the time limit, the next proper
wakeup event for the CPU would be the next timer event (not
including the tick).

Fixes: a37b969a61 "cpuidle: poll_state: Add time limit to poll_idle()"
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-24 08:20:07 +01:00
Morten Rasmussen
3e63fb2c91 ANDROID: sched, cpuidle: Track cpuidle state index in the scheduler
The idle-state of each cpu is currently pointed to by rq->idle_state but
there isn't any information in the struct cpuidle_state that can used to
look up the idle-state energy model data stored in struct
sched_group_energy. For this purpose is necessary to store the idle
state index as well. Ideally, the idle-state data should be unified.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: Ib3d1178512735b0e314881f73fb8ccff5a69319f
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
(cherry picked from commit a732c97420e109956c20f34c70b91e6d06f5df31)
[ Fixed trivial cherry-pick conflict in kernel/sched/sched.h ]
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2018-10-26 11:57:55 +01:00
Rafael J. Wysocki
0fc784fb09 cpuidle: governors: Consolidate PM QoS handling
There is some code duplication related to the PM QoS handling between
the existing cpuidle governors, so move that code to a common helper
function and call that from the governors.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30 23:13:00 +02:00
Rafael J. Wysocki
45f1ff59e2 cpuidle: Return nohz hint from cpuidle_select()
Add a new pointer argument to cpuidle_select() and to the ->select
cpuidle governor callback to allow a boolean value indicating
whether or not the tick should be stopped before entering the
selected state to be returned from there.

Make the ladder governor ignore that pointer (to preserve its
current behavior) and make the menu governor return 'false" through
it if:
 (1) the idle exit latency is constrained at 0, or
 (2) the selected state is a polling one, or
 (3) the expected idle period duration is within the tick period
     range.

In addition to that, the correction factor computations in the menu
governor need to take the possibility that the tick may not be
stopped into account to avoid artificially small correction factor
values.  To that end, add a mechanism to record tick wakeups, as
suggested by Peter Zijlstra, and use it to modify the menu_update()
behavior when tick wakeup occurs.  Namely, if the CPU is woken up by
the tick and the return value of tick_nohz_get_sleep_length() is not
within the tick boundary, the predicted idle duration is likely too
short, so make menu_update() try to compensate for that by updating
the governor statistics as though the CPU was idle for a long time.

Since the value returned through the new argument pointer of
cpuidle_select() is not used by its caller yet, this change by
itself is not expected to alter the functionality of the code.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2018-04-06 09:29:34 +02:00
Rafael J. Wysocki
64bdff6980 PM: cpuidle/suspend: Add s2idle usage and time state attributes
Add a new attribute group called "s2idle" under the sysfs directory
of each cpuidle state that supports the ->enter_s2idle callback
and put two new attributes, "usage" and "time", into that group to
represent the number of times the given state was requested for
suspend-to-idle and the total time spent in suspend-to-idle after
requesting that state, respectively.

That will allow diagnostic information related to suspend-to-idle
to be collected without enabling advanced debug features and
analyzing dmesg output.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-29 13:06:08 +02:00
Rafael J. Wysocki
822ffaa581 Merge branches 'pm-cpuidle' and 'pm-opp'
* pm-cpuidle:
  PM: cpuidle: Fix cpuidle_poll_state_init() prototype
  Documentation/ABI: update cpuidle sysfs documentation

* pm-opp:
  opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15 12:01:53 +01:00
Rafael J. Wysocki
d7212cfb05 PM: cpuidle: Fix cpuidle_poll_state_init() prototype
Commit f859422075 (x86: PM: Make APM idle driver initialize polling
state) made apm_init() call cpuidle_poll_state_init(), but that only
is defined for CONFIG_CPU_IDLE set, so make the empty stub of it
available for CONFIG_CPU_IDLE unset too to fix the resulting build
issue.

Fixes: f859422075 (x86: PM: Make APM idle driver initialize polling state)
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-12 11:34:22 +01:00
Prashanth Prakash
db50a74d81 cpuidle: Add new macro to enter a retention idle state
If a CPU is entering a low power idle state where it doesn't lose any
context, then there is no need to call cpu_pm_enter()/cpu_pm_exit().
Add a new macro(CPU_PM_CPU_IDLE_ENTER_RETENTION) to be used by cpuidle
drivers when they are entering retention state. By not calling
cpu_pm_enter and cpu_pm_exit we reduce the latency involved in
entering and exiting the retention idle states.

CPU_PM_CPU_IDLE_ENTER_RETENTION assumes that no state is lost and
hence CPU PM notifiers will not be called. We may need a broader
change if we need to support partial retention states effeciently.

On ARM64 based Qualcomm Server Platform we measured below overhead for
for calling cpu_pm_enter and cpu_pm_exit for retention states.

workload: stress --hdd #CPUs --hdd-bytes 32M  -t 30
        Average overhead of cpu_pm_enter - 1.2us
        Average overhead of cpu_pm_exit  - 3.1us

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-02 13:48:55 +00:00
Rafael J. Wysocki
7b01463e51 Merge branch 'pm-sleep'
* pm-sleep:
  ACPI / PM: Check low power idle constraints for debug only
  PM / s2idle: Rename platform operations structure
  PM / s2idle: Rename ->enter_freeze to ->enter_s2idle
  PM / s2idle: Rename freeze_state enum and related items
  PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE
  ACPI / PM: Prefer suspend-to-idle over S3 on some systems
  platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle
  PM / suspend: Define pr_fmt() in suspend.c
  PM / suspend: Use mem_sleep_labels[] strings in messages
  PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG
  PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq()
  PM / core: Add error argument to dpm_show_time()
  PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq()
  PM / s2idle: Rearrange the main suspend-to-idle loop
  PM / timekeeping: Print debug messages when requested
  PM / sleep: Mark suspend/hibernation start and finish
  PM / sleep: Do not print debug messages by default
  PM / suspend: Export pm_suspend_target_state
2017-09-04 00:06:02 +02:00
Rafael J. Wysocki
1b39e3f813 cpuidle: Make drivers initialize polling state
Make the drivers that want to include the polling state into their
states table initialize it explicitly and drop the initialization of
it (which in fact is conditional, but that is not obvious from the
code) from the core.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-30 03:06:33 +02:00
Rafael J. Wysocki
34c2f65b71 cpuidle: Move polling state initialization code to separate file
Move the polling state initialization code to a separate file built
conditionally on CONFIG_ARCH_HAS_CPU_RELAX to get rid of the #ifdef
in driver.c.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-30 03:06:27 +02:00
Rafael J. Wysocki
dc2251bf98 cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol
On some architectures the first (index 0) idle state is a polling
one and it doesn't really save energy, so there is the
CPUIDLE_DRIVER_STATE_START symbol allowing some pieces of
cpuidle code to avoid using that state.

However, this makes the code rather hard to follow.  It is better
to explicitly avoid the polling state, so add a new cpuidle state
flag CPUIDLE_FLAG_POLLING to mark it and make the relevant code
check that flag for the first state instead of using the
CPUIDLE_DRIVER_STATE_START symbol.

In the ACPI processor driver that cannot always rely on the state
flags (like before the states table has been set up) define
a new internal symbol ACPI_IDLE_STATE_START equivalent to the
CPUIDLE_DRIVER_STATE_START one and drop the latter.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-30 03:05:29 +02:00
Rafael J. Wysocki
28ba086ed3 PM / s2idle: Rename ->enter_freeze to ->enter_s2idle
Rename the ->enter_freeze cpuidle driver callback to ->enter_s2idle
to make it clear that it is used for entering suspend-to-idle and
rename the related functions, variables and so on accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-11 01:29:56 +02:00
Gautham R. Shenoy
9e9fc6f00a cpuidle:powernv: Add helper function to populate powernv idle states.
In the current code for powernv_add_idle_states, there is a lot of code
duplication while initializing an idle state in powernv_states table.

Add an inline helper function to populate the powernv_states[] table
for a given idle state. Invoke this for populating the "Nap",
"Fastsleep" and the stop states in powernv_add_idle_states.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 08:32:13 +11:00
Jacob Pan
bb8313b603 cpuidle: Allow enforcing deepest idle state selection
When idle injection is used to cap power, we need to override the
governor's choice of idle states.

For this reason, make it possible the deepest idle state selection to
be enforced by setting a flag on a given CPU to achieve the maximum
potential power draw reduction.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 14:02:21 +01:00
Daniel Lezcano
e5f1b24587 cpuidle: governors: Remove remaining old module code
The governor's code use try_module_get() and put_module() to refcount
the governor's module. But the governors are not compiled as module.

The refcount does not prevent to switch the governor or unload
a module as they aren't compiled as modules. The code is pointless,
so remove it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-10-21 14:49:51 +02:00
Sudeep Holla
220276e09b cpuidle: introduce CPU_PM_CPU_IDLE_ENTER macro for ARM{32, 64}
The function arm_enter_idle_state is exactly the same in both generic
ARM{32,64} CPUIdle driver and will be the same even on ARM64 backend
for ACPI processor idle driver. So we can unify it and move it to a
common place by introducing CPU_PM_CPU_IDLE_ENTER macro that can be
used in all places avoiding duplication.

This is in preparation of reuse of the generic cpuidle entry function
for ACPI LPI support on ARM64.

Suggested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 23:29:38 +02:00
Catalin Marinas
9bd616e3db cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE
The cpuidle_devices per-CPU variable is only defined when CPU_IDLE is
enabled. Commit c8cc7d4de7 ("sched/idle: Reorganize the idle loop")
removed the #ifdef CONFIG_CPU_IDLE around cpuidle_idle_call() with the
compiler optimising away __this_cpu_read(cpuidle_devices). However, with
CONFIG_UBSAN && !CONFIG_CPU_IDLE, this optimisation no longer happens
and the kernel fails to link since cpuidle_devices is not defined.

This patch introduces an accessor function for the current CPU cpuidle
device (returning NULL when !CONFIG_CPU_IDLE) and uses it in
cpuidle_idle_call().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-02 23:05:27 +02:00
Xunlei Pang
ba6a860d41 cpuidle/coupled: Remove cpuidle_device::safe_state_index
cpuidle_device::safe_state_index need to be initialized before
use, it should be the same as cpuidle_driver::safe_state_index.

We tackled this issue by removing the safe_state_index from the
cpuidle_device structure and use the one in the cpuidle_driver
structure instead.

Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-08-28 15:14:54 +02:00
Rafael J. Wysocki
ab232ba570 Merge branches 'pm-sleep' and 'pm-runtime'
* pm-sleep:
  PM / sleep: trace_device_pm_callback coverage in dpm_prepare/complete
  PM / wakeup: add a dummy wakeup_source to record statistics
  PM / sleep: Make suspend-to-idle-specific code depend on CONFIG_SUSPEND
  PM / sleep: Return -EBUSY from suspend_enter() on wakeup detection
  PM / tick: Add tracepoints for suspend-to-idle diagnostics
  PM / sleep: Fix symbol name in a comment in kernel/power/main.c
  leds / PM: fix hibernation on arm when gpio-led used with CPU led trigger
  ARM: omap-device: use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS
  bus: omap_l3_noc: add missed callbacks for suspend-to-disk
  PM / sleep: Add macro to define common noirq system PM callbacks
  PM / sleep: Refine diagnostic messages in enter_state()
  PM / wakeup: validate wakeup source before activating it.

* pm-runtime:
  PM / Runtime: Update last_busy in rpm_resume
  PM / runtime: add note about re-calling in during device probe()
2015-06-19 01:18:02 +02:00
Rafael J. Wysocki
87e9b9f1d8 PM / sleep: Make suspend-to-idle-specific code depend on CONFIG_SUSPEND
Since idle_should_freeze() is defined to always return 'false'
for CONFIG_SUSPEND unset, all of the code depending on it in
cpuidle_idle_call() is not necessary in that case.

Make that code depend on CONFIG_SUSPEND too to avoid building it
when it is not going to be used.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-19 02:44:24 +02:00
Rafael J. Wysocki
827a5aefc5 sched / idle: Call default_idle_call() from cpuidle_enter_state()
The check of the cpuidle_enter() return value against -EBUSY
made in call_cpuidle() will not be necessary any more if
cpuidle_enter_state() calls default_idle_call() directly when it
is about to return -EBUSY, so make that happen and eliminate the
check.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
2015-05-14 21:37:47 +02:00
Rafael J. Wysocki
faad384928 sched / idle: Call idle_set_state() from cpuidle_enter_state()
Introduce a wrapper function around idle_set_state() called
sched_idle_set_state() that will pass this_rq() to it as the
first argument and make cpuidle_enter_state() call the new
function before and after entering the target state.

At the same time, remove direct invocations of idle_set_state()
from call_cpuidle().

This will allow the invocation of default_idle_call() to be
moved from call_cpuidle() to cpuidle_enter_state() safely
and call_cpuidle() to be simplified a bit as a result.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
2015-05-14 21:35:10 +02:00
Bartlomiej Zolnierkiewicz
d75e4af14e cpuidle: remove state_count field from struct cpuidle_device
Thomas Schlichter reports the following issue on his Samsung NC20:

"The C-states C1 and C2 to the OS when connected to AC, and additionally
 provides the C3 C-state when disconnected from AC.  However, the number
 of C-states shown in sysfs is fixed to the number of C-states present
 at boot.
   If I boot with AC connected, I always only see the C-states up to C2
   even if I disconnect AC.

   The reason is commit 130a5f6924 (ACPI / cpuidle: remove dev->state_count
   setting).  It removes the update of dev->state_count, but sysfs uses
   exactly this variable to show the C-states.

   The fix is to use drv->state_count in sysfs.  As this is currently the
   last user of dev->state_count, this variable can be completely removed."

Remove dev->state_count as per the above.

Reported-by: Thomas Schlichter <thomas.schlichter@web.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-03 13:15:50 +02:00
Rafael J. Wysocki
ef2b22ac54 cpuidle / sleep: Use broadcast timer for states that stop local timer
Commit 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
overlooked the fact that entering some sufficiently deep idle states
by CPUs may cause their local timers to stop and in those cases it
is necessary to switch over to a broadcast timer prior to entering
the idle state.  If the cpuidle driver in use does not provide
the new ->enter_freeze callback for any of the idle states, that
problem affects suspend-to-idle too, but it is not taken into account
after the changes made by commit 3810631332.

Fix that by changing the definition of cpuidle_enter_freeze() and
re-arranging of the code in cpuidle_idle_call(), so the former does
not call cpuidle_enter() any more and the fallback case is handled
by cpuidle_idle_call() directly.

Fixes: 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
Reported-and-tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-03-05 23:13:19 +01:00
Rafael J. Wysocki
124cf9117c PM / sleep: Make it possible to quiesce timers during suspend-to-idle
The efficiency of suspend-to-idle depends on being able to keep CPUs
in the deepest available idle states for as much time as possible.
Ideally, they should only be brought out of idle by system wakeup
interrupts.

However, timer interrupts occurring periodically prevent that from
happening and it is not practical to chase all of the "misbehaving"
timers in a whack-a-mole fashion.  A much more effective approach is
to suspend the local ticks for all CPUs and the entire timekeeping
along the lines of what is done during full suspend, which also
helps to keep suspend-to-idle and full suspend reasonably similar.

The idea is to suspend the local tick on each CPU executing
cpuidle_enter_freeze() and to make the last of them suspend the
entire timekeeping.  That should prevent timer interrupts from
triggering until an IO interrupt wakes up one of the CPUs.  It
needs to be done with interrupts disabled on all of the CPUs,
though, because otherwise the suspended clocksource might be
accessed by an interrupt handler which might lead to fatal
consequences.

Unfortunately, the existing ->enter callbacks provided by cpuidle
drivers generally cannot be used for implementing that, because some
of them re-enable interrupts temporarily and some idle entry methods
cause interrupts to be re-enabled automatically on exit.  Also some
of these callbacks manipulate local clock event devices of the CPUs
which really shouldn't be done after suspending their ticks.

To overcome that difficulty, introduce a new cpuidle state callback,
->enter_freeze, that will be guaranteed (1) to keep interrupts
disabled all the time (and return with interrupts disabled) and (2)
not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
look for the deepest available idle state with ->enter_freeze present
and to make the CPU execute that callback with suspended tick (and the
last of the online CPUs to execute it with suspended timekeeping).

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-15 19:40:09 +01:00
Rafael J. Wysocki
3810631332 PM / sleep: Re-implement suspend-to-idle handling
In preparation for adding support for quiescing timers in the final
stage of suspend-to-idle transitions, rework the freeze_enter()
function making the system wait on a wakeup event, the freeze_wake()
function terminating the suspend-to-idle loop and the mechanism by
which deep idle states are entered during suspend-to-idle.

First of all, introduce a simple state machine for suspend-to-idle
and make the code in question use it.

Second, prevent freeze_enter() from losing wakeup events due to race
conditions and ensure that the number of online CPUs won't change
while it is being executed.  In addition to that, make it force
all of the CPUs re-enter the idle loop in case they are in idle
states already (so they can enter deeper idle states if possible).

Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
checks in cpuidle_select() and cpuidle_reflect() with a single
suspend-to-idle state check in cpuidle_idle_call().

Finally, introduce cpuidle_enter_freeze() that will simply find the
deepest idle state available to the given CPU and enter it using
cpuidle_enter().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-13 23:49:36 +01:00
Len Brown
62c4cf97e8 cpuidle / ACPI: remove unused CPUIDLE_FLAG_TIME_INVALID
CPUIDLE_FLAG_TIME_INVALID is no longer checked
by menu or ladder cpuidle governors, so don't
bother setting or defining it.

It was originally invented to account for the fact that
acpi_safe_halt() enables interrupts to invoke HLT.
That would allow interrupt service routines to be included
in the last_idle duration measurements made in cpuidle_enter_state(),
potentially returning a duration much larger than reality.

But menu and ladder can gracefully handle erroneously large duration
intervals without checking for CPUIDLE_FLAG_TIME_INVALID.
Further, if they don't check CPUIDLE_FLAG_TIME_INVALID, they
can also benefit from the instances when the duration interval
is not erroneously large.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-17 02:26:28 +01:00
Daniel Lezcano
b82b6cca48 cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logic
The only place where the time is invalid is when the ACPI_CSTATE_FFH entry
method is not set. Otherwise for all the drivers, the time can be correctly
measured.

Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the drivers
for all the states, just invert the logic by replacing it by the flag
CPUIDLE_FLAG_TIME_INVALID, hence we can set this flag only for the acpi idle
driver, remove the former flag from all the drivers and invert the logic with
this flag in the different governor.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-12 21:17:27 +01:00
Linus Torvalds
82abb273d8 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 - three fixes for 3.15 that didn't make it in time
 - limited Octeon 3 support.
 - paravirtualization support
 - improvment to platform support for Netlogix SOCs.
 - add support for powering down the Malta eval board in software
 - add many instructions to the in-kernel microassembler.
 - add support for the BPF JIT.
 - minor cleanups of the BCM47xx code.
 - large cleanup of math emu code resulting in significant code size
   reduction, better readability of the code and more accurate
   emulation.
 - improvments to the MIPS CPS code.
 - support C3 power status for the R4k count/compare clock device.
 - improvments to the GIO support for older SGI workstations.
 - increase number of supported CPUs to 256; this can be reached on
   certain embedded multithreaded ccNUMA configurations.
 - various small cleanups, updates and fixes

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (173 commits)
  MIPS: IP22/IP28: Improve GIO support
  MIPS: Octeon: Add twsi interrupt initialization for OCTEON 3XXX, 5XXX, 63XX
  DEC: Document the R4k MB ASIC mini interrupt controller
  DEC: Add self as the maintainer
  MIPS: Add microMIPS MSA support.
  MIPS: Replace calls to obsolete strict_strto call with kstrto* equivalents.
  MIPS: Replace obsolete strict_strto call with kstrto
  MIPS: BFP: Simplify code slightly.
  MIPS: Call find_vma with the mmap_sem held
  MIPS: Fix 'write_msa_##' inline macro.
  MIPS: Fix MSA toolchain support detection.
  mips: Update the email address of Geert Uytterhoeven
  MIPS: Add minimal defconfig for mips_paravirt
  MIPS: Enable build for new system 'paravirt'
  MIPS: paravirt: Add pci controller for virtio
  MIPS: Add code for new system 'paravirt'
  MIPS: Add functions for hypervisor call
  MIPS: OCTEON: Add OCTEON3 to __get_cpu_type
  MIPS: Add function get_ebase_cpunum
  MIPS: Add minimal support for OCTEON3 to c-r4k.c
  ...
2014-06-09 18:10:34 -07:00
Paul Burton
f08dbf8a61 cpuidle: declare cpuidle_dev in cpuidle.h
Declaring this allows drivers which need to initialise each struct
cpuidle_device at initialisation time to make use of the structures
already defined in cpuidle.c, rather than having to wastefully define
their own.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
2014-05-28 16:20:35 +01:00
Rafael J. Wysocki
a6220fc19a PM / suspend: Always use deepest C-state in the "freeze" sleep state
If freeze_enter() is called, we want to bypass the current cpuidle
governor and always use the deepest available (that is, not disabled)
C-state, because we want to save as much energy as reasonably possible
then and runtime latency constraints don't matter at that point, since
the system is in a sleep state anyway.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Aubrey Li <aubrey.li@linux.intel.com>
2014-05-07 01:49:28 +02:00
Rafael J. Wysocki
52c324f8a8 cpuidle: Combine cpuidle_enabled() with cpuidle_select()
Since both cpuidle_enabled() and cpuidle_select() are only called by
cpuidle_idle_call(), it is not really useful to keep them separate
and combining them will help to avoid complicating cpuidle_idle_call()
even further if governors are changed to return error codes sometimes.

This code modification shouldn't lead to any functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-01 00:13:47 +02:00
Daniel Lezcano
c8cc7d4de7 sched/idle: Reorganize the idle loop
Now that we have the main cpuidle function in idle.c, move some code from
the idle mainloop to this function for the sake of clarity.

That removes if then else indentation difficult to follow when looking at the
code. This patch does not change the current behavior.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-3-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-11 11:52:47 +01:00
Daniel Lezcano
30cdd69e2a cpuidle/idle: Move the cpuidle_idle_call function to idle.c
The cpuidle_idle_call does nothing more than calling the three individuals
function and is no longer used by any arch specific code but only in the
cpuidle framework code.

We can move this function into the idle task code to ensure better
proximity to the scheduler code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-2-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-11 11:52:45 +01:00
Daniel Lezcano
907e30f1bb idle/cpuidle: Split cpuidle_idle_call main function into smaller functions
In order to allow better integration between the cpuidle framework and the
scheduler, reducing the distance between these two sub-components will
facilitate this integration by moving part of the cpuidle code in the idle
task file and, because idle.c is in the sched directory, we have access to
the scheduler's private structures.

This patch splits the cpuidle_idle_call main entry function into 3 calls
to a newly added API:

 1. select the idle state
 2. enter the idle state
 3. reflect the idle state

The cpuidle_idle_call calls these three functions to implement the main
idle entry function.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: rjw@rjwysocki.net
Cc: preeti@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1393832934-11625-1-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-11 11:52:45 +01:00
Viresh Kumar
f60e230f6b cpuidle: remove cpuidle_unregister_governor()
cpuidle_unregister_governor() and cpuidle_replace_governor() aren't
used anymore and can be removed. They were used by cpufreq governors
earlier, but since the governors can't be compiled as modules any
more, these two functions aren't necessary.

Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-30 01:21:24 +01:00
Viresh Kumar
6587fca230 cpuidle: fix indentation of cpumask
Use tabs for cpumask indentation in struct cpuidle_driver.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-30 01:21:21 +01:00
Daniel Lezcano
1a7064380e cpuidle: Add missing forward declarations of structures
Add missing forward declarations of struct cpuidle_state_kobj and
struct cpuidle_driver_kobj in cpuidle.h.

[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-15 02:09:47 +02:00
Daniel Lezcano
728ce22b69 cpuidle: Make cpuidle's sysfs directory dynamically allocated
The cpuidle sysfs code is designed to have a single instance of per
CPU cpuidle directory.  It is not possible to remove the sysfs entry
and create it again.  This is not a problem with the current code but
future changes will add CPU hotplug support to enable/disable the
device, so it will need to remove the sysfs entry like other
subsystems do.  That won't be possible without this change, because
the kobj is a static object which can't be reused for
kobj_init_and_add().

Add cpuidle_device_kobj to be allocated dynamically when
adding/removing a sysfs entry which is consistent with the other
cpuidle's sysfs entries.

An added benefit is that the sysfs code is now more self-contained
and the includes needed for sysfs can be moved from cpuidle.h
directly into sysfs.c so as to reduce the total number of headers
dragged along with cpuidle.h.

[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-15 02:09:47 +02:00
Daniel Lezcano
82467a5a88 cpuidle: simplify multiple driver support
Commit bf4d1b5 (cpuidle: support multiple drivers) introduced support
for using multiple cpuidle drivers at the same time.  It added a
couple of new APIs to register the driver per CPU, but that led to
some unnecessary code complexity related to the kernel config options
deciding whether or not the multiple driver support is enabled.  The
code has to work as it did before when the multiple driver support is
not enabled and the multiple driver support has to be compatible with
the previously existing API.

Remove the new API, not used by any driver in the tree yet (but
needed for the HMP cpuidle drivers that will be submitted soon), and
add a new cpumask pointer to the cpuidle driver structure that will
point to the mask of CPUs handled by the given driver.  That will
allow the cpuidle_[un]register_driver() API to be used for the
multiple driver support along with the cpuidle_[un]register()
functions added recently.

[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-11 13:42:38 +02:00
Linus Torvalds
ac4e01093f Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull idle update from Len Brown:
 "Add support for new Haswell-ULT CPU idle power states"

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  intel_idle: initial C8, C9, C10 support
  tools/power turbostat: display C8, C9, C10 residency
2013-05-11 15:23:17 -07:00
Daniel Lezcano
4c637b2175 cpuidle: make a single register function for all
The usual scheme to initialize a cpuidle driver on a SMP is:

	cpuidle_register_driver(drv);
	for_each_possible_cpu(cpu) {
		device = &per_cpu(cpuidle_dev, cpu);
		cpuidle_register_device(device);
	}

This code is duplicated in each cpuidle driver.

On UP systems, it is done this way:

	cpuidle_register_driver(drv);
	device = &per_cpu(cpuidle_dev, cpu);
	cpuidle_register_device(device);

On UP, the macro 'for_each_cpu' does one iteration:

#define for_each_cpu(cpu, mask)                 \
        for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

Hence, the initialization loop is the same for UP than SMP.

Beside, we saw different bugs / mis-initialization / return code unchecked in
the different drivers, the code is duplicated including bugs. After fixing all
these ones, it appears the initialization pattern is the same for everyone.

Please note, some drivers are doing dev->state_count = drv->state_count. This is
not necessary because it is done by the cpuidle_enable_device function in the
cpuidle framework. This is true, until you have the same states for all your
devices. Otherwise, the 'low level' API should be used instead with the specific
initialization for the driver.

Let's add a wrapper function doing this initialization with a cpumask parameter
for the coupled idle states and use it for all the drivers.

That will save a lot of LOC, consolidate the code, and the modifications in the
future could be done in a single place. Another benefit is the consolidation of
the cpuidle_device variable which is now in the cpuidle framework and no longer
spread accross the different arch specific drivers.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-23 13:45:22 +02:00
Daniel Lezcano
554c06ba3e cpuidle: remove en_core_tk_irqen flag
The en_core_tk_irqen flag is set in all the cpuidle driver which
means it is not necessary to specify this flag.

Remove the flag and the code related to it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>  # for mach-omap2/*
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-23 13:45:22 +02:00
Len Brown
86239ceb33 intel_idle: initial C8, C9, C10 support
Allow intel_idle and cpuidle to utilize C8, C9, C10
when they are present on...
"Fourth Generation Intel(R) Core(TM) Processors",
which are based on Intel(R) microarchitecture code name Haswell.

Signed-off-by: Len Brown <len.brown@intel.com>
2013-04-17 19:23:32 -04:00
Daniel Lezcano
a06df062a1 cpuidle: initialize the broadcast timer framework
The commit 89878baa73f0f1c679355006bd8632e5d78f96c2 introduced
the CPUIDLE_FLAG_TIMER_STOP flag where we specify a specific idle
state stops the local timer.

Now use this flag to check at init time if one state will need
the broadcast timer and, in this case, setup the broadcast timer
framework. That prevents multiple code duplication in the drivers.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:10:28 +02:00
Daniel Lezcano
b60e6a0eb0 cpuidle : handle clockevent notify from the cpuidle framework
When a cpu enters a deep idle state, the local timers are stopped and
the time framework falls back to the timer device used as a broadcast
timer.

The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
when the idle state stops the local timer.

Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
drivers. If the flag is set, the cpuidle core code takes care of the
notification on behalf of the driver to avoid pointless code duplication.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:10:27 +02:00
Len Brown
57fa8273a2 cpuidle: remove vestage definition of cpuidle_state_usage.driver_data
This field is no longer used.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2013-02-11 18:49:51 -05:00
Daniel Lezcano
8aef33a7cf cpuidle: remove the power_specified field in the driver
We realized that the power usage field is never filled and when it
is filled for tegra, the power_specified flag is not set causing all
of these values to be reset when the driver is initialized with
set_power_state().

However, the power_specified flag can be simply removed under the
assumption that the states are always backward sorted, which is the
case with the current code.

This change allows the menu governor select function and the
cpuidle_play_dead() to be simplified.  Moreover, the
set_power_states() function can removed as it does not make sense
any more.

Drop the power_specified flag from struct cpuidle_driver and make
the related changes as described above.

As a consequence, this also fixes the bug where on the dynamic
C-states system, the power fields are not initialized.

[rjw: Changelog]
References: https://bugzilla.kernel.org/show_bug.cgi?id=42870
References: https://bugzilla.kernel.org/show_bug.cgi?id=43349
References: https://lkml.org/lkml/2012/10/16/518
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-15 14:18:04 +01:00