exit_mmap() is responsible for freeing the vast majority of an mm's
memory; in order to unblock Simple LMK faster, report an mm as freed as
soon as exit_mmap() finishes.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Commit c0ff7453bb ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") added a TIF_MEMDIE and PF_EXITING check, but
it checks the flags on the current task rather than the given one.
This doesn't make much sense and it is actually wrong. If the current
task which updates the nodemask of a cpuset got killed by the OOM killer
then a part of the cpuset cgroup processes would have incompatible
nodemask which is surprising to say the least.
The comment suggests the intention was to skip an oom victim or an
exiting task, so we should be checking the given task. But even then
it would be a layering violation, because it is up to the memory
allocator to interpret the meaning of TIF_MEMDIE. Simply drop both
checks. All tasks in the cpuset should simply follow the same mask.
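The inconsistency can be seen in a small userspace model (simplified and illustrative only; the real code walks the cgroup's tasks via the cpuset machinery, and the struct and function names here are stand-ins):

```c
#include <stddef.h>

struct task {
    unsigned long mems_allowed; /* bitmask of allowed memory nodes */
    int exiting;                /* task is an oom victim / exiting */
};

/* After the fix: apply the new nodemask to every task in the cpuset,
 * with no special-casing of exiting or OOM-killed tasks. */
static void update_tasks_nodemask(struct task *tasks, size_t n,
                                  unsigned long newmask)
{
    for (size_t i = 0; i < n; i++)
        tasks[i].mems_allowed = newmask;
}
```

With the dropped checks, no task is left behind with a stale, incompatible nodemask.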
Link: http://lkml.kernel.org/r/1467029719-17602-3-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
freezing_slow_path() is checking TIF_MEMDIE to skip OOM killed tasks.
It is, however, checking the flag on the current task rather than the
given one. This is really confusing because freezing() can also be
called on !current tasks. It would end up working correctly for its
main purpose because __refrigerator will always be called on the
current task, so the oom victim will never get frozen. But it could
lead to surprising results when a task which is freezing a cgroup gets
oom killed, because only part of the cgroup would get frozen. This is
highly unlikely but worth fixing, as the resulting code will be
clearer anyway.
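The fix can be illustrated with a minimal userspace model (the `task` struct, `current` pointer, and flag value here are simplified stand-ins for the kernel's, not the real API):

```c
#include <stdbool.h>
#include <stddef.h>

#define TIF_MEMDIE (1u << 0)

struct task {
    unsigned int flags;
};

struct task *current; /* the task executing right now */

/* Before the fix: the check used current, not the task passed in. */
static bool should_freeze_buggy(struct task *p)
{
    (void)p;
    return !(current->flags & TIF_MEMDIE);
}

/* After the fix: check the given task, which may be !current. */
static bool should_freeze_fixed(struct task *p)
{
    return !(p->flags & TIF_MEMDIE);
}
```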
Link: http://lkml.kernel.org/r/1467029719-17602-2-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The OOM killer sets the TIF_MEMDIE thread flag for its victims to alert
other kernel code that the current process was killed due to memory
pressure, and needs to finish whatever it's doing quickly. In the page
allocator this allows victim processes to quickly allocate memory using
emergency reserves. This is especially important when memory pressure is
high; if all processes are taking a while to allocate memory, then our
victim processes will face the same problem and can potentially get
stuck in the page allocator for a while rather than die expeditiously.
To ensure that victim processes die quickly, set TIF_MEMDIE for the
entire victim thread group.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
This is a complete low memory killer solution for Android that is small
and simple. Processes are killed according to the priorities that
Android gives them, so that the least important processes are always
killed first. Processes are killed until memory deficits are satisfied,
as observed from kswapd struggling to free up pages. Simple LMK stops
killing processes when kswapd finally goes back to sleep.
The only tunables are the desired amount of memory to be freed per
reclaim event and desired frequency of reclaim events. Simple LMK tries
to free at least the desired amount of memory per reclaim and waits
until all of its victims' memory is freed before proceeding to kill more
processes.
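The selection logic described above can be sketched as a userspace simulation (names, the priority field, and the scan loop are illustrative, not Simple LMK's actual code):

```c
#include <stddef.h>

struct proc {
    int adj;        /* Android importance: higher = less important */
    long rss_pages; /* memory the kill would free */
    int killed;
};

/* Kill least-important processes first until the memory deficit is
 * satisfied. Returns the number of pages expected to be freed. */
static long simple_lmk_scan(struct proc *procs, size_t n,
                            long pages_needed)
{
    long freed = 0;

    while (freed < pages_needed) {
        struct proc *victim = NULL;
        size_t i;

        /* pick the not-yet-killed process with the highest adj */
        for (i = 0; i < n; i++) {
            if (!procs[i].killed &&
                (!victim || procs[i].adj > victim->adj))
                victim = &procs[i];
        }
        if (!victim)
            break; /* nothing left to kill */

        victim->killed = 1;
        freed += victim->rss_pages;
    }
    return freed;
}
```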
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
The implementation is utterly broken, resulting in all processes being
allows to move tasks between sets (as long as they have access to the
"tasks" attribute), and upstream is heading towards checking only
capability anyway, so let's get rid of this code.
BUG=b:31790445,chromium:647994
TEST=Boot android container, examine logcat
Change-Id: I2f780a5992c34e52a8f2d0b3557fc9d490da2779
Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/394967
Reviewed-by: Ricky Zhou <rickyz@chromium.org>
Reviewed-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Currently, eBPF only understands BPF_JGT (>), BPF_JGE (>=),
BPF_JSGT (s>), BPF_JSGE (s>=) instructions, this means that
particularly *JLT/*JLE counterparts involving immediates need
to be rewritten from e.g. X < [IMM] by swapping arguments into
[IMM] > X, meaning the immediate first is required to be loaded
into a register Y := [IMM], such that then we can compare with
Y > X. Note that the destination operand is always required to
be a register.
This has the downside of unnecessarily increased register
pressure, meaning complex programs would need to spill other
registers temporarily to stack in order to obtain an unused
register for the [IMM]. Loading to registers will thus also
affect state pruning since we need to account for that register
use and potentially those registers that had to be spilled/filled
again. As a consequence slightly more stack space might have
been used due to spilling, and BPF programs are a bit longer
due to extra code involving the register load and potentially
required spill/fills.
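The operand swap described above can be modeled in plain C (a hedged illustration of the semantics, not actual BPF bytecode or the verifier's rewrite):

```c
#include <stdbool.h>

/* Without BPF_JLT, "X < IMM" must be rewritten: load IMM into a
 * scratch register Y, then test Y > X with BPF_JGT, since the
 * destination operand must be a register. */
static bool jump_taken_without_jlt(long x, long imm)
{
    long y = imm;  /* extra register load: Y := IMM */
    return y > x;  /* BPF_JGT Y, X */
}

/* With BPF_JLT the comparison is a single instruction: X < IMM,
 * with no scratch register needed. */
static bool jump_taken_with_jlt(long x, long imm)
{
    return x < imm; /* BPF_JLT X, IMM */
}
```

Both forms branch identically; the new opcodes simply avoid the register load and any spill/fill it forces.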
Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=)
counterparts to the eBPF instruction set. Modifying LLVM to
remove the NegateCC() workaround in a PoC patch at [1] and
allowing it to also emit the new instructions resulted in
cilium's BPF programs that are injected into the fast-path to
have a reduced program length in the range of 2-3% (e.g.
accumulated main and tail call sections from one of the object
files reduced from 4864 to 4729 insns), reduced complexity in
the range of 10-30% (e.g. accumulated sections reduced in one
of the cases from 116432 to 88428 insns), and reduced stack
usage in the range of 1-5% (e.g. accumulated sections from one
of the object files reduced from 824 to 784b).
The modification for LLVM will be incorporated in a backwards
compatible way. Plan is for LLVM to have i) a target specific
option to offer a possibility to explicitly enable the extension
by the user (as we have with -m target specific extensions today
for various CPU insns), and ii) have the kernel probed for the
presence of the extensions, enabling them transparently when
the user is selecting more aggressive options such as -march=native
in a bpf target context. (Other frontends generating BPF byte
code, e.g. ply can probe the kernel directly for its code
generation.)
[1] https://github.com/borkmann/llvm/tree/bpf-insns
Change-Id: Ic56500aaeaf5f3ebdfda094ad6ef4666c82e18c5
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Free up the BPF_JMP | BPF_CALL | BPF_X opcode to be used by an actual
indirect call by register, and use a kernel-internal opcode to
mark call instructions into the bpf_tail_call() helper.
Change-Id: I1a45b8e3c13848c9689ce288d4862935ede97fa7
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
that a single __weak function in the core that can be overridden
similarly to the eBPF one. Also remove stale pr_err() mentions
of bpf_jit_compile.
Change-Id: Iac221c09e9ae0879acdd7064d710c4f7cb8f478d
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
This is trivial to do:
- add flags argument to simple_rename()
- check that flags contains nothing other than RENAME_NOREPLACE
- assign simple_rename() to .rename2 instead of .rename
Filesystems converted:
hugetlbfs, ramfs, bpf.
Debugfs uses simple_rename() to implement debugfs_rename(), which is for
debugfs instances to rename files internally, not for userspace filesystem
access. For this case pass zero flags to simple_rename().
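The flags check amounts to something like the following (a simplified stand-in; RENAME_NOREPLACE's value and the -EINVAL convention mirror the kernel's, but this is not the kernel function itself):

```c
#include <errno.h>

#define RENAME_NOREPLACE (1 << 0)

/* Reject any rename flag other than RENAME_NOREPLACE, as
 * simple_rename() is taught to do for the .rename2 conversion. */
static int check_rename_flags(unsigned int flags)
{
    if (flags & ~RENAME_NOREPLACE)
        return -EINVAL;
    return 0; /* proceed with the rename */
}
```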
Change-Id: I1a46ece3b40b05c9f18fd13b98062d2a959b76a0
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Reinitialize rq->next_balance when a CPU is hot-added. Otherwise,
scheduler domain rebalancing may be skipped if rq->next_balance was
set to a future time when the CPU was last active, and the
newly-re-added CPU is in idle_balance(). As a result, the
newly-re-added CPU will remain idle with no tasks scheduled until the
softlockup watchdog runs - potentially 4 seconds later. This can
waste energy and reduce performance.
This behavior can be observed in some SoC kernels, which use CPU
hotplug to dynamically remove and add CPUs in response to load. In
one case that triggered this behavior,
0. the system started with all cores enabled, running multi-threaded
CPU-bound code;
1. the system entered some single-threaded code;
2. a CPU went idle and was hot-removed;
3. the system started executing a multi-threaded CPU-bound task;
4. the CPU from event 2 was re-added, to respond to the load.
The time interval between events 2 and 4 was approximately 300
milliseconds.
Of course, ideally CPU hotplug would not be used in this manner,
but this patch does appear to fix a real bug.
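A minimal model of the fix (simplified and illustrative; `jiffies`, the rq field, and the helper names are stand-ins for the kernel's):

```c
#include <stdbool.h>

static unsigned long jiffies; /* simplified global clock */

struct rq {
    unsigned long next_balance;
};

/* idle_balance() skips rebalancing while next_balance is in the
 * future (time-after comparison via signed subtraction). */
static bool idle_balance_would_run(struct rq *rq)
{
    return (long)(jiffies - rq->next_balance) >= 0;
}

/* Fix: on hot-add, pull next_balance back to now so a stale future
 * timestamp from the CPU's last active period cannot stall it. */
static void rq_online_fixup(struct rq *rq)
{
    rq->next_balance = jiffies;
}
```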
Nvidia folks: this patch is submitted as at least a partial fix for
bug 1243368 ("[sched] Load-balancing not happening correctly after
cores brought online")
Change-Id: Iabac21e110402bb581b7db40c42babc951d378d0
Signed-off-by: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/206918
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Amit Kamath <akamath@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-by: Diwakar Tundlam <dtundlam@nvidia.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
While setting the smpboot thread state to running, there is a
possibility that an IRQ fires on the same core and wakes up the
smpboot thread of that core, creating a self-deadlock. To avoid this,
protect the state update with spin_lock_irqsave().
Change-Id: I5eca9b27af94fee22af3bb201f26b63ed8930efe
Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Some drivers need to know what the status of the interrupt line is.
This is especially true for drivers that register a handler with
IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING and in the handler they
need to know which edge transition it was invoked for. Provide a way
for these handlers to read the logical status of the line after their
handler was invoked. If the line reads high it was called for a
rising edge and if the line reads low it was called for a falling edge.
The irq_read_line callback in the chip allows the controller to provide
the real time status of this line. Controllers that can read the status
of an interrupt line should implement this by doing necessary
hardware reads and return the logical state of the line.
Interrupt controllers based on the slow bus architecture should conduct
the transaction in this callback. The genirq code will call the chip's
bus lock prior to calling irq_read_line. Obviously since the transaction
would be completed before returning from irq_read_line it need not do
any transactions in the bus unlock call.
Change-Id: I3c8746706530bba14a373c671d22ee963b84dfab
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Currently, there exists a corner case when there is only one
clocksource, e.g. the RTC, and the system fails to enter suspend
mode. On resume, rtc_resume() injects the sleep time because
timekeeping_rtc_skipresume() returned 'false' (the default value of
sleeptime_injected), due to which we can see a mismatch in
timestamps.
This issue can also occur on a system where more than one clocksource
is present and the very first suspend fails.
Success case:
------------
{sleeptime_injected=false}
rtc_suspend() => timekeeping_suspend() => timekeeping_resume() =>
(sleeptime injected)
rtc_resume()
Failure case:
------------
{failure in sleep path} {sleeptime_injected=false}
rtc_suspend() => rtc_resume()
{sleeptime injected again which was not required as the suspend failed}
Fix this by handling the boolean logic properly.
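The boolean fix can be modeled as follows (a simplified simulation; the actual patch reworks the injection flag inside timekeeping, and these helper names are illustrative):

```c
#include <stdbool.h>

/* Set when timekeeping actually suspended and the sleep interval
 * still needs to be accounted for on resume. */
static bool suspend_timing_needed;

static void timekeeping_suspend(void)
{
    suspend_timing_needed = true;
}

/* rtc_resume() asks whether it should inject sleep time. If the
 * suspend path failed before reaching timekeeping_suspend() (the
 * failure case above), nothing must be injected. */
static bool rtc_may_inject_sleeptime(void)
{
    if (suspend_timing_needed) {
        suspend_timing_needed = false;
        return true;
    }
    return false;
}
```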
Change-Id: I7ac5210ec326b41f4d36bb87209b667f21f3aa50
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Git-commit: f473e5f467f6049370575390b08dc42131315d60
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
CPUs that are idle are excellent candidates for latency sensitive or
high-performance tasks. Decrementing their capacity while they are idle
will result in these CPUs being chosen less, and they will prefer to
schedule smaller tasks instead of large ones. Disable this.
Signed-off-by: Tyler Nijmeh <tylernij@gmail.com>
Signed-off-by: clarencelol <clarencekuiek@icloud.com>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Even the interactive governor utilizes a realtime priority. It is
beneficial for schedutil to process its workload at a priority >= that
of mundane tasks (KGSL, audio, etc.).
Signed-off-by: Tyler Nijmeh <tylernij@gmail.com>
Signed-off-by: clarencelol <clarencekuiek@icloud.com>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
The schedtune cgroup controller allows up to 5 cgroups including the
default/root cgroup. Until now, user space created only 4 additional
cgroups, namely foreground, background, top-app and audio-app.
Recently another cgroup called rt was created before the audio-app
cgroup. Since the kernel limits the cgroups to 5, creation of the
audio-app cgroup is failing. Fix this by increasing the schedtune
cgroup controller's cgroup limit to 6.
Change-Id: I13252a90dba9b8010324eda29b8901cb0b20bc21
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Inform scheduler about capacity restrictions, such as during frequency
boosting.
Change-Id: Ic65bede69608acf8ca3f144f144049a4392a70f6
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
We should apply the iowait boost only if cpufreq policy has iowait boost
enabled. Also make it a schedutil configuration from sysfs so it can be
turned on/off if needed (by default initialize it to the policy value).
For systems that don't need/want it enabled, such as those on arm64
based mobile devices that are battery operated, it saves energy when the
cpufreq driver policy doesn't have it enabled (details below):
Here are some results for energy measurements collected running a
YouTube video for 30 seconds:
Before: 8.042533 mWh
After: 7.948377 mWh
Energy savings are ~1.2%
Bug: 38010527
Link: https://lkml.org/lkml/2017/5/19/42
Change-Id: If124076ad0c16ade369253840dedfbf870aff927
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Schedtune boosted tasks are biased to higher capacity CPUs by default.
Add a sched feature to enable/disable this behaviour.
Change-Id: I3500675c182f3929e893dbb33850fe033db6c146
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
We now need to pass the functions a boost slot argument. Also we rename
the functions to reflect that we intend to perform sched_boost.
Change-Id: I84a63aea2c9035267095762804efabf7be6c66d5
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Switch from a counter-based system to a slot-based system for managing
multiple dynamic Schedtune boost requests.
The primary limitation of the counter-based system was that it could
only keep track of two boost values at a time: the current dynamic
boost value and the default boost value. When more than one boost
request is issued, the system remembers only the highest value of
them all. Even after the task that requested the highest value has
unboosted, that value is still maintained as long as other active
boosts are still running. A more ideal outcome would be for the
system to unboost to the maximum boost value of the remaining active
boosts.
The slot-based system provides a solution to the problem by keeping
track of the boost values of all ongoing active boosts. It ensures that
the current boost value will be equal to the maximum boost value of
all ongoing active boosts. This is achieved with two linked lists
(active_boost_slots and available_boost_slots), which assign and keep
track of boost slot numbers for each successful boost request. The boost
value of each request is stored in an array (slot_boost[]), at an index
value equal to the assigned boost slot number.
For now we limit the number of active boost slots to 5 per Schedtune
group.
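A condensed userspace model of the slot bookkeeping described above (the list handling is collapsed into a flat active-flag array for brevity; `slot_boost[]` follows the description but the code is illustrative, not the actual implementation):

```c
#define MAX_BOOST_SLOTS 5

static int slot_boost[MAX_BOOST_SLOTS];  /* boost value per slot */
static int slot_active[MAX_BOOST_SLOTS]; /* nonzero while boost active */

/* Claim a free slot for a new boost request; returns the slot
 * number, or -1 if all slots are busy. */
static int boost_slot_acquire(int boost)
{
    for (int i = 0; i < MAX_BOOST_SLOTS; i++) {
        if (!slot_active[i]) {
            slot_active[i] = 1;
            slot_boost[i] = boost;
            return i;
        }
    }
    return -1;
}

static void boost_slot_release(int slot)
{
    slot_active[slot] = 0;
}

/* The effective boost is the maximum over all active slots, so
 * releasing one request falls back to the next-highest boost. */
static int current_boost(void)
{
    int max = 0;
    for (int i = 0; i < MAX_BOOST_SLOTS; i++)
        if (slot_active[i] && slot_boost[i] > max)
            max = slot_boost[i];
    return max;
}
```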
Change-Id: Iadc738fc919af092fd4c1b6312becf9567bc4c62
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
To reflect that the function is to be used mainly with CAF's devices
that have sched_boost. However, developers may use it as a switch to
dynamically boost schedtune to the values specified in
/dev/stune/*/schedtune.sched_boost.
Change-Id: I5012273e5572c6091a99a6954452bed3a2501c55
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
This was confusing to deal with given that it had the same name as the
Dynamic Schedtune Boost framework. It will be more apt to call it
sched_boost given that it was created to work with the sched_boost
feature in CAF devices.
The new tunable can be found in /dev/stune/*/schedtune.sched_boost
Change-Id: Iafa3e35ef7c7991f09595ba452d8050ddc694743
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
It does not make sense to be unable to reset Schedtune boost for a
particular Schedtune group if another Schedtune group's boost is still
active. Instead of using a global count, we should use a per-Schedtune
group count to keep track of active boosts taking place.
Change-Id: Ic47ccd2582dbb31aa245a13d301ddf538b0d318b
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
We will need to take care to ensure that every do_stune_boost() we call
is followed eventually by a reset_stune_boost() so that
stune_boost_count is managed correctly.
This allows us to stack several Dynamic Schedtune Boosts and reset only
when all Dynamic Schedtune Boosts have been disengaged.
Change-Id: I09b739e4503930eaf0e3f14870758b21ce9868f5
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Boost top-app SchedTune tasks using the dynamic_boost value when
/proc/sys/kernel/sched_boost is activated. This is usually triggered by
CAF's perf daemon.
Change-Id: I23f0e7822673230288fbaeda0a7f4aa8546bf7d3
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
We will use this in conjunction with CAF's perf daemon to somewhat
replicate core_ctl's sched_boost capabilities.
Credits to the developers at Codeaurora for the code.
Change-Id: Ifc4f76e02eed97ac2c5fc8c9a60e56c09aed6578
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Add a simple function to activate Dynamic Schedtune Boost and use the
dynamic_boost value of the SchedTune CGroup.
Change-Id: I106c1ad169419a575df400fc511b4be046b52152
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
For added flexibility and in preparation for introducing another function.
Change-Id: Ic95ba54e1549b0b70222c82a5ee1e164340e3258
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
This is to reduce confusion when we create a new dynamic_boost_write()
function in future patches.
Change-Id: I0cef57875a193034ce4a7dab6769449c9c0cda8a
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Provide functions to activate and reset SchedTune boost:
int do_stune_boost(char *st_name, int boost);
int reset_stune_boost(char *st_name);
Change-Id: Id3f93a63b7a94a08b124cb304bc0ffe9cc889d7a
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
This patch fixes one of the infrequent conditions in
commit 54b6baeca500 ("sched/fair: Skip frequency updates if CPU about to idle")
where we could have skipped a frequency update. The fix is to use the
correct flag which skips freq updates.
Note that this is a rare issue (can show up only during CFS throttling)
and even then we just do an additional frequency update which we were
doing anyway before the above patch.
Bug: 64689959
Change-Id: I0117442f395cea932ad56617065151bdeb9a3b53
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
CPU rq util updates happen when rq signals are updated as part of
enqueue and dequeue operations. Doing these updates triggers a call to
the registered util update handler, which takes schedtune boosting
into account. Enqueueing the task in the correct schedtune group after
this happens means that we will potentially not see the boost for an
entire throttle period.
Move the enqueue/dequeue operations for schedtune before the signal
updates which can trigger OPP changes.
Change-Id: I4236e6b194bc5daad32ff33067d4be1987996780
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
If the CPU is about to idle, prevent a frequency update. With this,
the number of schedutil governor wake-ups is reduced by more than
half in a test playing Bluetooth audio.
Test: sugov wake ups drop by more than half when playing music with
screen off (476 / 1092)
Bug: 64689959
Change-Id: I400026557b4134c0ac77f51c79610a96eb985b4a
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
One SoC can have multiple CPU speedbins, which cannot be represented
with the current energy model due to its fixed capacity per CPU
frequency step.
Provide all of a CPU's possible frequency steps, instead of
capacities, along with the corresponding energy costs, to be able to
support different speedbins.
Change-Id: I96ff01372da5c383cd3172999ea1dcf95a7862ce
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: therootlord <igor_cestari@hotmail.com>
[kdrag0n: added missing sched_feat(ENERGY_AWARE) check]
Signed-off-by: kdrag0n <dragon@khronodragon.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
The CPU selection process for a prefer_idle task either minimizes or
maximizes the CPU capacity for idle CPUs depending on the task being
boosted or not.
Given that we are iterating through all CPUs, additionally filter the
choice by preferring a CPU in a more shallow idle state. This will
provide both a faster wake-up for the task and higher energy efficiency,
by allowing CPUs in deeper idle states to remain idle.
Change-Id: Ic55f727a0c551adc0af8e6ee03de6a41337a571b
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
The CPU selection for non-latency-sensitive tasks targets an active
CPU in the little cluster. The shallowest c-state CPU is stored as
a backup. However, if all CPUs in the little cluster are idle, we
pick an active CPU in the BIG cluster as the target CPU. This incorrect
choice of the target CPU may not get corrected by the
select_energy_cpu_idx() depending on the energy difference between
previous CPU and target CPU.
This can be fixed easily by maintaining the same variable that tracks
maximum capacity of the traversed CPU for both idle and active CPUs.
Change-Id: I3efb8bc82ff005383163921ef2bd39fcac4589ad
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
Given that we have a few sites where the spare capacity of a CPU is
calculated as the difference between the original capacity of the CPU
and its computed new utilization, let's unify the calculation and use
that value tracked with a local spare_cap variable.
Change-Id: I78daece7543f78d4f74edbee5e9ceb62908af507
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
For !prefer_idle tasks we want to minimize capacity_orig to bias their
scheduling towards more energy efficient CPUs. This does not happen in
the current code for boosted tasks due to the order of CPUs considered
(from big CPUs to LITTLE CPUs), and to the shallow idle state and
spare capacity maximization filters, which are used to select the best
idle backup CPU and the best active CPU candidates.
Let's fix this by enabling the above filters only when we are within
same capacity CPUs.
Taking each of the two cases in turn:
1. Selection of a backup idle CPU - Non prefer_idle tasks should prefer
more energy efficient CPUs when there are idle CPUs in the system,
independent of the order given by the presence of a boosted margin.
This is the behaviour for !sysctl_sched_cstate_aware, and it should
be the behaviour when sysctl_sched_cstate_aware is set as well,
given that we should prefer a more efficient CPU even if it's in a
deeper idle state.
2. Selection of an active target CPU: There is no reason for boosted
tasks to benefit from a higher chance of being placed on a big CPU,
which is what ordering CPUs from bigs to littles provides.
The other mechanism in place for boosted tasks (making sure we
select a CPU that fits the task) is enough for the non latency
sensitive case. Also, by choosing a CPU with maximum spare capacity
we also cover the preference towards spreading tasks, rather than
packing them, which improves the chances for tasks to get better
performance due to potential reduced preemption. Therefore, prefer
more energy efficient CPUs and only consider spare capacity for CPUs
with equal capacity_orig.
Change-Id: I3b97010e682674420015e771f0717192444a63a2
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com>
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Reported-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
find_best_target is currently split into code handling latency sensitive
tasks and code handling non-latency sensitive tasks based on the value
of the prefer_idle flag.
Another differentiation is done for boosted tasks, preferring to start
with higher-capacity CPU when boosted, and with more efficient CPUs when
not boosted. This additional differentiation is obtained by imposing an
order when considering CPUs for selection. This order is determined in
typical big.LITTLE systems by the start point (the CPU with the maximum
or minimum capacity) and by the order of big and little CPU groups
provided in the sched domain hierarchy.
However, it's not guaranteed that the sched domain hierarchy will give
us a sorted list of CPU groups based on their maximum capacities when
dealing with systems with more than 2 capacity groups.
For example, if we consider a system with three groups of CPUs (LITTLEs,
mediums, bigs), the sched domain hierarchy might provide the following
scheduling groups ordering for a prefer_idle-boosted task:
big CPUs -> LITTLE CPUs -> medium CPUs.
If the big CPUs are not idle, but there are a few LITTLEs and mediums
as idle CPUs, then by returning the first idle CPU we would be
incorrectly preferring a lower capacity CPU over a higher capacity CPU.
In order to eliminate this reliance on assuming sched groups are ordered
by capacity, let's:
1. Iterate through all candidate CPUs for all cases.
2. Minimise or maximise the capacity of the considered CPU, depending
on prefer_idle and boost information.
Taking each of the four possible cases in turn, we analyse the
implementation and impact of this solution:
1. prefer_idle and boosted
This type of task needs to favour the selection of a reserved idle
CPU, and thus we still start from the biggest CPU in the system, but
we iterate through all CPUs so as to correctly handle the example
above by maximising the capacity of the idle CPU we select. When all
CPUs are active, we already iterate through all CPUs and are able to
maximise spare capacity or minimise utilisation for the considered
target or backup CPU.
2. prefer_idle and !boosted
For these tasks we prefer the selection of a more energy efficient
CPU, and therefore we start from the smallest CPUs in the system, but
we iterate through all the CPUs so as to select the most energy
efficient idle CPU, an implementation which mimics existing
behaviour. When all CPUs are active, we already iterate through all
CPUs and are able to maximise spare capacity or minimise utilisation
for the considered target or backup CPU.
3. !prefer_idle and boosted, and
4. !prefer_idle and !boosted
For these tasks we already iterate through all CPUs and are able to
maximise the energy efficiency of the selected CPU.
Change-Id: I940399e22eff29453cba0e2ec52a03b17eec12ae
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>
The cumulative runnable average is maintained in the cfs_rq along
with the rq, so that when a cfs_rq is throttled/unthrottled, the
contribution of that cfs_rq can be updated at the rq level. Implement
the fixup_cumulative_runnable_avg callback for the fair class to
handle the cfs_rq cumulative runnable average updates when the
runnable tasks' demand changes.
Bug: 139071966
Change-Id: Iccd473677cf491920aa82a6fc7e0a5374e5bb27f
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: GhostMaster69-dev <rathore6375@gmail.com>