commit d33d26036a0274b472299d7dcdaa5fb34329f91b upstream.
rt_mutex_handle_deadlock() is called with rt_mutex::wait_lock held. In the
good case it returns with the lock held and in the deadlock case it emits a
warning and goes into an endless scheduling loop with the lock held, which
triggers the 'scheduling in atomic' warning.
Unlock rt_mutex::wait_lock in the deadlock case before issuing the warning
and dropping into the schedule-forever loop.
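The ordering described above can be modeled in userspace. This is only a
sketch and not the kernel code: struct model_lock, model_handle_deadlock()
and the parked flag are hypothetical stand-ins, and the endless scheduling
loop is replaced by setting a flag so that the example terminates.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for rt_mutex::wait_lock; not a kernel type. */
struct model_lock {
	bool wait_lock_held;
	bool parked;	/* models "task entered the schedule-forever loop" */
};

/*
 * Called with wait_lock held. In the good case return with it still held.
 * In the deadlock case drop it first, then warn, then park.
 */
static void model_handle_deadlock(struct model_lock *lock, int res,
				  bool detect_deadlock)
{
	if (res != -EDEADLK || detect_deadlock)
		return;

	lock->wait_lock_held = false;	/* unlock before warning and parking */
	fprintf(stderr, "rtmutex deadlock detected\n");
	lock->parked = true;	/* the kernel would loop in schedule() here */
}
```

In the model, a caller that gets control back still holds the lock, and a
parked task no longer does, which is the property the fix is after.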
[ tglx: Moved unlock before the WARN(), removed the pointless comment,
massaged changelog, added Fixes tag ]
Fixes: 3d5c9340d1 ("rtmutex: Handle deadlock detection smarter")
Signed-off-by: Roland Xu <mu001999@outlook.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/ME0P300MB063599BEF0743B8FA339C2CECC802@ME0P300MB0635.AUSP300.PROD.OUTLOOK.COM
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 432efdbe7da5ecfcbc0c2180cfdbab1441752a38)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
[ Upstream commit bccdd808902f8c677317cec47c306e42b93b849e ]
In some cases running with the test-ww_mutex code, I was seeing
odd behavior where sometimes it seemed flush_workqueue was
returning before all the work threads were finished.
Often this would cause strange crashes as the mutexes would be
freed while they were being used.
Looking at the code, there is a lifetime problem as the
controlling thread that spawns the work allocates the
"struct stress" structures that are passed to the workqueue
threads. Then when the workqueue threads are finished,
they free the stress struct that was passed to them.
Unfortunately the workqueue work_struct node is in the stress
struct. Which means the work_struct is freed before the work
thread returns and while flush_workqueue is waiting.
It seems like a better idea to have the controlling thread
both allocate and free the stress structures, so that we can
be sure we don't corrupt the workqueue by freeing the structure
prematurely.
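The ownership rule can be sketched in plain C. This is a single-threaded
model under stated assumptions, not the test-ww_mutex code: struct
stress_model, stress_worker() and run_stress() are hypothetical names, and
the workers are called directly where the real test would queue work and
then flush the workqueue.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the test's per-worker state. */
struct stress_model {
	int work_done;	/* models the embedded work item plus its result */
};

/* Worker: uses the structure it was handed but never frees it. */
static void stress_worker(struct stress_model *s)
{
	s->work_done = 1;
}

/*
 * Controller: allocates all structures, runs the workers, and frees the
 * array only after every worker has returned. Returns the number of
 * workers that completed, or -1 on allocation failure.
 */
static int run_stress(int nthreads)
{
	struct stress_model *all = calloc((size_t)nthreads, sizeof(*all));
	int i, done = 0;

	if (!all)
		return -1;
	for (i = 0; i < nthreads; i++)
		stress_worker(&all[i]);	/* the real test queues work here */
	/* the real test would wait for the workqueue to drain here */
	for (i = 0; i < nthreads; i++)
		done += all[i].work_done;
	free(all);	/* single owner frees after all workers are finished */
	return done;
}
```

Because only the controller frees, no worker can release memory that the
queueing machinery still references.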
So this patch reworks the test to do so, and with this change
I no longer see the early flush_workqueue returns.
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230922043616.19282-3-jstultz@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Disable "held locks too much" AEE dump.
This type of AEE dump is not common and may not be a real problem.
MTK-Commit-Id: 91037109341b886cd3ff9a84d3445b2fc34a1cc8
Change-Id: I5a1d95a7c676fbddfe8e5d1efeff40aa0bf1a351
Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com>
CR-Id: ALPS05382757
Feature: [Module]Lockdep
(cherry picked from commit 4587e78d4410fd1889bfc2ac87a10c3c50ac73ca)
[ Upstream commit a7ef9b28aa8d72a1656fa6f0a01bbd1493886317 ]
Though the number of lock-acquisitions is tracked as unsigned long, this
is passed as the divisor to div_s64() which interprets it as an s32,
giving nonsense values with more than 2 billion acquisitions. E.g.
  acquisitions   holdtime-min   holdtime-max   holdtime-total   holdtime-avg
  -------------------------------------------------------------------------
    2350439395           0.07         353.38     649647067.36          0.-32
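The narrowing can be reproduced in userspace. This is a sketch only:
model_div_s64(), avg_narrowed() and avg_full() are hypothetical helpers,
not kernel functions, and the out-of-range conversion to int32_t is
implementation-defined in C (common compilers wrap it, which is what the
example assumes).

```c
#include <stdint.h>

/* Models a divide helper that accepts only a signed 32-bit divisor. */
static int64_t model_div_s64(int64_t dividend, int32_t divisor)
{
	return dividend / divisor;
}

/* Buggy average: the 64-bit count is narrowed to 32 bits at the call. */
static int64_t avg_narrowed(int64_t total, uint64_t nr)
{
	return model_div_s64(total, (int32_t)nr);
}

/* Average with a full 64-bit divisor; returns 0 for an empty count. */
static int64_t avg_full(int64_t total, uint64_t nr)
{
	return nr ? (int64_t)((uint64_t)total / nr) : 0;
}
```

With the count from the table above (2350439395) the narrowed divisor
turns negative, so the average of a positive total comes out negative,
while the 64-bit division gives the expected value.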
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200725185110.11588-1-chris@chris-wilson.co.uk
Signed-off-by: Sasha Levin <sashal@kernel.org>
It's a bad idea to do anything while the IRQ events do not match
the CPU IRQ status. Let's fix it.
MTK-Commit-Id: b2a5405af7ff0e3e364854ded34b96608365182d
Change-Id: Ib56becc7abe66dad4de4d071964182b7caa9489d
Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com>
CR-Id: ALPS05224425
Feature: [Module]Lockdep
Improve the lock monitor to automatically differentiate between the
lock owner and other tasks, and extend the test cases to cover this
modification.
MTK-Commit-Id: ecce08578297309430203934b995c73d50f5a796
Change-Id: I5de23c5f5a0c204ebdb0cd06886c44783b477da1
CR-Id: ALPS05139516
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
The new method of dumping the task stack is not stable,
so revert the function to the old method.
MTK-Commit-Id: 549d7b6565122f069474a20b28f988267de31d56
Change-Id: I74d615e69d0ef0c64ace3d294faa71d869edd6d3
CR-Id: ALPS05100325
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
commit 80c503e0e68fbe271680ab48f0fe29bc034b01b7 upstream.
The __torture_print_stats() function in locktorture.c carefully
initializes local variable "min" to statp[0].n_lock_acquired, but
then compares it to statp[i].n_lock_fail. Given that the .n_lock_fail
field should normally be zero, and given the initialization, it seems
reasonable to display the maximum and minimum number of acquisitions
instead of miscomputing the maximum and minimum number of failures.
This commit therefore switches from failures to acquisitions.
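The corrected computation can be sketched as follows. The structure and
function names (struct lock_stat_model, summarize_stats()) are
hypothetical and only mirror the two fields named above; this is not the
locktorture source.

```c
#include <stdbool.h>

/* Hypothetical subset of the per-thread statistics. */
struct lock_stat_model {
	long n_lock_acquired;
	long n_lock_fail;
};

/*
 * Sum the acquisitions over n entries (n >= 1) and report the minimum
 * and maximum acquisition counts. Returns true if any failure was seen.
 */
static bool summarize_stats(const struct lock_stat_model *statp, int n,
			    long *sum, long *min, long *max)
{
	bool fail = false;
	int i;

	*sum = 0;
	*min = statp[0].n_lock_acquired;
	*max = 0;
	for (i = 0; i < n; i++) {
		if (statp[i].n_lock_fail)
			fail = true;
		*sum += statp[i].n_lock_acquired;
		/* compare against the same field that initialized min */
		if (*max < statp[i].n_lock_acquired)
			*max = statp[i].n_lock_acquired;
		if (*min > statp[i].n_lock_acquired)
			*min = statp[i].n_lock_acquired;
	}
	return fail;
}
```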
And this turns out to be not only a day-zero bug, but entirely my
own fault. I hate it when that happens!
Fixes: 0af3fe1efa ("locktorture: Add a lock-torture kernel module")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 25016bd7f4caf5fc983bbab7403d08e64cba3004 ]
Qian Cai reported a bug when PROVE_RCU_LIST=y, where a read of /proc/lockdep
triggered a warning:
[ ] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
...
[ ] Call Trace:
[ ] lock_is_held_type+0x5d/0x150
[ ] ? rcu_lockdep_current_cpu_online+0x64/0x80
[ ] rcu_read_lock_any_held+0xac/0x100
[ ] ? rcu_read_lock_held+0xc0/0xc0
[ ] ? __slab_free+0x421/0x540
[ ] ? kasan_kmalloc+0x9/0x10
[ ] ? __kmalloc_node+0x1d7/0x320
[ ] ? kvmalloc_node+0x6f/0x80
[ ] __bfs+0x28a/0x3c0
[ ] ? class_equal+0x30/0x30
[ ] lockdep_count_forward_deps+0x11a/0x1a0
The warning got triggered because lockdep_count_forward_deps() calls
__bfs() without current->lockdep_recursion being set, as a result
a lockdep internal function (__bfs()) is checked by lockdep, which is
unexpected, and the inconsistency between the irq-off state and the
state traced by lockdep caused the warning.
Apart from this warning, lockdep internal functions like __bfs() should
always be protected by current->lockdep_recursion to avoid potential
deadlocks and data inconsistency, therefore add the
current->lockdep_recursion on-and-off section to protect __bfs() in both
lockdep_count_forward_deps() and lockdep_count_backward_deps().
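The on-and-off section can be modeled with a plain recursion marker. This
is a userspace sketch, not the lockdep code: model_recursion stands in for
current->lockdep_recursion, and model_lock_check(), model_internal_walk()
and model_count_deps() are hypothetical names.

```c
/* Hypothetical per-task marker; models current->lockdep_recursion. */
static int model_recursion;
static int checks_run;	/* how often the checker actually ran */

/* Models a validator hook: it must not run while the validator works. */
static void model_lock_check(void)
{
	if (model_recursion)
		return;
	checks_run++;
}

/* Models an internal walk such as __bfs() that reaches checked paths. */
static int model_internal_walk(void)
{
	model_lock_check();
	return 42;	/* arbitrary result for the sketch */
}

/* Guarded caller: set the marker on entry and clear it on exit. */
static int model_count_deps(void)
{
	int ret;

	model_recursion = 1;
	ret = model_internal_walk();
	model_recursion = 0;
	return ret;
}
```

An unguarded walk is seen by the checker, while a walk inside the guarded
section is skipped, so the validator never inspects its own internals.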
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200312151258.128036-1-boqun.feng@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Calling trace_hardirqs_off_time() and trace_hardirqs_on_time()
before trace_hardirqs_off_caller() and trace_hardirqs_on_caller()
generates a warning when the current->hardirqs_enabled flag is
tested on the arm architecture.
Move each xxx_time() call to the line after its xxx_caller() call.
MTK-Commit-Id: 71051fa7b93e414e1f1181690bb425e8a48b283a
Change-Id: I156096a95d3b6f9a19660c779bc44c55708e1b74
CR-Id: ALPS05095852
Feature: [Module]Lockdep
Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com>
Spinlock debugging is tied to CONFIG_DEBUG_SPINLOCK.
The lock monitor is tied to CONFIG_LOCKDEP.
CONFIG_DEBUG_LOCK_ALLOC is the basic option to enable
CONFIG_LOCKDEP (Lock monitor). The other choices are
CONFIG_LOCK_STAT and CONFIG_PROVE_LOCKING.
MTK-Commit-Id: 58b4cd9af33bdfd797320eb147d8bb7b726667ee
Change-Id: Ie50a28a379b4174e75c018a7d0481255c7d30231
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
CR-Id: ALPS04994668
Feature: [Module]Lockdep
1. set a limit to prevent excessive logging
2. add an aee dump when too many locks are held
3. ignore performance warning on KASAN/UBSAN load
4. increase size of lockdep chain buffer
5. refine function to dump task stack
6. refine mt_aee_dump_held_locks
MTK-Commit-Id: 5d2ddb79008c3919483973c2e5459e1405f70a30
Change-Id: I8c5d6c467873cdb9be896089cd2d8100b16f11c1
CR-Id: ALPS04994668
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Promote a normal task to a turbo task while a turbo task is blocked.
1) Speed up binder calls by passing the turbo ability on to the binder thread
2) Speed up the semaphore owner if a turbo task is waiting for it
3) A turbo task can preempt others in the futex/semaphore wait queue
Related trace events:
a. echo 1 > /d/tracing/events/task_turbo/turbo_set/enable
- Tracks tasks being set to turbo
b. echo 1 > /d/tracing/events/task_turbo/turbo_inherit_start/enable
echo 1 > /d/tracing/events/task_turbo/turbo_inherit_end/enable
- Tracks tasks starting or ending turbo inheritance
MTK-Commit-Id: 73406f24e745b0a4c26b5d8c06e9ef6616ada80d
Change-Id: Idb3da6b5f201fa5ee60443d00ac49d7afa93a2c6
Signed-off-by: JianMin Liu <jian-min.liu@mediatek.com>
CR-Id: ALPS04791510
Feature: System Performance
refactor schedule monitor functions into four domains
- irq processing time tracer (CONFIG_MTK_SCHED_MONITOR)
- irq count status tracer (CONFIG_MTK_IRQ_COUNT_TRACER)
- irq off tracer (CONFIG_MTK_IRQ_OFF_TRACER)
- preempt off tracer (CONFIG_MTK_PREEMPT_TRACER)
The major changes
- support debug in user load
- support preempt off detection
- support default setting in device tree
- refactor /proc/mtmon architecture
- refactor debug information for aee dump
- move mtk_sched_mon.h to drivers/misc/mediatek/include/mt-plat
MTK-Commit-Id: 16ecf23a9a75a442c612422b5e7648fd5fa04394
Change-Id: Id506d0c76efc99c78ef854c3b54d83650c060a22
CR-Id: ALPS04993096
Feature: [Module]Schedule Monitor
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
sdi = per_cpu_ptr(&sp_dbg, owner_cpu);
sdi->detector_cpu = -1;
If the lock has already been released, owner_cpu is -1, which
results in a kernel exception when the sdi pointer is accessed.
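A guarded lookup along these lines avoids the bad access. This is a
minimal userspace sketch: MODEL_NR_CPUS, struct sp_dbg_model,
sp_dbg_table, sp_dbg_for() and clear_detector() are hypothetical, and a
plain array stands in for the per-CPU data.

```c
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* assumed CPU count for the sketch */

/* Hypothetical per-CPU debug record; models the sp_dbg data. */
struct sp_dbg_model {
	int detector_cpu;
};

static struct sp_dbg_model sp_dbg_table[MODEL_NR_CPUS];

/*
 * Return the record for owner_cpu, or NULL when the index is not a
 * valid CPU (for example -1 after the lock was released).
 */
static struct sp_dbg_model *sp_dbg_for(int owner_cpu)
{
	if (owner_cpu < 0 || owner_cpu >= MODEL_NR_CPUS)
		return NULL;
	return &sp_dbg_table[owner_cpu];
}

/* Reset the detector only when a valid owner record exists. */
static int clear_detector(int owner_cpu)
{
	struct sp_dbg_model *sdi = sp_dbg_for(owner_cpu);

	if (!sdi)
		return -1;
	sdi->detector_cpu = -1;
	return 0;
}
```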
MTK-Commit-Id: 58a071c37566f21de74559ec28a94b6e32c011ca
Change-Id: Ic126afcbbfa267ecaffb61b7a5a93717336cd03e
CR-Id: ALPS04810076
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Lockdep should not use printk when port_lock_key is held by the
current task. Otherwise it will result in a deadlock. Lockdep
will print the warning message to a trace event when port_lock_key
is already held by the current task.
MTK-Commit-Id: af6dd0b10758cb9b0cb30992bd7fb7ba78b79306
Change-Id: I5df76b92df0aa525019deb2fd07eb1eda634f236
CR-Id: ALPS04715249
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
The old spinlock debugger might cause an rq->lock lockup in the
schedule flow (this only occurred on mt6771 Android Q), so
introduce spinlock debugger v2 to avoid this problem.
The new version does not show warnings while a task is spinning
for a spinlock. It only shows the holding time, spinning time,
and stack trace when a lock is released by the owner and the
threshold was exceeded. It does not show the magic number or the
raw_lock value.
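The report-on-release idea can be sketched as follows. The names
(struct lock_timing_model, model_lock_acquired(), model_lock_released())
and the one-second HOLD_THRESHOLD_NS are assumptions for illustration and
are not taken from the debugger; timestamps are passed in by the caller
so the sketch has no clock dependency.

```c
#include <stdbool.h>
#include <stdint.h>

#define HOLD_THRESHOLD_NS 1000000000ULL	/* assumed threshold: 1 second */

/* Hypothetical per-lock timing record. */
struct lock_timing_model {
	uint64_t acquired_ns;
};

/* Record when the owner took the lock; nothing is reported here. */
static void model_lock_acquired(struct lock_timing_model *t, uint64_t now_ns)
{
	t->acquired_ns = now_ns;
}

/*
 * On release, compute the holding time and tell the caller whether it
 * is long enough to be reported. Spinning tasks never report.
 */
static bool model_lock_released(const struct lock_timing_model *t,
				uint64_t now_ns, uint64_t *held_ns)
{
	*held_ns = now_ns - t->acquired_ns;
	return *held_ns > HOLD_THRESHOLD_NS;
}
```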
MTK-Commit-Id: 1ef4259a053b21af07a6d33d66d629994d90c79e
Change-Id: Ib89b0b9456daa3188c8b8f05a5697849ef5db3ce
CR-Id: ALPS04677019
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
This is a temporary patch to disable the spinlock debugger
warning on spin time. The root cause of the spin time warning
on rq->lock is still under investigation.
MTK-Commit-Id: 1fa36bd2112a6ee68ebf1d0e95054a0d5c42857d
Change-Id: I375b27c2c46883eb0760c5dff532f9c6d8b7e87c
CR-Id: ALPS04670617
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Reduce warning messages of spinlock debugger and lock monitor.
Only trigger aee dump once when more than one task is spinning
for the same spinlock. Remove backtrace of spinning tasks.
MTK-Commit-Id: 732bcd60d3237229bbc614c53f511284dfbe83ac
Change-Id: I25769faede5f49203e5232dd441e1123904dfa3f
CR-Id: ALPS04659845
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
A spinlock can be held by one task while other tasks are
waiting for it. If each task waiting for the spinlock triggers
an aee dump, the system becomes very busy and many similar
warning messages are shown in the kernel log. This patch makes
sure that the aee dump is triggered only once under some conditions.
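One common way to get "only the first waiter reports" is an atomic flag.
The sketch below uses C11 atomics and hypothetical names (should_dump(),
rearm_dump()); it shows the general technique and is not the method used
in this patch, whose exact conditions are not described here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Return true for exactly one caller until the flag is re-armed.
 * atomic_exchange() returns the previous value, so only the first
 * caller observes false.
 */
static bool should_dump(atomic_bool *triggered)
{
	return !atomic_exchange(triggered, true);
}

/* Assumed re-arm point, for example when the lock is finally released. */
static void rearm_dump(atomic_bool *triggered)
{
	atomic_store(triggered, false);
}
```

Every spinning task calls should_dump() on the flag that belongs to the
lock, and only the task that gets true goes on to trigger the dump.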
MTK-Commit-Id: 5d2505a40d384124ffd53f20b4ff3e3da5e62a27
Change-Id: I0bc3cb6416767758292c6ce63138fed15f89ab69
CR-Id: ALPS04657041
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Locks can be acquired and released at any time. If the
lock->owner information is saved without using an atomic
operation, it could be invalid at that moment and result in a
kernel exception when the information is read.
Check the lock->owner information after saving it to a local
variable, to see whether the information is valid or not.
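The snapshot-then-check pattern can be sketched in userspace. C11
atomic_load() stands in for a single read of lock->owner, and struct
task_model, struct lock_model and owner_pid() are hypothetical names. The
sketch does not address keeping the owner task alive after the check,
which a real implementation would also have to consider.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the task and lock structures. */
struct task_model {
	int pid;
};

struct lock_model {
	_Atomic(struct task_model *) owner;	/* NULL when unlocked */
};

/* Return the owner's pid, or -1 when the lock has no owner. */
static int owner_pid(struct lock_model *lock)
{
	/* read the shared field exactly once into a local variable */
	struct task_model *owner = atomic_load(&lock->owner);

	if (!owner)
		return -1;
	/* use only the local copy; never re-read lock->owner here */
	return owner->pid;
}
```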
MTK-Commit-Id: a314923a1dd34a4da86c1bfc4b9f03065929dbf2
Change-Id: I7150ab58bce45d68505e286e5179a04dc0c26638
CR-Id: ALPS04503017
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Previously the aee dump was triggered in the call flow of the
spinning task. Now it is triggered in the call flow of the lock
owner, which helps to find the correct owner directly.
MTK-Commit-Id: ab41c7d331a3f19fd99ec4f4a81721b15f98e682
Change-Id: I8190ffc7e434774409bb3ca0597a367452b396f3
CR-Id: ALPS04644130
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Trigger aee dump on lockdep warnings, then we can be aware of them.
MTK-Commit-Id: 0db5f4acc192b82a7cd15cb15ddf0c9483ab2bea
Change-Id: I3a58b83a61cee9e53874b3a48eeb340721a29d9e
CR-Id: ALPS04532107
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>
Lockdep will trigger an aee dump when a warning happens. The aee
dump works based on workqueue, and works_lock is needed by workqueue.
So it should not trigger an aee dump when works_lock is already
held by the current task. Otherwise it will result in a deadlock.
Add works_lock to the checklist and check the list before triggering
an aee dump.
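The checklist lookup can be sketched as below. Matching locks by name,
and the names unsafe_locks, is_unsafe_lock() and may_trigger_dump(), are
assumptions for illustration only; the actual checklist may identify
locks differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checklist of locks that make a dump unsafe. */
static const char *const unsafe_locks[] = { "works_lock" };

/* True if the named lock is on the checklist. */
static bool is_unsafe_lock(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(unsafe_locks) / sizeof(unsafe_locks[0]); i++)
		if (strcmp(name, unsafe_locks[i]) == 0)
			return true;
	return false;
}

/*
 * held[] lists the names of the locks held by the current task.
 * The dump is allowed only if none of them is on the checklist.
 */
static bool may_trigger_dump(const char *const *held, size_t nheld)
{
	size_t i;

	for (i = 0; i < nheld; i++)
		if (is_unsafe_lock(held[i]))
			return false;
	return true;
}
```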
MTK-Commit-Id: f948d020881f268a18354c1676c80d5647fbfee3
Change-Id: I0ccf127ca1a8bb928495df563459eddee97daf5e
CR-Id: ALPS04425783
Feature: [Module]Lockdep
Signed-off-by: Kobe Wu <kobe-cp.wu@mediatek.com>