kernel_google_wahoo/include/linux
Jiri Olsa 7b692b9696 kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
[ Upstream commit 9b38cc704e844e41d9cf74e647bff1d249512cb3 ]

Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
My test was also able to trigger lockdep output:

 ============================================
 WARNING: possible recursive locking detected
 5.6.0-rc6+ #6 Not tainted
 --------------------------------------------
 sched-messaging/2767 is trying to acquire lock:
 ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0

 but task is already holding lock:
 ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&(kretprobe_table_locks[i].lock));
   lock(&(kretprobe_table_locks[i].lock));

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 1 lock held by sched-messaging/2767:
  #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

 stack backtrace:
 CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
 Call Trace:
  dump_stack+0x96/0xe0
  __lock_acquire.cold.57+0x173/0x2b7
  ? native_queued_spin_lock_slowpath+0x42b/0x9e0
  ? lockdep_hardirqs_on+0x590/0x590
  ? __lock_acquire+0xf63/0x4030
  lock_acquire+0x15a/0x3d0
  ? kretprobe_hash_lock+0x52/0xa0
  _raw_spin_lock_irqsave+0x36/0x70
  ? kretprobe_hash_lock+0x52/0xa0
  kretprobe_hash_lock+0x52/0xa0
  trampoline_handler+0xf8/0x940
  ? kprobe_fault_handler+0x380/0x380
  ? find_held_lock+0x3a/0x1c0
  kretprobe_trampoline+0x25/0x50
  ? lock_acquired+0x392/0xbc0
  ? _raw_spin_lock_irqsave+0x50/0x70
  ? __get_valid_kprobe+0x1f0/0x1f0
  ? _raw_spin_unlock_irqrestore+0x3b/0x40
  ? finish_task_switch+0x4b9/0x6d0
  ? __switch_to_asm+0x34/0x70
  ? __switch_to_asm+0x40/0x70

The code within the kretprobe handler checks for probe reentrancy,
so we won't trigger any _raw_spin_lock_irqsave probe in there.

The problem is outside the handler, in kprobe_flush_task, where we call:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave

where _raw_spin_lock_irqsave triggers the kretprobe and installs the
kretprobe_trampoline handler on the _raw_spin_lock_irqsave return.

The kretprobe_trampoline handler is then executed with
kretprobe_table_locks already locked, and the first thing it does is
lock kretprobe_table_locks ;-) The whole lockup path looks like:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed

        ---> kretprobe_table_locks locked

        kretprobe_trampoline
          trampoline_handler
            kretprobe_hash_lock(current, &head, &flags);  <--- deadlock

Add kprobe_busy_begin/end helpers that mark a code section as having a
fake probe installed, which prevents another kprobe from triggering
within that section.

Use these helpers in kprobe_flush_task, so that the probe recursion
protection check is hit and the probe is never set, which prevents the
lockup above.
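
The idea can be illustrated with a small userspace model. This is only a
sketch and not the kernel code: every model_* name is hypothetical, a
single global pointer stands in for the per-CPU "currently running probe"
slot, and a flag plus a counter stand in for the real spinlock and the
resulting hang. It assumes the recursion check simply refuses to arm a
probe while that slot is occupied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a probe; not the kernel struct. */
struct model_kprobe {
	const char *name;
	unsigned long nmissed;	/* hits skipped by the recursion check */
};

/* Stands in for the per-CPU "currently running probe" slot. */
static struct model_kprobe *model_current_kprobe;
/* Fake probe used only to mark a section as busy. */
static struct model_kprobe model_busy_marker = { "busy-marker", 0 };

static bool model_table_locked;	/* stands in for kretprobe_table_locks */
static int model_deadlocks;	/* times the lock was taken while held */

/* Mark the following code as "a probe is already running". */
static void model_busy_begin(void)
{
	model_current_kprobe = &model_busy_marker;
}

/* Clear the busy mark set by model_busy_begin(). */
static void model_busy_end(void)
{
	assert(model_current_kprobe == &model_busy_marker);
	model_current_kprobe = NULL;
}

/* Entry to a probed function: returns true if the return probe is armed. */
static bool model_probe_hit(struct model_kprobe *p)
{
	if (model_current_kprobe) {
		p->nmissed++;	/* recursion check: skip this hit */
		return false;
	}
	return true;
}

/* The trampoline handler takes the table lock as its first step. */
static void model_trampoline(void)
{
	if (model_table_locked)
		model_deadlocks++;	/* the real kernel would spin forever here */
}

/*
 * Models the flush path taking the table lock through a probed lock
 * function. Returns the number of deadlocks this call would have hit.
 */
static int model_flush_task(struct model_kprobe *lock_probe, bool use_guard)
{
	int before = model_deadlocks;
	bool armed;

	if (use_guard)
		model_busy_begin();

	armed = model_probe_hit(lock_probe);	/* entering the lock function */
	model_table_locked = true;		/* lock is now held */
	if (armed)
		model_trampoline();		/* runs on return, lock still held */
	model_table_locked = false;

	if (use_guard)
		model_busy_end();

	return model_deadlocks - before;
}
```

In this model, calling model_flush_task() with use_guard set to false
reports one deadlock, because the trampoline runs while the table lock is
held. With use_guard set to true the hit is only counted in nmissed and
the trampoline never runs.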

Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2

Fixes: ef53d9c5e4 ("kprobes: improve kretprobe scalability with hashed locking")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Reported-by: "Ziqian SUN (Zamir)" <zsun@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-29 20:07:56 -04:00