Most of binder's memory allocations are tiny, and they're allocated
and freed extremely frequently. The latency from going through the page
allocator all the time for such small allocations ends up being quite
high, especially when the system is low on memory. Binder is
performance-critical, so this is suboptimal.
Instead of using kzalloc to allocate a struct every time, reserve caches
specifically for allocating each struct quickly.
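As a rough illustration of why a dedicated cache helps, here is a minimal
userspace sketch of a fixed-size object cache that recycles freed objects on a
free list instead of returning them to the general allocator. It is only a
model: the names obj_cache, cache_init, cache_zalloc and cache_free are
hypothetical, and in the kernel this role is played by the slab allocator's
kmem_cache interface rather than by code like this.

```c
#include <stdlib.h>
#include <string.h>

/* Userspace model of a per-type object cache; all names are hypothetical. */
struct obj_cache {
	size_t obj_size;	/* size of every object served by this cache */
	void *free_list;	/* singly linked list of recycled objects */
};

static void cache_init(struct obj_cache *c, size_t obj_size)
{
	/* A free object stores the "next" pointer in its first bytes. */
	c->obj_size = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
	c->free_list = NULL;
}

/* Return a zeroed object, reusing a recycled one when available. */
static void *cache_zalloc(struct obj_cache *c)
{
	void *obj = c->free_list;

	if (obj)
		memcpy(&c->free_list, obj, sizeof(void *));	/* pop */
	else
		obj = malloc(c->obj_size);
	if (obj)
		memset(obj, 0, c->obj_size);
	return obj;
}

/* Put an object back on the free list instead of releasing it. */
static void cache_free(struct obj_cache *c, void *obj)
{
	memcpy(obj, &c->free_list, sizeof(void *));	/* push */
	c->free_list = obj;
}
```

An allocate/free/allocate sequence on such a cache hands back the same memory
without touching the general allocator, which is the fast path the change is
after.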
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
[ Tashar02: Fix compilation on k5.10 binder ]
Co-authored-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
When calling binder_do_set_priority() with the same policy and priority
values as the current task, we exit early since there is nothing to do.
However, the BINDER_PRIO_PENDING state might be set and in this case we
fail to update it. A subsequent call to binder_transaction_priority()
will then read an incorrect state and save the wrong priority. Fix this
by setting thread->prio_state to BINDER_PRIO_SET on our way out.
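The shape of the fix can be sketched with a simplified userspace model. The
struct layout and the function do_set_priority() below are stand-ins for
illustration, not the driver's actual definitions; only the state names
BINDER_PRIO_SET and BINDER_PRIO_PENDING come from the description above.

```c
#include <stdbool.h>

/* Simplified stand-ins for the driver's types (assumed layout). */
enum binder_prio_state { BINDER_PRIO_SET, BINDER_PRIO_PENDING };

struct binder_priority {
	int sched_policy;
	int prio;
};

struct binder_thread_model {
	struct binder_priority cur;	/* what the task currently runs at */
	enum binder_prio_state prio_state;
};

/* Returns true if the scheduling settings actually had to change. */
static bool do_set_priority(struct binder_thread_model *thread,
			    const struct binder_priority *desired)
{
	bool changed = false;

	if (thread->cur.sched_policy != desired->sched_policy ||
	    thread->cur.prio != desired->prio) {
		thread->cur = *desired;
		changed = true;
	}
	/* Clear a pending restore on every exit path, including the no-op one. */
	thread->prio_state = BINDER_PRIO_SET;
	return changed;
}
```

With the state update placed on the common exit path, a call that changes
nothing still leaves prio_state at BINDER_PRIO_SET.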
Bug: 199309216
Fixes: cac827f2619b ("ANDROID: binder: fix race in priority restore")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: I21e906cf4b2ebee908af41fe101ecd458ae1991c
(cherry picked from commit 72193be6d4bd9ad29dacd998c14dff97f7a6c6c9)
When the target process is busy, incoming oneway transactions are
queued in the async_todo list. If the clients continue sending extra
oneway transactions while the target process is frozen, this queue can
become too large to accommodate new transactions. That's why the binder
driver introduced ONEWAY_SPAM_DETECTION to detect this situation. It's
helpful for debugging the async binder buffer exhaustion issue, but the
issue itself isn't solved directly.
In real-world cases, applications are designed to send oneway transactions
repeatedly, delivering updated information to the target process.
Typical examples are Wi-Fi signal strength and some real-time sensor
data. Even if the apps only care about the latest information,
all outdated oneway transactions still accumulate until the
frozen process is thawed later. For this kind of situation, there's
no existing method to skip those outdated transactions and deliver the
latest one only.
This patch introduces a new transaction flag TF_UPDATE_TXN. To use it,
apps can set this new flag along with TF_ONE_WAY. When such a
oneway transaction is to be queued into the async_todo list of a frozen
process, the binder driver will check if any previous pending transaction
can be superseded by comparing their code, flags and target node. If
such an outdated pending transaction is found, the latest transaction
will supersede that outdated one. This effectively prevents the async
binder buffer from running out and saves unnecessary binder read workloads.
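The matching rule can be sketched as a small userspace model. The struct txn,
can_supersede() and queue_async() below are hypothetical, the flag values are
illustrative, and the array stands in for the async_todo list. Replacing the
old entry in place is a simplification made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values are illustrative for this model. */
#define TF_ONE_WAY	0x01u
#define TF_UPDATE_TXN	0x40u

/* Hypothetical, trimmed-down view of a pending transaction. */
struct txn {
	unsigned int code;
	unsigned int flags;
	const void *target_node;
	int payload;
};

/* A newer transaction supersedes an older one with the same code,
 * flags and target node, provided it is a oneway update transaction. */
static bool can_supersede(const struct txn *newer, const struct txn *older)
{
	const unsigned int need = TF_ONE_WAY | TF_UPDATE_TXN;

	if ((newer->flags & need) != need)
		return false;
	return older->code == newer->code &&
	       older->flags == newer->flags &&
	       older->target_node == newer->target_node;
}

/* Queue t for a frozen target; returns the new queue length. */
static size_t queue_async(struct txn *todo, size_t len, size_t cap,
			  const struct txn *t)
{
	for (size_t i = 0; i < len; i++) {
		if (can_supersede(t, &todo[i])) {
			todo[i] = *t;	/* drop the outdated entry */
			return len;
		}
	}
	if (len < cap)
		todo[len++] = *t;
	return len;
}
```

Two updates with identical code, flags and target node therefore occupy a
single slot, while a transaction without TF_UPDATE_TXN is queued as usual.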
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Li Li <dualli@google.com>
Link: https://lore.kernel.org/r/20220526220018.3334775-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 231624308
Test: manually check async binder buffer size of frozen apps
Test: stress test with kernel 4.14/4.19/5.10/5.15
(cherry picked from commit 9864bb4801331daa48514face9d0f4861e4d485b
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
char-misc-next)
Change-Id: I1c4bff1eda1ca15aaaad5bf696c8fc00be743176
During a reply, the target gets woken up and then the priority of the
replier is restored. The order is such to allow the target to process
the reply ASAP. Otherwise, we risk the sender getting scheduled out
before the wakeup happens. This strategy reduces transaction latency.
However, a subsequent transaction from the same target could be started
before the priority of the replier gets restored. At this point we save
the wrong priority and it gets reinstated at the end of the transaction.
This patch allows the incoming transaction to detect the race condition
and save the correct next priority. Additionally, the replier will abort
its pending priority restore which allows the new transaction to always
run at the desired priority.
Bug: 148101660
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: I6fec41ae1a1342023f78212ab1f984e26f068221
(cherry picked from commit cac827f2619b280d418e546a09f25da600dafe5a)
[cmllamas: fixed trivial merge conflict]
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Refactor binder priority functions to take in 'struct binder_thread *'
instead of just 'struct task_struct *'. This allows access to other
thread fields used in subsequent patches. In any case, the same task
reference is still available under thread->task.
There is no functional impact from this patch.
Bug: 148101660
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: I67b599884580d957d776500e467827e5035c99f6
(cherry picked from commit 759d98484b5b51932d3d11651fa83c6bb268ce03)
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Avoid making unnecessary stack copies of struct binder_priority and pass
the argument by reference instead. Rename 'desired_prio' to 'desired' to
match the usage in other priority functions.
There is no functional impact from this patch.
Bug: 148101660
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: I66ff5305296e7b9dba56ed265236f2af518f66e0
(cherry picked from commit 52d85f8a16467ce0bca374f885de24918f017371)
[cmllamas: fixed conflict with vendor hook patch]
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
The setup of node_prio is always the same, so just fold this logic into
binder_transaction_priority() to avoid duplication. Let's pass the node
reference instead, which also gives access to node->inherit_rt.
There is no functional impact from this patch.
Bug: 148101660
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Change-Id: Ib390204556e69c4bc8492cd9cd873773f9cdce42
(cherry picked from commit 498bf715b77c68e54d0289fa66e3f112278f87dc)
[cmllamas: fixed conflict with vendor hook patch]
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
This reverts commit f0416df755b7a52adebea9c4714934a8bf084e89.
Reason for revert: This was a "temporary" reversion to work around what is
believed to be a user-space issue.
Change-Id: I5322aecfe57cd8237e6657525eb33975c4840059
Bug: 166779391
Signed-off-by: Todd Kjos <tkjos@google.com>
(cherry picked from commit d1c6df6dc86a04a3cabc6b5e2fc01198bcf0a29d)
[cmllamas: Resolved merge conflict with vendor hook in binder.c]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
[ Upstream commit fe6b1869243f23a485a106c214bcfdc7aa0ed593 ]
If a memory copy function fails to copy the whole buffer,
a positive integer with the remaining bytes is returned.
In binder_translate_fd_array() this can result in an fd being
skipped due to the failed copy, but the loop continues
processing fds since the early return condition expects a
negative integer on error.
Fix by returning "ret > 0 ? -EINVAL : ret" to handle this case.
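The failure mode and the fix can be reproduced in a small userspace sketch.
copy_partial() is a hypothetical helper that, like copy_from_user(), returns
the number of bytes it could not copy, and translate_fd_array() is a
simplified stand-in for the loop, not the driver function itself.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copy helper: returns the number of bytes NOT copied. */
static long copy_partial(void *dst, const void *src, size_t len,
			 size_t readable)
{
	size_t n = len < readable ? len : readable;

	memcpy(dst, src, n);
	return (long)(len - n);
}

/* Copy count fds; only the first readable bytes of in are accessible. */
static int translate_fd_array(int *out, const int *in, size_t count,
			      size_t readable)
{
	for (size_t i = 0; i < count; i++) {
		size_t off = i * sizeof(int);
		size_t avail = readable > off ? readable - off : 0;
		long ret = copy_partial(&out[i], &in[i], sizeof(int), avail);

		/* A short copy is positive, so map it to a real error. */
		if (ret)
			return ret > 0 ? -EINVAL : (int)ret;
	}
	return 0;
}
```

Without the "ret > 0" mapping, a caller that only tests for negative values
would treat the short copy as success and continue with the next fd.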
Fixes: bb4a2e48d510 ("binder: return errors from buffer copy functions")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-2-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Every binder operation is being logged. This impacts performance and
increases the memory footprint. Since binder is critical for Android
operation, doing any logging on production builds isn't the best idea.
A quick grep over Android sources revealed that only the lshal and dumpsys
binaries use the binder_log directory; these are not critical, so breaking
them won't hurt much. In any case, I was able to successfully run both, and
basic functionality was still there.
Benchmarks showed a significant decrease in transaction time, with an
average of 1000ns.
Also, change all DEBUG ifdefs to new define.
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
When freeing txn buffers, binder_transaction_buffer_release()
attempts to detect whether the current context is the target by
comparing current->group_leader to proc->tsk. This is an unreliable
test. Instead explicitly pass an 'is_failure' boolean.
Detecting the sender was being used as a way to tell if the
transaction failed to be sent. When cleaning up after
failing to send a transaction, there is no need to close
the fds associated with a BINDER_TYPE_FDA object. Now
'is_failure' can be used to accurately detect this case.
Fixes: 44d8047 ("binder: use standard functions to allocate fds")
Cc: stable <stable@vger.kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211015233811.3532235-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
These are initialized properly from the code just below kzalloc().
Replace it with kmalloc().
Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Binder code is very hot, so checking frequently to see if a debug
message should be printed is a waste of cycles. We're not debugging
binder, so just stub out the debug prints to compile them out entirely.
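A minimal sketch of the stubbing technique, assuming a hypothetical build
switch BINDER_MODEL_DEBUG and a hypothetical caller process_one(); the real
macro and mask names in the driver may differ.

```c
#include <stdio.h>

#ifdef BINDER_MODEL_DEBUG
/* Debug build: test the mask at runtime, then print. */
static unsigned int debug_mask = 0x1;
#define binder_debug(mask, ...) \
	do { \
		if (debug_mask & (mask)) \
			fprintf(stderr, __VA_ARGS__); \
	} while (0)
#else
/* Stubbed build: no mask test, no call, arguments never evaluated. */
#define binder_debug(...) do { } while (0)
#endif

/* Hypothetical hot-path function with a debug print in it. */
static int process_one(int *side_effects)
{
	binder_debug(0x1, "processing, call %d\n", ++*side_effects);
	return *side_effects;
}
```

Because the stub expands to an empty statement, even the argument expressions
disappear from the generated code, so the hot path pays nothing for the print.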
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
This is the 4.14.262 stable release
* tag 'v4.14.262' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux: (72 commits)
Linux 4.14.262
mISDN: change function names to avoid conflicts
net: udp: fix alignment problem in udp4_seq_show()
ip6_vti: initialize __ip6_tnl_parm struct in vti6_siocdevprivate
scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown()
ipv6: Do cleanup if attribute validation fails in multipath route
ipv6: Continue processing multipath route even if gateway attribute is invalid
phonet: refcount leak in pep_sock_accep
rndis_host: support Hytera digital radios
power: reset: ltc2952: Fix use of floating point literals
xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate
sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
ipv6: Check attribute length for RTA_GATEWAY when deleting multipath route
ipv6: Check attribute length for RTA_GATEWAY in multipath route
i40e: Fix incorrect netdev's real number of RX/TX queues
i40e: fix use-after-free in i40e_sync_filters_subtask()
mac80211: initialize variable have_higher_than_11mbit
RDMA/core: Don't infoleak GRH fields
ieee802154: atusb: fix uninit value in atusb_set_extended_addr
virtio_pci: Support surprise removal of virtio pci device
...
Changes in 4.14.261
HID: asus: Add depends on USB_HID to HID_ASUS Kconfig option
tee: handle lookup of shm with reference count 0
platform/x86: apple-gmux: use resource_size() with res
recordmcount.pl: fix typo in s390 mcount regex
selinux: initialize proto variable in selinux_ip_postroute_compat()
scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()
net: usb: pegasus: Do not drop long Ethernet frames
NFC: st21nfca: Fix memory leak in device probe and remove
fsl/fman: Fix missing put_device() call in fman_port_probe
nfc: uapi: use kernel size_t to fix user-space builds
uapi: fix linux/nfc.h userspace compilation errors
xhci: Fresco FL1100 controller should not have BROKEN_MSI quirk set.
usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear.
binder: fix async_free_space accounting for empty parcels
scsi: vmw_pvscsi: Set residual data length conditionally
Input: appletouch - initialize work before device registration
Input: spaceball - fix parsing of movement data packets
net: fix use-after-free in tw_timer_handler
sctp: use call_rcu to free endpoint
Linux 4.14.261
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I778bc28ac0835029328e2b503cb8fa241981c610
commit cfd0d84ba28c18b531648c9d4a35ecca89ad9901 upstream.
In 4.13, commit 74310e06be ("android: binder: Move buffer out of area shared with user space")
fixed a kernel structure visibility issue. As part of that patch,
sizeof(void *) was used as the buffer size for 0-length data payloads so
the driver could detect abusive clients sending 0-length asynchronous
transactions to a server by enforcing limits on async_free_size.
Unfortunately, on the "free" side, the accounting of async_free_space
did not add the sizeof(void *) back. The result was that up to 8-bytes of
async_free_space were leaked on every async transaction of 8-bytes or
less. These small transactions are uncommon, so this accounting issue
has gone undetected for several years.
The fix is to use "buffer_size" (the allocated buffer size) instead of
"size" (the logical buffer size) when updating the async_free_space
during the free operation. These are the same except for this
corner case of asynchronous transactions with payloads < 8 bytes.
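The leak can be shown with a worked example in a small userspace model. The
struct and function names are hypothetical, and per-buffer bookkeeping
overhead is left out so that only the sizeof(void *) mismatch is visible.

```c
#include <stddef.h>

/* Hypothetical model of the allocator's async space counter. */
struct alloc_model {
	size_t free_async_space;
};

/* Allocation charges at least sizeof(void *), even for an empty payload.
 * Returns the allocated buffer size. */
static size_t model_alloc(struct alloc_model *a, size_t size)
{
	size_t buffer_size = size < sizeof(void *) ? sizeof(void *) : size;

	a->free_async_space -= buffer_size;
	return buffer_size;
}

/* Old behavior: credit back only the logical size. */
static void model_free_buggy(struct alloc_model *a, size_t size)
{
	a->free_async_space += size;
}

/* Fixed behavior: credit back exactly what was charged. */
static void model_free_fixed(struct alloc_model *a, size_t buffer_size)
{
	a->free_async_space += buffer_size;
}
```

Starting from 1024 bytes of async space, one empty transaction freed through
the old path leaves 1024 - sizeof(void *) bytes, while the fixed path returns
the counter to 1024.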
Fixes: 74310e06be ("android: binder: Move buffer out of area shared with user space")
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable@vger.kernel.org # 4.14+
Link: https://lore.kernel.org/r/20211220190150.2107077-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We don't really need those binder logs.
Change-Id: I8bf50f30139fcbd5d36ff578fe9294b31192007e
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Binder code is very hot, so checking frequently to see if a debug
message should be printed is a waste of cycles. We're not debugging
binder, so just stub out the debug prints to compile them out entirely.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
(cherry picked from commit 5faedf473ca0da0e46ddd1de743cfe380a22cf4e)
[panchajanya1999: Apply it to binder_alloc]
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Change-Id: I531e504850c8cf64b8a6550d6e57dc8e2522dc8d
Changes in 4.14.258
HID: add hid_is_usb() function to make it simpler for USB detection
HID: add USB_HID dependancy to hid-prodikeys
HID: add USB_HID dependancy to hid-chicony
HID: add USB_HID dependancy on some USB HID drivers
HID: wacom: fix problems when device is not a valid USB device
HID: check for valid USB device for many HID drivers
can: sja1000: fix use after free in ems_pcmcia_add_card()
nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
bpf: Fix the off-by-two error in range markings
nfp: Fix memory leak in nfp_cpp_area_cache_add()
seg6: fix the iif in the IPv6 socket control block
IB/hfi1: Correct guard on eager buffer deallocation
mm: bdi: initialize bdi_min_ratio when bdi is unregistered
ALSA: ctl: Fix copy of updated id with element read/write
ALSA: pcm: oss: Fix negative period/buffer sizes
ALSA: pcm: oss: Limit the period size to 16MB
ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*()
tracefs: Have new files inherit the ownership of their parent
can: pch_can: pch_can_rx_normal: fix use after free
can: m_can: Disable and ignore ELO interrupt
libata: add horkage for ASMedia 1092
wait: add wake_up_pollfree()
binder: use wake_up_pollfree()
signalfd: use wake_up_pollfree()
tracefs: Set all files to the same group ownership as the mount option
block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
qede: validate non LSO skb length
net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero
net: altera: set a couple error code in probe()
net: fec: only clear interrupt of handling queue in fec_enet_rx_queue()
net, neigh: clear whole pneigh_entry at alloc time
net/qla3xxx: fix an error code in ql_adapter_up()
USB: gadget: detect too-big endpoint 0 requests
USB: gadget: zero allocate endpoint 0 buffers
usb: core: config: fix validation of wMaxPacketValue entries
xhci: Remove CONFIG_USB_DEFAULT_PERSIST to prevent xHCI from runtime suspending
usb: core: config: using bit mask instead of individual bits
iio: trigger: Fix reference counting
iio: trigger: stm32-timer: fix MODULE_ALIAS
iio: stk3310: Don't return error code in interrupt handler
iio: mma8452: Fix trigger reference couting
iio: ltr501: Don't return error code in trigger handler
iio: kxsd9: Don't return error code in trigger handler
iio: itg3200: Call iio_trigger_notify_done() on error
iio: dln2-adc: Fix lockdep complaint
iio: dln2: Check return value of devm_iio_trigger_register()
iio: adc: axp20x_adc: fix charging current reporting on AXP22x
iio: accel: kxcjk-1013: Fix possible memory leak in probe and remove
irqchip/armada-370-xp: Fix return value of armada_370_xp_msi_alloc()
irqchip/armada-370-xp: Fix support for Multi-MSI interrupts
irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL
irqchip: nvic: Fix offset for Interrupt Priority Offsets
Linux 4.14.258
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iecbe5bcba94e422ef4f43e57c673b15fbc8706f8
commit a880b28a71e39013e357fd3adccd1d8a31bc69a8 upstream.
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll
and aio poll are fortunately not affected by this, but it's very
fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert binder to use wake_up_pollfree().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread exits.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c21a80ca0684ec2910344d72556c816cb8940c01 upstream.
This is a partial revert of commit
29bc22ac5e5b ("binder: use euid from cred instead of using task").
Setting sender_euid using proc->cred caused some Android system test
regressions that need further investigation. It is a partial
reversion because subsequent patches rely on proc->cred.
Fixes: 29bc22ac5e5b ("binder: use euid from cred instead of using task")
Cc: stable@vger.kernel.org # 4.4+
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: I9b1769a3510fed250bb21859ef8beebabe034c66
Link: https://lore.kernel.org/r/20211112180720.2858135-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52f88693378a58094c538662ba652aff0253c4fe upstream.
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.
Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.
Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes: 79af73079d ("Add security hooks to binder and implement the hooks for SELinux.")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29bc22ac5e5bc63275e850f0c8fc549e3d0e306b upstream.
Save the 'struct cred' associated with a binder process
at initial open to avoid potential race conditions
when converting to an euid.
Set a transaction's sender_euid from the 'struct cred'
saved at binder_open() instead of looking up the euid
from the binder proc's 'struct task'. This ensures
the euid is associated with the security context
of the task that opened binder.
Cc: stable@vger.kernel.org # 4.4+
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a partial revert of commit
29bc22ac5e5b ("binder: use euid from cred instead of using task").
Setting sender_euid using proc->cred caused some Android system test
regressions that need further investigation. It is a partial
reversion because subsequent patches rely on proc->cred.
Fixes: 29bc22ac5e5b ("binder: use euid from cred instead of using task")
Cc: stable@vger.kernel.org # 4.4+
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: I9b1769a3510fed250bb21859ef8beebabe034c66
Link: https://lore.kernel.org/r/20211112180720.2858135-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 200688826
(cherry picked from commit c21a80ca0684ec2910344d72556c816cb8940c01
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git char-misc-linus)
Signed-off-by: Todd Kjos <tkjos@google.com>
commit 52f88693378a58094c538662ba652aff0253c4fe upstream.
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.
Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.
Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes: 79af73079d ("Add security hooks to binder and implement the hooks for SELinux.")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change-Id: Id7157515d2b08f11683aeb8ad9b8f1da075d34e7
Bug: 200688826
[ tkjos@ fixed minor conflict ]
Signed-off-by: Todd Kjos <tkjos@google.com>
commit 29bc22ac5e5bc63275e850f0c8fc549e3d0e306b upstream.
Save the 'struct cred' associated with a binder process
at initial open to avoid potential race conditions
when converting to an euid.
Set a transaction's sender_euid from the 'struct cred'
saved at binder_open() instead of looking up the euid
from the binder proc's 'struct task'. This ensures
the euid is associated with the security context
of the task that opened binder.
Cc: stable@vger.kernel.org # 4.4+
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change-Id: I91922e7f359df5901749f1b09094c3c68d45aed4
Bug: 200688826
Signed-off-by: Todd Kjos <tkjos@google.com>
In binder, using GFP_HIGHMEM results in allocated memory that is
not mapped into the kernel's virtual address space.
This prevents the kernel from referencing it
directly.
Change-Id: I952dbc8ae205e47fa00ddf186ef306903f623367
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
With freezable cgroups and their recent utilization in Android, it's
possible for some of Simple LMK's victims to be frozen at the time that
they're selected for killing. The forced SIGKILL used for killing
victims can only wake up processes containing TASK_WAKEKILL and/or
TASK_INTERRUPTIBLE, not TASK_UNINTERRUPTIBLE, which is the state used on
frozen tasks. In order to wake frozen tasks from their uninterruptible
slumber so that they can die, we must thaw them. Leaving victims frozen
can otherwise make them take an indefinite amount of time to process our
SIGKILL and thus free memory.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
There are two problems with the current uninterruptible wait used in the
reclaim thread: the hung task detector is upset about an uninterruptible
thread being asleep for so long, and killing processes can generate I/O.
Since killing a process can generate I/O, the reclaim thread should
participate in system-wide suspend operations. This neatly solves the
hung task detector issue since wait_event_freezable() puts the current
process into an interruptible sleep.
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
When the async binder buffer is exhausted, some normal oneway transactions
will also be discarded, which may cause system or application failures. By
that time, the binder debug information we dump may not be relevant to
the root cause. The issue is also difficult to debug without the
backtrace of the thread sending the spam.
This change sends BR_ONEWAY_SPAM_SUSPECT to userspace when oneway
spamming is detected, requesting a dump of the current backtrace. Oneway
spamming is reported only once when the threshold is exceeded (the target
process dips below 80% of its oneway space, and the current process is
responsible for either more than 50 transactions or more than 50% of the
oneway space). Detection restarts when the async buffer has returned to a
healthy state.
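The threshold and the report-once behavior can be sketched as follows. This
is a userspace model under one stated assumption: "dips below 80% of its
oneway space" is read here as less than 20% of the oneway space remaining
free. The names spam_state and check_oneway_spam() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-target latch so that spam is reported only once. */
struct spam_state {
	bool reported;
};

/*
 * async_total:  size of the target's oneway space
 * async_free:   bytes of that space still free
 * sender_txns:  pending oneway transactions owned by the current sender
 * sender_bytes: oneway space held by the current sender
 *
 * Returns true when a spam report should be sent now.
 */
static bool check_oneway_spam(struct spam_state *s, size_t async_total,
			      size_t async_free, size_t sender_txns,
			      size_t sender_bytes)
{
	/* Assumed reading: healthy while at least 20% is still free. */
	if (async_free * 5 >= async_total) {
		s->reported = false;	/* re-arm detection */
		return false;
	}
	if (s->reported)
		return false;
	if (sender_txns > 50 || sender_bytes * 2 > async_total) {
		s->reported = true;
		return true;
	}
	return false;
}
```

A sender holding 51 transactions in a nearly full buffer triggers one report;
further calls stay quiet until the free space recovers and the latch is reset.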
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-3-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: celtare21 <celtare21@gmail.com>
Releasing the procs lock while freezing a binder context allows for
other processes to modify the process list while the scan is still
ongoing.
Don't release the procs lock during the scan operation; instead, store
matching processes in a dynamic array and process them at a later phase.
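The two-phase pattern can be sketched in userspace as below. The struct proc,
collect_by_pid() and freeze_all() are hypothetical; locking and the reference
counting needed to keep collected entries alive are not modeled and are only
marked in comments.

```c
#include <stdlib.h>

/* Hypothetical, trimmed-down process list entry. */
struct proc {
	int pid;
	int frozen;
	struct proc *next;
};

/*
 * Phase 1: called with the list lock held for the whole scan.
 * Collects every entry matching pid into a growable array.
 * Returns NULL with *count == 0 if nothing matched or on allocation failure.
 */
static struct proc **collect_by_pid(struct proc *head, int pid, size_t *count)
{
	struct proc **found = NULL;
	size_t n = 0, cap = 0;

	for (struct proc *p = head; p; p = p->next) {
		if (p->pid != pid)
			continue;
		if (n == cap) {
			size_t new_cap = cap ? cap * 2 : 4;
			struct proc **tmp = realloc(found,
						    new_cap * sizeof(*tmp));

			if (!tmp) {
				free(found);
				*count = 0;
				return NULL;
			}
			found = tmp;
			cap = new_cap;
		}
		found[n++] = p;
	}
	*count = n;
	return found;
}

/* Phase 2: called after the list lock has been dropped. */
static void freeze_all(struct proc **found, size_t count, int enable)
{
	for (size_t i = 0; i < count; i++)
		found[i]->frozen = enable;
}
```

Since the scan never drops the lock, the list cannot change underneath it, and
the slower per-process work runs afterwards on the collected snapshot.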
Signed-off-by: Marco Ballesio <balejs@google.com>
Bug: 176996063
Test: verified that all contexts are correctly frozen and unfrozen
Change-Id: Iea527e3b9188b04303f8b9b08b404e0c062a0189
(cherry picked from commit e145fb2fa8cf981429b516d1135ce8bb5aa0390a)
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Binder freeze stops at the first context found for a given pid, but a
process can have multiple contexts, so it might end up with inconsistent
context states after freezing or unfreezing its
binder.
Freeze or unfreeze all contexts in a process upon a BINDER_FREEZE
ioctl.
Bug: 176996063
Test: verified that all contexts in a specific process with multiple
binders are frozen or unfrozen.
Signed-off-by: Marco Ballesio <balejs@google.com>
Change-Id: If0822e078e830e9fde10cc17b99e39ec7cf358d5
(cherry picked from commit 99b0d0666f491fbd44dec2a2be17b4fe85b6ea42)
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>