kernel_google_wahoo

Author	SHA1	Message	Date
Ross Zwisler	a5f728e5c7	ext4: introduce jbd2_inode dirty range scoping and use it on ext4 jbd2: introduce jbd2_inode dirty range scoping: Currently both journal_submit_inode_data_buffers() and journal_finish_inode_data_buffers() operate on the entire address space of each of the inodes associated with a given journal entry. The consequence of this is that if we have an inode where we are constantly appending dirty pages we can end up waiting for an indefinite amount of time in journal_finish_inode_data_buffers() while we wait for all the pages under writeback to be written out. The easiest way to cause this type of workload is to just dd from /dev/zero to a file until it fills the entire filesystem. This can cause journal_finish_inode_data_buffers() to wait for the duration of the entire dd operation. We can improve this situation by scoping each of the inode dirty ranges associated with a given transaction. We do this via the jbd2_inode structure so that the scoping is contained within jbd2 and so that it follows the lifetime and locking rules for that structure. This allows us to limit the writeback & wait in journal_submit_inode_data_buffers() and journal_finish_inode_data_buffers() respectively to the dirty range for a given struct jbd2_inode, keeping us from waiting forever if the inode in question is still being appended to. ext4: use jbd2_inode dirty range scoping: Use the newly introduced jbd2_inode dirty range scoping to prevent us from waiting forever when trying to complete a journal transaction. jbd2: Introduce jbd2_inode next dirty range scoping: Distinguish between the current dirty range and the next dirty range. Signed-off-by: Ross Zwisler <zwisler@google.com> Change-Id: Idd339a5f4edbcd16e16fe4a861eb28c3ac7a08bd	2024-01-07 14:05:43 +00:00
stic-server-open	084504d01b	ext4: Stop trim mechanism after receiving SIGUSR1 signal * Same idea as xiaomi f2fs Change-Id: I48ae59f6cd6548d491bfc81898bb19e795138255	2024-01-07 14:05:42 +00:00
Yumi Yukimura	1a775072f6	fs: proc: Add PROC_CMDLINE_APPEND_ANDROID_FORCE_NORMAL_BOOT For Android 9 launch A/B devices migrating to Android 10 style system-as-root, `androidboot.force_normal_boot=1` must be passed in cmdline when booting into normal or charger mode. However, it is not always possible for one to modify the bootloader to adhere to these changes. As a workaround, one can use the presence of the `skip_initramfs` flag in cmdline to decide whether to append the new flag to cmdline on the kernel side. Co-authored-by: jabashque <jabashque@gmail.com> Change-Id: Ia00ea2c54e2a7d2275e552837039033adb98d0ff	2023-11-27 14:06:01 +08:00
Sebastiano Barezzi	b6ce64b2b3	init: Add CONFIG_INITRAMFS_IGNORE_SKIP_FLAG * Ignoring an ignore flag, yikes * Also replace skip_initramf with want_initramf (omitting last letter for Magisk since it binary patches that out of kernel, I'm not even sure why we're supporting that mess) Co-authored-by: Erfan Abdi <erfangplus@gmail.com> Change-Id: Ifdf726f128bc66bf860bbb71024f94f56879710f	2023-11-27 14:05:53 +08:00
Miklos Szeredi	04fc80eccc	libfs: support RENAME_NOREPLACE in simple_rename() This is trivial to do: - add flags argument to simple_rename() - check if flags doesn't have any other than RENAME_NOREPLACE - assign simple_rename() to .rename2 instead of .rename Filesystems converted: hugetlbfs, ramfs, bpf. Debugfs uses simple_rename() to implement debugfs_rename(), which is for debugfs instances to rename files internally, not for userspace filesystem access. For this case pass zero flags to simple_rename(). Change-Id: I1a46ece3b40b05c9f18fd13b98062d2a959b76a0 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexei Starovoitov <ast@kernel.org>	2023-11-16 20:18:22 +08:00
Dongliang Mu	d9ca709886	UPSTREAM: f2fs: fix UAF in f2fs_available_free_memory f2fs_fill_super -> f2fs_build_segment_manager -> create_discard_cmd_control -> f2fs_start_discard_thread It invokes kthread_run to create a thread and run issue_discard_thread. However, if f2fs_build_node_manager fails, the control flow goes to free_nm and calls f2fs_destroy_node_manager. This function will free sbi->nm_info. However, if issue_discard_thread accesses sbi->nm_info after the deallocation, but before the f2fs_stop_discard_thread, it will cause a UAF (use-after-free). -> f2fs_destroy_segment_manager -> destroy_discard_cmd_control -> f2fs_stop_discard_thread Fix this by stopping discard thread before f2fs_destroy_node_manager. Note that commit d6d2b491a82e1 introduces the call of f2fs_available_free_memory into issue_discard_thread. Cc: stable@vger.kernel.org Fixes: d6d2b491a82e ("f2fs: allow to change discard policy based on cached discard cmds") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit 5429c9dbc9025f9a166f64e22e3a69c94fd5b29b) Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: If121b453455b11b2aded8ba8a3899faad431dbd3	2023-03-07 20:13:50 +08:00
Sahitya Tummala	7365248712	f2fs: allow to change discard policy based on cached discard cmds With the default DPOLICY_BG discard thread is ioaware, which prevents the discard thread from issuing the discard commands. On low RAM setups, it is observed that these discard commands in the cache are consuming high memory. This patch aims to relax the memory pressure on the system due to f2fs pending discard cmds by changing the policy to DPOLICY_FORCE based on the nm_i->ram_thresh configured. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-07 20:13:50 +08:00
Miklos Szeredi	865302291c	fuse: fix pipe buffer lifetime for direct_io commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream. In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then imports the write buffer with fuse_get_user_pages(), which uses iov_iter_get_pages() to grab references to userspace pages instead of actually copying memory. On the filesystem device side, these pages can then either be read to userspace (via fuse_dev_read()), or splice()d over into a pipe using fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops. This is wrong because after fuse_dev_do_read() unlocks the FUSE request, the userspace filesystem can mark the request as completed, causing write() to return. At that point, the userspace filesystem should no longer have access to the pipe buffer. Fix by copying pages coming from the user address space to new pipe buffers. Reported-by: Jann Horn <jannh@google.com> Fixes: `c3021629a0` ("fuse: support splice() reading from fuse device") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-07 20:13:12 +08:00
Eric W. Biederman	8a616a83eb	userns: Don't fail follow_automount based on s_user_ns [ Upstream commit bbc3e471011417598e598707486f5d8814ec9c01 ] When vfs_submount was added the test to limit automounts from filesystems with s_user_ns != &init_user_ns was accidentally left in follow_automount. The test was never about any security concerns and was always about how we implement this for filesystems whose s_user_ns != &init_user_ns. At the moment this check makes no difference as there are no filesystems that both set FS_USERNS_MOUNT and implement d_automount. Remove this check now while I am thinking about it so there will not be odd booby traps for someone who does want to make this combination work. vfs_submount still needs improvements to allow this combination to work, and vfs_submount contains a check that presents a warning. The autofs4 filesystem could be modified to set FS_USERNS_MOUNT and it would not need to work on this code path, as userspace performs the mounts. Fixes: 93faccbbfa95 ("fs: Better permission checking for submounts") Fixes: aeaa4a79ff6a ("fs: Call d_automount with the filesystems creds") Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: I1707ab45c9b3b23ba9c06bfb4738fc85b8f9e166	2022-11-15 21:35:32 +01:00
Eric W. Biederman	62b004c983	fs: Better permission checking for submounts commit 93faccbbfa958a9668d3ab4e30f38dd205cee8d8 upstream. To support unprivileged users mounting filesystems two permission checks have to be performed: a test to see if the user is allowed to create a mount in the mount namespace, and a test to see if the user is allowed to access the specified filesystem. The automount case is special in that mounting the original filesystem grants permission to mount the sub-filesystems, to any user who happens to stumble across their mountpoint and satisfies the ordinary filesystem permission checks. Attempting to handle the automount case by using override_creds almost works. It preserves the idea that permission to mount the original filesystem is permission to mount the sub-filesystem. Unfortunately using override_creds messes up the filesystem's ordinary permission checks. Solve this by being explicit that a mount is a submount by introducing vfs_submount, and using it where appropriate. vfs_submount uses a new mount-internal mount flag, MS_SUBMOUNT, to let sget and friends know that a mount is a submount so they can take appropriate action. sget and sget_userns are modified to not perform any permission checks on submounts. follow_automount is modified to stop using override_creds as that has proven problematic. do_mount is modified to always remove the new MS_SUBMOUNT flag so that we know userspace will never be able to specify it. autofs4 is modified to stop using current_real_cred that was put in there to handle the previous version of submount permission checking. cifs is modified to pass the mountpoint all of the way down to vfs_submount. debugfs is modified to pass the mountpoint all of the way down to trace_automount by adding a new parameter. To make this change easier a new typedef debugfs_automount_t is introduced to capture the type of the debugfs automount function.
Fixes: 069d5ac9ae0d ("autofs: Fix automounts by using current_real_cred()->uid") Fixes: aeaa4a79ff6a ("fs: Call d_automount with the filesystems creds") Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: I09cb1f35368fb8dc4a64b5ac5a35c9d2843ef95b	2022-11-15 21:35:32 +01:00
Eric W. Biederman	9a6beaf980	mnt: Move the FS_USERNS_MOUNT check into sget_userns Allowing a filesystem to be mounted by other than root in the initial user namespace is a filesystem property not a mount namespace property and as such should be checked in filesystem specific code. Move the FS_USERNS_MOUNT test into super.c:sget_userns(). Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Change-Id: I5da9f5ce3e7b85379a771617e3238817b777eab4	2022-11-15 21:35:32 +01:00
Jeff Layton	967a853746	locks: sprinkle some tracepoints around the file locking code Add some tracepoints around the POSIX locking code. These were useful when tracking down problems when handling the race between setlk and close. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Change-Id: I270eda634890d21399ccf939ad6d03b7d201a148	2022-11-15 21:35:32 +01:00
Jeff Layton	d75f807e40	locks: rename __posix_lock_file to posix_lock_inode ...a more descriptive name and we can drop the double underscore prefix. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Change-Id: Iafb3bd86e5791d9c36bff3be7a876fa8aeb98afa	2022-11-15 21:35:32 +01:00
Eric W. Biederman	3686b47158	autofs: Fix automounts by using current_real_cred()->uid Seth Forshee reports that in 4.8-rcN some automounts are failing because the requester of the automount changed. The relevant call path is: follow_automount() ->d_automount autofs4_d_automount autofs4_mount_wait autofs4_wait In autofs4_wait wq_uid and wq_gid are set to current_uid() and current_gid() respectively. With follow_automount now overriding creds, the uid that we export to userspace changes, and that breaks existing setups. To remove the regression set wq_uid and wq_gid from current_real_cred()->uid and current_real_cred()->gid respectively. This restores the current behavior as current->real_cred is identical to current->cred except when override creds are used. Cc: stable@vger.kernel.org Fixes: aeaa4a79ff6a ("fs: Call d_automount with the filesystems creds") Reported-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Change-Id: I3ec133334218ec9bd108b18c92fd852104f56926	2022-11-15 21:35:32 +01:00
Eric W. Biederman	8658a41bc9	fs: Call d_automount with the filesystems creds Seth Forshee reported a mount regression in nfs automounts with "fs: Add user namespace member to struct super_block". It turns out that the assumption that current->cred is something reasonable during mount, while necessary to improve support of unprivileged mounts, is wrong in the automount path. To fix the existing filesystems, override current->cred with the init_cred before calling d_automount and restore current->cred after d_automount completes. To support unprivileged mounts would require a more nuanced cred selection, so fail on unprivileged mounts for the time being. As none of the filesystems that currently set FS_USERNS_MOUNT implement d_automount this check is only good for preventing future problems. Fixes: 6e4eab577a0c ("fs: Add user namespace member to struct super_block") Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Change-Id: I972485e9da3f2883e4ec9b38da3374e0993b1af6	2022-11-15 21:35:32 +01:00
Vaibhav Jain	6f47e535e7	UPSTREAM: kernfs: Check KERNFS_HAS_RELEASE before calling kernfs_release_file() Recently started seeing a kernel oops when a module tries removing a memory mapped sysfs bin_attribute. On closer investigation the root cause seems to be kernfs_release_file() trying to call kernfs_op.release() callback that's NULL for such sysfs bin_attributes. The oops occurs when kernfs_release_file() is called from kernfs_drain_open_files() to cleanup any open handles with active memory mappings. The patch fixes this by checking for flag KERNFS_HAS_RELEASE before calling kernfs_release_file() in function kernfs_drain_open_files(). On ppc64-le arch with cxl module the oops back-trace is of the form below: [ 861.381126] Unable to handle kernel paging request for instruction fetch [ 861.381360] Faulting instruction address: 0x00000000 [ 861.381428] Oops: Kernel access of bad area, sig: 11 [#1] .... [ 861.382481] NIP: 0000000000000000 LR: c000000000362c60 CTR: 0000000000000000 .... Call Trace: [c000000f1680b750] [c000000000362c34] kernfs_drain_open_files+0x104/0x1d0 (unreliable) [c000000f1680b790] [c00000000035fa00] __kernfs_remove+0x260/0x2c0 [c000000f1680b820] [c000000000360da0] kernfs_remove_by_name_ns+0x60/0xe0 [c000000f1680b8b0] [c0000000003638f4] sysfs_remove_bin_file+0x24/0x40 [c000000f1680b8d0] [c00000000062a164] device_remove_bin_file+0x24/0x40 [c000000f1680b8f0] [d000000009b7b22c] cxl_sysfs_afu_remove+0x144/0x170 [cxl] [c000000f1680b940] [d000000009b7c7e4] cxl_remove+0x6c/0x1a0 [cxl] [c000000f1680b990] [c00000000052f694] pci_device_remove+0x64/0x110 [c000000f1680b9d0] [c0000000006321d4] device_release_driver_internal+0x1f4/0x2b0 [c000000f1680ba20] [c000000000525cb0] pci_stop_bus_device+0xa0/0xd0 [c000000f1680ba60] [c000000000525e80] pci_stop_and_remove_bus_device+0x20/0x40 [c000000f1680ba90] [c00000000004a6c4] pci_hp_remove_devices+0x84/0xc0 [c000000f1680bad0] [c00000000004a688] pci_hp_remove_devices+0x48/0xc0 [c000000f1680bb10] [c0000000009dfda4] 
eeh_reset_device+0xb0/0x290 [c000000f1680bbb0] [c000000000032b4c] eeh_handle_normal_event+0x47c/0x530 [c000000f1680bc60] [c000000000032e64] eeh_handle_event+0x174/0x350 [c000000f1680bd10] [c000000000033228] eeh_event_handler+0x1e8/0x1f0 [c000000f1680bdc0] [c0000000000d384c] kthread+0x14c/0x190 [c000000f1680be30] [c00000000000b5a0] ret_from_kernel_thread+0x5c/0xbc Fixes: f83f3c515654 ("kernfs: fix locking around kernfs_ops->release() callback") Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 966fa72a716ceafc69de901a31f7cc1f52b35f81) Bug: 111308141 Test: modified lmkd to use PSI and tested using lmkd_unit_test Change-Id: I9ca5cbacd1e204a742e5616e6e101339d8719cdf Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2022-11-15 21:35:32 +01:00
Tejun Heo	d730ee7a52	UPSTREAM: kernfs: fix locking around kernfs_ops->release() callback The release callback may be called from two places - file release operation and kernfs open file draining. kernfs_open_file->mutex is used to synchronize the two callsites. This unfortunately leads to possible circular locking because of->mutex is used to protect the usual kernfs operations which may use locking constructs which are held while removing and thus draining kernfs files. @of->mutex is for synchronizing concurrent kernfs access operations and all we need here is synchronization between the release and drain paths. As the drain path has to grab kernfs_open_file_mutex anyway, let's use the mutex to synchronize the release operation instead. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Tony Lindgren <tony@atomide.com> Fixes: 0e67db2f9fe9 ("kernfs: add kernfs_ops->open/release() callbacks") Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f83f3c515654474e19c7fc86e3b06564bb5cb4d4) Bug: 111308141 Test: modified lmkd to use PSI and tested using lmkd_unit_test Change-Id: I75253c2aa8924987e9342d94e8bae445d6c8f5be Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2022-11-15 21:35:32 +01:00
Serge E. Hallyn	2255e907e8	kernfs: kernfs_sop_show_path: don't return 0 after seq_dentry call Our caller expects 0 on success, not >0. This fixes a bug in the patch cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces where /sys does not show up in mountinfo, breaking criu. Thanks for catching this, Andrei. Reported-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org> Change-Id: I3cf5886bf7a77a943a6540c4b224dd0ca805dca6	2022-11-15 21:35:32 +01:00
Sami Tolvanen	506c56dd29	BACKPORT: ANDROID: fs: logfs: fix filler function type Bug: 67506682 Change-Id: If2659b91e250cbd9f1a4a028ff43caf71b8306dd Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 2be0847f6f9f4effe639f2caeb88bb6f16838332) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:31 +01:00
Sami Tolvanen	7f6144bfa3	BACKPORT: ANDROID: fs: gfs2: fix filler function type Bug: 67506682 Change-Id: I50a3f85965de6e041d0f40e7bf9c2ced15ccfd49 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 920c7fd62c25da3acd3e16f3808d324a08e4a453) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:31 +01:00
Sami Tolvanen	a2fa993d6f	BACKPORT: ANDROID: fs: exofs: fix filler function type Bug: 67506682 Change-Id: I42f297bfe07a1b7916790415f35ad4f2574ceec7 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 39cdeb137340baed325e695c2ded3ec8d0abda2b) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
Sami Tolvanen	17132f8d28	BACKPORT: ANDROID: fs: afs: fix filler function type Bug: 67506682 Change-Id: I76d208c8606ee5af144891d14bd309912d4d788d Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 53f4adf6788d71d322f810efb271a5658f44d193) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
Sami Tolvanen	0b86f818e9	BACKPORT: fs: nfs: fix filler function type Bug: 67506682 Change-Id: I04d4b1b9ab0720a4f342d6617dd132de8654b94c Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit b73b94a7df67bc8f113f5e9619f9dfa4061d4f8a) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
Sami Tolvanen	076965b279	BACKPORT: mm: fix filler function type mismatch Bug: 67506682 Change-Id: I6f615164ccd86b407540ada9bbcb39d910395db9 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 4fd840d1743308b2ef470534523009dd99b3ce2b) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
Miklos Szeredi	adcfa14c25	BACKPORT: vfs: pass type instead of fn to do_{loop,iter}_readv_writev() Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Bug: 67506682 Change-Id: I919a90715ed71d6caf02b1333dbfec5e7e3ad52b (cherry picked from commit 0f78d06ac1e9b470cbd8f913ee1688c8b2c8feb3) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 04676269a0b4e1fd20a7ceaaef878fa3131517ea) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
Kees Cook	55b75ea79b	BACKPORT: treewide: Fix function prototypes for module_param_call() Several function prototypes for the set/get functions defined by module_param_call() have a slightly wrong argument types. This fixes those in an effort to clean up the calls when running under type-enforced compiler instrumentation for CFI. This is the result of running the following semantic patch: @match_module_param_call_function@ declarer name module_param_call; identifier _name, _set_func, _get_func; expression _arg, _mode; @@ module_param_call(_name, _set_func, _get_func, _arg, _mode); @fix_set_prototype depends on match_module_param_call_function@ identifier match_module_param_call_function._set_func; identifier _val, _param; type _val_type, _param_type; @@ int _set_func( -_val_type _val +const char * _val , -_param_type _param +const struct kernel_param * _param ) { ... } @fix_get_prototype depends on match_module_param_call_function@ identifier match_module_param_call_function._get_func; identifier _val, _param; type _val_type, _param_type; @@ int _get_func( -_val_type _val +char * _val , -_param_type _param +const struct kernel_param * _param ) { ... } Two additional by-hand changes are included for places where the above Coccinelle script didn't notice them: drivers/platform/x86/thinkpad_acpi.c fs/lockd/svc.c Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jessica Yu <jeyu@kernel.org> Bug: 67506682 Change-Id: I2c9c0ee8ed28065e63270a52c155e5e7d2791295 (cherry picked from commit e4dca7b7aa08b22893c45485d222b5807c1375ae) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 24da2c84bd7dcdf2b56fa8d3b2f833656ee60a01) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
Michal Marek	9f11a2683c	BACKPORT: kbuild: Allow to specify composite modules with modname-m This allows to write drm-$(CONFIG_AGP) += drm_agpsupport.o without having to handle CONFIG_AGP=y vs. CONFIG_AGP=m. Only support this syntax for modules, since built-in code depending on something modular cannot work and init/Makefile actually relies on the current semantics. There are a few drivers which adapted to the current semantics out of necessity; these are fixed to also work when the respective subsystem is modular. Acked-by: Peter Chen <peter.chen@freescale.com> [chipidea] Signed-off-by: Michal Marek <mmarek@suse.com> Change-Id: Ibd0f7006c9d3b87f2b77e59bdc51c06cb361e9a0 (cherry picked from commit cf4f21938e13ea1533ebdcb21c46f1d998a44ee8) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>	2022-11-15 21:35:30 +01:00
David Howells	0271a9e9bd	UPSTREAM: Make anon_inodes unconditional Make the anon_inodes facility unconditional so that it can be used by core VFS code. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit dadd2299ab61fc2b55b95b7b3a8f674cdd3b69c9) Bug: 135608568 Test: test program using syscall(__NR_sys_pidfd_open,..) and poll() Change-Id: I2f97bda4f360d8d05bbb603de839717b3d8067ae Signed-off-by: Suren Baghdasaryan <surenb@google.com>	2022-11-15 21:35:29 +01:00
Christian Brauner	a6dfa182c4	BACKPORT: signal: add pidfd_send_signal() syscall The kill() syscall operates on process identifiers (pid). After a process has exited its pid can be reused by another process. If a caller sends a signal to a reused pid it will end up signaling the wrong process. This issue has often surfaced and there has been a push to address this problem [1]. This patch uses file descriptors (fd) from /proc/<pid> as stable handles on struct pid. Even if a pid is recycled the handle will not change. The fd can be used to send signals to the process it refers to. Thus, the new syscall pidfd_send_signal() is introduced to solve this problem. Instead of pids it operates on process fds (pidfd). /* prototype and argument */ long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags); /* syscall number 424 */ The syscall number was chosen to be 424 to align with Arnd's rework in his y2038 to minimize merge conflicts (cf. [25]). In addition to the pidfd and signal argument it takes an additional siginfo_t and flags argument. If the siginfo_t argument is NULL then pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo(). The flags argument is added to allow for future extensions of this syscall. It currently needs to be passed as 0. Failing to do so will cause EINVAL. /* pidfd_send_signal() replaces multiple pid-based syscalls */ The pidfd_send_signal() syscall currently takes on the job of rt_sigqueueinfo(2) and parts of the functionality of kill(2), namely, when a positive pid is passed to kill(2). It will however be possible to also replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended. /* sending signals to threads (tid) and process groups (pgid) */ Specifically, the pidfd_send_signal() syscall does currently not operate on process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signals to threads and process groups, appropriately named flags (e.g. PIDFD_TYPE_PGID, and PIDFD_TYPE_TID) should be added. This implies that the flags argument will determine what is signaled and not the file descriptor itself. Put in other words, grouping in this api is a property of the flags argument not a property of the file descriptor (cf. [13]). Clarification for this has been requested by Eric (cf. [19]). When appropriate extensions through the flags argument are added then pidfd_send_signal() can additionally replace the part of kill(2) which operates on process groups as well as the tgkill(2) and rt_tgsigqueueinfo(2) syscalls. How such an extension could be implemented has been very roughly sketched in [14], [15], and [16]. However, this should not be taken as a commitment to a particular implementation. There might be better ways to do it. Right now this is intentionally left out to keep this patchset as simple as possible (cf. [4]). /* naming */ The syscall had various names throughout iterations of this patchset: - procfd_signal() - procfd_send_signal() - taskfd_send_signal() In the last round of reviews it was pointed out that, given that the flags argument decides the scope of the signal instead of different types of fds, it might make sense to either settle for "procfd_" or "pidfd_" as prefix. The community was willing to accept either (cf. [17] and [18]). Given that one developer expressed strong preference for the "pidfd_" prefix (cf. [13]) and with other developers less opinionated about the name we should settle for "pidfd_" to avoid further bikeshedding. The "_send_signal" suffix was chosen to reflect the fact that the syscall takes on the job of multiple syscalls. It is therefore intentional that the name is reminiscent of neither kill(2) nor rt_sigqueueinfo(2).
Not the former because it might imply that pidfd_send_signal() is a replacement for kill(2), and not the latter because it is a hassle to remember the correct spelling - especially for non-native speakers - and because it is not descriptive enough of what the syscall actually does. The name "pidfd_send_signal" makes it very clear that its job is to send signals. /* zombies */ Zombies can be signaled just as any other process. No special error will be reported since a zombie state is an unreliable state (cf. [3]). However, this can be added as an extension through the @flags argument if the need ever arises. /* cross-namespace signals */ The patch currently enforces that the signaler and signalee either are in the same pid namespace or that the signaler's pid namespace is an ancestor of the signalee's pid namespace. This is done for the sake of simplicity and because it is unclear what values certain members of struct siginfo_t would need to be set to (cf. [5], [6]). /* compat syscalls */ It became clear that we would like to avoid adding compat syscalls (cf. [7]). The compat syscall handling is now done in kernel/signal.c itself by adding __copy_siginfo_from_user_generic() which lets us avoid compat syscalls (cf. [8]). It should be noted that the addition of __copy_siginfo_from_user_any() is caused by a bug in the original implementation of rt_sigqueueinfo(2) (cf. [12]). With upcoming rework for syscall handling things might improve significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain any additional callers. /* testing */ This patch was tested on x64 and x86. /* userspace usage */ An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via: #define _GNU_SOURCE #include <errno.h> #include <fcntl.h> #include <signal.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/stat.h> #include <sys/syscall.h> #include <sys/types.h> #include <unistd.h> static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags) { #ifdef __NR_pidfd_send_signal return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags); #else return -ENOSYS; #endif } int main(int argc, char *argv[]) { int fd, ret, saved_errno, sig; if (argc < 3) exit(EXIT_FAILURE); fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); if (fd < 0) { printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]); exit(EXIT_FAILURE); } sig = atoi(argv[2]); printf("Sending signal %d to process %s\n", sig, argv[1]); ret = do_pidfd_send_signal(fd, sig, NULL, 0); saved_errno = errno; close(fd); errno = saved_errno; if (ret < 0) { printf("%s - Failed to send signal %d to process %s\n", strerror(errno), sig, argv[1]); exit(EXIT_FAILURE); } exit(EXIT_SUCCESS); } /* Q&A * Given that it seems the same questions get asked again by people who are * late to the party it makes sense to add a Q&A section to the commit * message so it's hopefully easier to avoid duplicate threads. * * For the sake of progress please consider these arguments settled unless * there is a new point that desperately needs to be addressed. Please make * sure to check the links to the threads in this commit message whether * this has not already been covered. */ Q-01: (Florian Weimer [20], Andrew Morton [21]) What happens when the target process has exited? A-01: Sending the signal will fail with ESRCH (cf. [22]). Q-02: (Andrew Morton [21]) Is the task_struct pinned by the fd? A-02: No. A reference to struct pid is kept. struct pid - as far as I understand - was created exactly for the reason to not require to pin struct task_struct (cf. [22]).
Q-03: (Andrew Morton [21]) Does the entire procfs directory remain visible? Just one entry within it? A-03: The same thing that happens right now when you hold a file descriptor to /proc/<pid> open (cf. [22]). Q-04: (Andrew Morton [21]) Does the pid remain reserved? A-04: No. This patchset guarantees a stable handle, not that pids are not recycled (cf. [22]). Q-05: (Andrew Morton [21]) Do attempts to signal that fd return errors? A-05: See {Q,A}-01. Q-06: (Andrew Morton [22]) Is there a cleaner way of obtaining the fd? Another syscall perhaps. A-06: Userspace can already trivially retrieve file descriptors from procfs so this is something that we will need to support anyway. Hence, there's no immediate need to add another syscall just to make pidfd_send_signal() not dependent on the presence of procfs. However, adding a syscall to get such file descriptors is planned for a future patchset (cf. [22]). Q-07: (Andrew Morton [21] and others) This fd-for-a-process sounds like a handy thing and people may well think up other uses for it in the future, probably unrelated to signals. Are the code and the interface designed to permit such future applications? A-07: Yes (cf. [22]). Q-08: (Andrew Morton [21] and others) Now I think about it, why a new syscall? This thing is looking rather like an ioctl? A-08: This has been extensively discussed. It was agreed that a syscall is preferred for a variety of reasons. Here are just a few taken from prior threads. Syscalls are safer than ioctl()s especially when signaling to fds. Processes are a core kernel concept so a syscall seems more appropriate. The layout of the syscall with its four arguments would require the addition of a custom struct for the ioctl() thereby causing at least the same amount or even more complexity for userspace than a simple syscall. The new syscall will replace multiple other pid-based syscalls (see description above). 
The file-descriptors-for-processes concept introduced with this syscall will be extended with other syscalls in the future. See also [22], [23] and various other threads already linked in here. Q-09: (Florian Weimer [24]) What happens if you use the new interface with an O_PATH descriptor? A-09: pidfds opened as O_PATH fds cannot be used to send signals to a process (cf. [2]). Signaling processes through pidfds is the equivalent of writing to a file. Thus, this is not an operation that operates "purely at the file descriptor level" as required by the open(2) manpage. See also [4]. / References */ [1]: https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/ [2]: https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/ [3]: https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/ [4]: https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/ [5]: https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/ [6]: https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/ [7]: https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/ [8]: https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/ [9]: https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy [11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/ [12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/ [13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/ [14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/ [15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/ [16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/ [17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/ [18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/ [19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/ [20]: 
https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/ [21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/ [22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/ [23]: https://lwn.net/Articles/773459/ [24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/ [25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/ Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Christian Brauner <christian@brauner.io> Reviewed-by: Tycho Andersen <tycho@tycho.ws> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: Aleksa Sarai <cyphar@cyphar.com> (cherry picked from commit 3eb39f47934f9d5a3027fe00d906a45fe3a15fad) Conflicts: arch/x86/entry/syscalls/syscall_32.tbl - trivial manual merge arch/x86/entry/syscalls/syscall_64.tbl - trivial manual merge include/linux/proc_fs.h - trivial manual merge include/linux/syscalls.h - trivial manual merge include/uapi/asm-generic/unistd.h - trivial manual merge kernel/signal.c - struct kernel_siginfo does not exist in 4.14 kernel/sys_ni.c - cond_syscall is used instead of COND_SYSCALL arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_64.tbl (1. manual merges because of 4.14 differences 2. change prepare_kill_siginfo() to use struct siginfo instead of kernel_siginfo 3. use copy_from_user() instead of copy_siginfo_from_user() in copy_siginfo_from_user_any() 4. replaced COND_SYSCALL with cond_syscall 5. 
Removed __ia32_sys_pidfd_send_signal in arch/x86/entry/syscalls/syscall_32.tbl. 6. Replaced __x64_sys_pidfd_send_signal with sys_pidfd_send_signal in arch/x86/entry/syscalls/syscall_64.tbl.) Bug: 135608568 Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL Change-Id: I34da11c63ac8cafb0353d9af24c820cef519ec27 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: electimon <electimon@gmail.com>	2022-11-15 21:35:29 +01:00
Jan Kara	ea1f75de62	ext4: enable quota enforcement based on mount options When quota information is stored in quota files, we enable only quota accounting on mount and enforcement is enabled only in response to Q_QUOTAON quotactl. To make ext4 behavior consistent with XFS, we add a possibility to enable quota enforcement on mount by specifying corresponding quota mount option (usrquota, grpquota, prjquota). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Change-Id: Ibf840723e020e4826eb09c7abae47df58f98e3a5	2022-10-23 23:26:01 -04:00
Li Xi	fbea7a0312	ext4: adds project ID support Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Jan Kara <jack@suse.cz> Change-Id: Iadddd9c7dbb2b2889270f1a5b8da3a408d22dd62	2022-10-23 23:25:59 -04:00
Li Xi	24d40233a7	ext4: add project quota support This patch adds mount options for enabling/disabling project quota accounting and enforcement. A new specific inode is also used for project quota accounting. [ Includes fix from Dan Carpenter to correct error checking from dqget(). ] Signed-off-by: Li Xi <lixi@ddn.com> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Jan Kara <jack@suse.cz> Change-Id: Ic62742dc96a357f9dcd974b524d448697ba3e64a	2022-10-23 23:25:57 -04:00
Andreas Gruenbacher	4b7c149859	gfs2: Invalid security labels of inodes when they go invalid When gfs2 releases the glock of an inode, it must invalidate all information cached for that inode, including the page cache and acls. Use the new security_inode_invalidate_secctx hook to also invalidate security labels in that case. These items will be reread from disk when needed after reacquiring the glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: cluster-devel@redhat.com [PM: fixed spelling errors and description line lengths] Signed-off-by: Paul Moore <pmoore@redhat.com>	2022-03-04 20:19:53 +01:00
Aditya Kali	a680922ccf	kernfs: define kernfs_node_dentry Add a new kernfs API to look up the dentry for a particular kernfs path. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:57 +01:00
Eric W. Biederman	88fe1ffae2	fs: Add user namespace member to struct super_block Start marking filesystems with a user namespace owner, s_user_ns. In this change this is only used for permission checks of who may mount a filesystem. Ultimately s_user_ns will be used for translating ids and checking capabilities for filesystems mounted from user namespaces. The default policy for setting s_user_ns is implemented in sget(), which arranges for s_user_ns to be set to current_user_ns() and to ensure that the mounter of the filesystem has CAP_SYS_ADMIN in that user_ns. The guts of sget are split out into another function sget_userns(). The function sget_userns calls alloc_super with the specified user namespace or it verifies the existing superblock that was found has the expected user namespace, and fails with EBUSY when it is not. This failing prevents users with the wrong privileges mounting a filesystem. The reason for the split of sget_userns from sget is that in some cases such as mount_ns and kernfs_mount_ns a different policy for permission checking of mounts and setting s_user_ns is necessary, and the existence of sget_userns() allows those policies to be implemented. The helper mount_ns is expected to be used for filesystems such as proc and mqueuefs which present per namespace information. The function mount_ns is modified to call sget_userns instead of sget to ensure the user namespace owner of the namespace whose information is presented by the filesystem is used on the superblock. For sysfs and cgroup the appropriate permission checks are already in place, and kernfs_mount_ns is modified to call sget_userns so that the init_user_ns is the only user namespace used. For the cgroup filesystem cgroup namespace mounts are bind mounts of a subset of the full cgroup filesystem and as such s_user_ns must be the same for all of them as there is only a single superblock. 
Mounts of sysfs that vary based on the network namespace could in principle change s_user_ns but it keeps the analysis and implementation of kernfs simpler if that is not supported, and at present there appear to be no benefits from supporting a different s_user_ns on any sysfs mount. Getting the details of setting s_user_ns correct has been a long process. Thanks to Pavel Tikhomirov who spotted a leak in sget_userns. Thanks to Seth Forshee who has kept the work alive. Thanks-to: Seth Forshee <seth.forshee@canonical.com> Thanks-to: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:56 +01:00
Eric W. Biederman	cd02cea58b	vfs: Pass data, ns, and ns->userns to mount_ns Today what is normally called data (the mount options) is not passed to fill_super through mount_ns. Pass the mount options and the namespace separately to mount_ns so that filesystems such as proc that have mount options can use mount_ns. Pass the user namespace to mount_ns so that the standard permission check that verifies the mounter has permissions over the namespace can be performed in mount_ns instead of in each filesystem's .mount method, thus removing the duplication between mqueuefs and proc in terms of permission checks. The extra permission check does not currently affect the rpc_pipefs filesystem and the nfsd filesystem as those filesystems do not currently allow unprivileged mounts. Without unprivileged mounts it is guaranteed that the caller has already passed capable(CAP_SYS_ADMIN), which guarantees the extra permission check will pass. Update rpc_pipefs and the nfsd filesystem to ensure that the network namespace reference is always taken in fill_super and always put in kill_sb so that the logic is simpler and so that errors originating inside of fill_super do not cause a network namespace leak. Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:56 +01:00
Tejun Heo	d29c1b5be3	kernfs: make kernfs_path() behave in the style of strlcpy() kernfs_path() functions always return the length of the full path but the path content is undefined if the length is larger than the provided buffer. This makes its behavior different from strlcpy() and requires error handling in all its users even when they don't care about truncation. In addition, the implementation can actually be simplified by making it behave properly in strlcpy() style. * Update kernfs_path_from_node_locked() to always fill up the buffer with path. If the buffer is not large enough, the output is truncated and terminated. * kernfs_path() no longer needs error handling. Make it a simple inline wrapper around kernfs_path_from_node(). * sysfs_warn_dup()'s use of kernfs_path() doesn't need error handling. Updated accordingly. * cgroup_path()'s use of kernfs_path() updated to retain the old behavior. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:56 +01:00
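The strlcpy()-style contract described in this commit (always return the full length, truncate and terminate when the buffer is too small) can be illustrated in userspace. This is only a sketch of the semantics, not the kernfs implementation; the function name `copy_path_strlcpy_style()` is made up for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace sketch of strlcpy()-style semantics (not the kernfs code):
 * copy as much of src as fits, always NUL-terminate when buflen > 0,
 * and return the full length of src so that a caller who cares can
 * detect truncation by comparing the result against buflen.
 */
size_t copy_path_strlcpy_style(char *buf, size_t buflen, const char *src)
{
	size_t len = strlen(src);

	if (buflen > 0) {
		/* Leave one byte for the terminating NUL. */
		size_t n = (len < buflen - 1) ? len : buflen - 1;

		memcpy(buf, src, n);
		buf[n] = '\0';
	}
	return len;
}
```

A caller that does not care about truncation can use the buffer directly because its contents are always a valid string; a caller that does care checks whether the return value is greater than or equal to buflen.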
Serge Hallyn	ae9e4b19c4	kernfs_path_from_node_locked: don't overwrite nlen We've calculated @len to be the bytes we need for '/..' entries from @kn_from to the common ancestor, and calculated @nlen to be the extra bytes we need to get from the common ancestor to @kn_to. We use them as such at the end. But in the loop copying the actual entries, we overwrite @nlen. Use a temporary variable for that instead. Without this, the return length, when the buffer is large enough, is wrong. (When the buffer is NULL or too small, the returned value is correct. The buffer contents are also correct.) Interestingly, no callers of this function are affected by this as of yet. However the upcoming cgroup_show_path() will be. Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:56 +01:00
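The length bookkeeping described above (one "/.." per level from the starting node up to the common ancestor, then the components from the common ancestor down to the target) can be sketched in userspace as follows. This is an illustration under simplifying assumptions, not the kernfs code: `struct node`, `node_relative_path()` and the helpers are hypothetical, there is no locking, and both nodes are assumed to share a root. The running output length is kept in its own variable and is never reused for per-component lengths.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tree node; not the kernel's kernfs_node. */
struct node {
	const char *name;      /* component name; unused for the root */
	struct node *parent;   /* NULL for the root */
};

static size_t node_depth(const struct node *n)
{
	size_t d = 0;

	for (; n->parent; n = n->parent)
		d++;
	return d;
}

static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	for (; da > db; da--)
		a = a->parent;
	for (; db > da; db--)
		b = b->parent;
	while (a != b) {
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/* Append src at logical offset pos, truncating; return the new logical length. */
static size_t append(char *buf, size_t buflen, size_t pos, const char *src)
{
	size_t len = strlen(src);

	if (pos + 1 < buflen) {
		size_t room = buflen - 1 - pos;
		size_t n = len < room ? len : room;

		memcpy(buf + pos, src, n);
		buf[pos + n] = '\0';
	}
	return pos + len;
}

/*
 * Write the path of `to` relative to `from` into buf, strlcpy()-style:
 * the result is truncated and terminated if needed, and the full length
 * is returned either way.
 */
size_t node_relative_path(const struct node *from, const struct node *to,
			  char *buf, size_t buflen)
{
	const struct node *common = common_ancestor(from, to);
	const struct node *n;
	size_t down = 0, pos = 0, i, j;

	if (buflen > 0)
		buf[0] = '\0';
	if (from == to)
		return append(buf, buflen, 0, "/");

	/* One "/.." for every level between `from` and the common ancestor. */
	for (n = from; n != common; n = n->parent)
		pos = append(buf, buflen, pos, "/..");

	/* Count the levels below the common ancestor, then emit them top-down. */
	for (n = to; n != common; n = n->parent)
		down++;
	for (i = 0; i < down; i++) {
		n = to;
		for (j = 0; j < down - 1 - i; j++)
			n = n->parent;
		pos = append(buf, buflen, pos, "/");
		pos = append(buf, buflen, pos, n->name);
	}
	return pos;
}
```

For a node two levels below the root, asking for the root's path relative to that node yields "/../..", and the return value is the full length even when the buffer is too small to hold the whole result.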
Tejun Heo	2beb4520c2	kernfs: implement kernfs_walk_and_get() Implement kernfs_walk_and_get() which is similar to kernfs_find_and_get() but can walk a path instead of just a name. v2: Use strlcpy() instead of strlen() + memcpy() as suggested by David. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:49 +01:00
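The difference between looking up a single name and walking a path can be sketched in userspace. The types and functions below (`struct dir_node`, `find_child()`, `walk_path()`) are hypothetical stand-ins; the real kernfs functions also handle namespaces, locking and reference counting, all of which this sketch omits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical directory tree node; not the kernel's kernfs_node. */
struct dir_node {
	const char *name;
	struct dir_node *children;   /* first child, or NULL */
	struct dir_node *sibling;    /* next sibling, or NULL */
};

/* Single-name lookup: find the child of `parent` whose name matches exactly. */
struct dir_node *find_child(struct dir_node *parent, const char *name,
			    size_t len)
{
	struct dir_node *c;

	for (c = parent->children; c; c = c->sibling)
		if (strlen(c->name) == len && memcmp(c->name, name, len) == 0)
			return c;
	return NULL;
}

/*
 * Path walk: split `path` on '/' and apply the single-name lookup to each
 * component in turn. Empty components (from "//" or a leading or trailing
 * '/') are skipped. Returns NULL as soon as a component is missing.
 */
struct dir_node *walk_path(struct dir_node *parent, const char *path)
{
	while (parent && *path) {
		size_t len = strcspn(path, "/");

		if (len > 0)
			parent = find_child(parent, path, len);
		path += len;
		if (*path == '/')
			path++;
	}
	return parent;
}
```

With a tree root -> a -> b -> c, `walk_path(&root, "a/b/c")` returns the node for c, while a path with a missing component returns NULL.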
Johannes Weiner	2a1d14a66d	BACKPORT: fs: kernfs: add poll file operation Patch series "psi: pressure stall monitors", v3. Android is adopting psi to detect and remedy memory pressure that results in stuttering and decreased responsiveness on mobile devices. Psi gives us the stall information, but because we're dealing with latencies in the millisecond range, periodically reading the pressure files to detect stalls in a timely fashion is not feasible. Psi also doesn't aggregate its averages at a high enough frequency right now. This patch series extends the psi interface such that users can configure sensitive latency thresholds and use poll() and friends to be notified when these are breached. As high-frequency aggregation is costly, it implements an aggregation method that is optimized for fast, short-interval averaging, and makes the aggregation frequency adaptive, such that high-frequency updates only happen while monitored stall events are actively occurring. With these patches applied, Android can monitor for, and ward off, mounting memory shortages before they cause problems for the user. For example, using memory stall monitors in userspace low memory killer daemon (lmkd) we can detect mounting pressure and kill less important processes before device becomes visibly sluggish. In our memory stress testing psi memory monitors produce roughly 10x less false positives compared to vmpressure signals. Having ability to specify multiple triggers for the same psi metric allows other parts of Android framework to monitor memory state of the device and act accordingly. The new interface is straightforward. The user opens one of the pressure files for writing and writes a trigger description into the file descriptor that defines the stall state - some or full, and the maximum stall time over a given window of time. 
E.g.: /* Signal when stall time exceeds 100ms of a 1s window */ char trigger[] = "full 100000 1000000"; fd = open("/proc/pressure/memory"); write(fd, trigger, sizeof(trigger)); while (poll() >= 0) { ... } close(fd); When the monitored stall state is entered, psi adapts its aggregation frequency according to what the configured time window requires in order to emit event signals in a timely fashion. Once the stalling subsides, aggregation reverts back to normal. The trigger is associated with the open file descriptor. To stop monitoring, the user only needs to close the file descriptor and the trigger is discarded. Patches 1-4 prepare the psi code for polling support. Patch 5 implements the adaptive polling logic, the pressure growth detection optimized for short intervals, and hooks up write() and poll() on the pressure files. The patches were developed in collaboration with Johannes Weiner. This patch (of 5): Kernfs has a standardized poll/notification mechanism for waking all pollers on all fds when a filesystem node changes. To allow polling for custom events, add a .poll callback that can override the default. This is in preparation for pollable cgroup pressure files which have per-fd trigger configurations. Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.com Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> (cherry picked from commit: 147e1a97c4a0bdd43f55a582a9416bb9092563a9) Conflicts: fs/kernfs/file.c include/linux/kernfs.h 1. replaced __poll_t with unsigned int. 2. replaced kernfs_dentry_node() with dentry->d_fsdata 3. 
replaced EPOLLERR/EPOLLPRI with POLLERR/POLLPRI (values are the same) Bug: 127712811 Test: lmkd in PSI mode Change-Id: Ic2bed334d05aec62f4e695f263893c3057921c55 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:49 +01:00
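The pseudocode in the commit message above (open a pressure file, write a trigger, poll) can be expanded into a small userspace sketch. The function names below are hypothetical. The sketch assumes the trigger format "<some|full> <stall time in us> <window in us>" implied by the example string, opens the file read-write and non-blocking, and waits for POLLPRI; it is not taken from the patch and is untested against this 4.4 backport.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a trigger string such as "full 100000 1000000"
 * (stall threshold and window, both in microseconds).
 * Returns 0 on success, -1 if buf is too small.
 */
int psi_format_trigger(char *buf, size_t buflen, const char *state,
		       unsigned long stall_us, unsigned long window_us)
{
	int n = snprintf(buf, buflen, "%s %lu %lu", state, stall_us, window_us);

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * Hypothetical helper: register a trigger on a pressure file such as
 * /proc/pressure/memory. The trigger stays active for as long as the
 * returned descriptor is open. Returns the descriptor, or -1 on failure.
 */
int psi_register_trigger(const char *path, const char *trigger)
{
	size_t len = strlen(trigger) + 1;   /* include the trailing NUL */
	int fd = open(path, O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return -1;
	if (write(fd, trigger, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: wait for the trigger to fire.
 * Returns 1 if the threshold was breached, 0 on timeout, -1 on error.
 */
int psi_wait_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int n = poll(&pfd, 1, timeout_ms);

	if (n <= 0)
		return n;               /* 0: timeout, -1: poll() failed */
	if (pfd.revents & POLLERR)
		return -1;              /* event source is no longer usable */
	return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A monitor would call psi_format_trigger(), pass the result to psi_register_trigger("/proc/pressure/memory", ...), loop on psi_wait_event(), and close the descriptor to discard the trigger, as the commit message describes.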
Tejun Heo	509c0be293	UPSTREAM: kernfs: add kernfs_ops->open/release() callbacks Add ->open/release() methods to kernfs_ops. ->open() is called when the file is opened and ->release() when the file is either released or severed. These callbacks can be used, for example, to manage persistent caching objects over multiple seq_file iterations. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Zefan Li <lizefan@huawei.com> (cherry picked from commit 0e67db2f9fe91937e798e3d7d22c50a8438187e1) Bug: 111308141 Test: modified lmkd to use PSI and tested using lmkd_unit_test Change-Id: Id06e9d5c6da1280bcdd4dc86309dcfaf52b8f9a4 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:49 +01:00
Aditya Kali	8ecfcb612c	kernfs: Add API to generate relative kernfs path The new function kernfs_path_from_node() generates and returns kernfs path of a given kernfs_node relative to a given parent kernfs_node. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:49 +01:00
Andrey Vagin	293b50a101	kernel: add a helper to get an owning user namespace for a namespace Return -EPERM if an owning user namespace is outside of a process current user namespace. v2: In a first version ns_get_owner returned ENOENT for init_user_ns. This special case was removed from this version. There is nothing outside of init_user_ns, so we can return EPERM. v3: rename ns->get_owner() to ns->owner(). get_* usually means that it grabs a reference. Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:49 +01:00
Serge E. Hallyn	215cbf575a	cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces Patch summary: When showing a cgroupfs entry in mountinfo, show the path of the mount root dentry relative to the reader's cgroup namespace root. Short explanation (courtesy of mkerrisk): If we create a new cgroup namespace, then we want both /proc/self/cgroup and /proc/self/mountinfo to show cgroup paths that are correctly virtualized with respect to the cgroup mount point. Previous to this patch, /proc/self/cgroup shows the right info, but /proc/self/mountinfo does not. Long version: When a uid 0 task which is in freezer cgroup /a/b, unshares a new cgroup namespace, and then mounts a new instance of the freezer cgroup, the new mount will be rooted at /a/b. The root dentry field of the mountinfo entry will show '/a/b'. cat > /tmp/do1 << EOF mount -t cgroup -o freezer freezer /mnt grep freezer /proc/self/mountinfo EOF unshare -Gm bash /tmp/do1 > 330 160 0:34 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer > 355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer The task's freezer cgroup entry in /proc/self/cgroup will simply show '/': grep freezer /proc/self/cgroup 9:freezer:/ If instead the same task simply bind mounts the /a/b cgroup directory, the resulting mountinfo entry will again show /a/b for the dentry root. However in this case the task will find its own cgroup at /mnt/a/b, not at /mnt: mount --bind /sys/fs/cgroup/freezer/a/b /mnt 130 25 0:34 /a/b /mnt rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer In other words, there is no way for the task to know, based on what is in mountinfo, which cgroup directory is its own. 
Example (by mkerrisk): First, a little script to save some typing and verbiage: echo -e "\t/proc/self/cgroup:\t$(cat /proc/self/cgroup \| grep freezer)" cat /proc/self/mountinfo \| grep freezer \| awk '{print "\tmountinfo:\t\t" $4 "\t" $5}' Create cgroup, place this shell into the cgroup, and look at the state of the /proc files: 2653 2653 # Our shell 14254 # cat(1) /proc/self/cgroup: 10:freezer:/a/b mountinfo: / /sys/fs/cgroup/freezer Create a shell in new cgroup and mount namespaces. The act of creating a new cgroup namespace causes the process's current cgroups directories to become its cgroup root directories. (Here, I'm using my own version of the "unshare" utility, which takes the same options as the util-linux version): Look at the state of the /proc files: /proc/self/cgroup: 10:freezer:/ mountinfo: / /sys/fs/cgroup/freezer The third entry in /proc/self/cgroup (the pathname of the cgroup inside the hierarchy) is correctly virtualized w.r.t. the cgroup namespace, which is rooted at /a/b in the outer namespace. However, the info in /proc/self/mountinfo is not for this cgroup namespace, since we are seeing a duplicate of the mount from the old mount namespace, and the info there does not correspond to the new cgroup namespace. However, trying to create a new mount still doesn't show us the right information in mountinfo: # propagating to other mountns /proc/self/cgroup: 7:freezer:/ mountinfo: /a/b /mnt/freezer The act of creating a new cgroup namespace caused the process's current freezer directory, "/a/b", to become its cgroup freezer root directory. In other words, the pathname directory of the directory within the newly mounted cgroup filesystem should be "/", but mountinfo wrongly shows us "/a/b". The consequence of this is that the process in the cgroup namespace cannot correctly construct the pathname of its cgroup root directory from the information in /proc/PID/mountinfo. 
With this patch, the dentry root field in mountinfo is shown relative to the reader's cgroup namespace. So the same steps as above: /proc/self/cgroup: 10:freezer:/a/b mountinfo: / /sys/fs/cgroup/freezer /proc/self/cgroup: 10:freezer:/ mountinfo: /../.. /sys/fs/cgroup/freezer /proc/self/cgroup: 10:freezer:/ mountinfo: / /mnt/freezer cgroup.clone_children freezer.parent_freezing freezer.state tasks cgroup.procs freezer.self_freezing notify_on_release 3164 2653 # First shell that placed in this cgroup 3164 # Shell started by 'unshare' 14197 # cat(1) Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> Tested-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>	2022-03-04 20:16:47 +01:00
Aditya Kali	66bbf8312d	cgroup: introduce cgroup namespaces Introduce the ability to create a new cgroup namespace. The newly created cgroup namespace remembers the cgroup of the process at the point of creation of the cgroup namespace (referred to as cgroupns-root). The main purpose of a cgroup namespace is to virtualize the contents of the /proc/self/cgroup file. Processes inside a cgroup namespace are only able to see paths relative to their namespace root (unless they are moved outside of their cgroupns-root, at which point they will see a relative path from their cgroupns-root). For a correctly set up container this enables container-tools (like libcontainer, lxc, lmctfy, etc.) to create completely virtualized containers without leaking system level cgroup hierarchy to the task. This patch only implements the 'unshare' part of the cgroupns. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Chatur27 <jasonbright2709@gmail.com> Change-Id: Ifd2df9f562baa90b0fe7c986f86967602657c640	2022-03-04 20:16:44 +01:00
Michael Bestas	438071e031	Merge remote-tracking branch 'common/android-4.4-p' into android-msm-wahoo-4.4 * common/android-4.4-p: Linux 4.4.302 Input: i8042 - Fix misplaced backport of "add ASUS Zenbook Flip to noselftest list" KVM: x86: Fix misplaced backport of "work around leak of uninitialized stack contents" Revert "tc358743: fix register i2c_rd/wr function fix" Revert "drm/radeon/ci: disable mclk switching for high refresh rates (v2)" Bluetooth: MGMT: Fix misplaced BT_HS check ipv4: tcp: send zero IPID in SYNACK messages ipv4: raw: lock the socket in raw_bind() hwmon: (lm90) Reduce maximum conversion rate for G781 drm/msm: Fix wrong size calculation net-procfs: show net devices bound packet types ipv4: avoid using shared IP generator for connected sockets net: fix information leakage in /proc/net/ptype ipv6_tunnel: Rate limit warning messages scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put() USB: core: Fix hang in usb_kill_urb by adding memory barriers usb-storage: Add unusual-devs entry for VL817 USB-SATA bridge tty: Add support for Brainboxes UC cards. 
tty: n_gsm: fix SW flow control encoding/handling serial: stm32: fix software flow control transfer PM: wakeup: simplify the output logic of pm_show_wakelocks() udf: Fix NULL ptr deref when converting from inline format udf: Restore i_lenAlloc when inode expansion fails scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devices s390/hypfs: include z/VM guests with access control group set Bluetooth: refactor malicious adv data check can: bcm: fix UAF of bcm op Linux 4.4.301 drm/i915: Flush TLBs before releasing backing store Linux 4.4.300 lib82596: Fix IRQ check in sni_82596_probe bcmgenet: add WOL IRQ check net_sched: restore "mpu xxx" handling dmaengine: at_xdmac: Fix at_xdmac_lld struct definition dmaengine: at_xdmac: Fix lld view setting dmaengine: at_xdmac: Print debug message after realeasing the lock dmaengine: at_xdmac: Don't start transactions at tx_submit level netns: add schedule point in ops_exit_list() net: axienet: fix number of TX ring slots for available check net: axienet: Wait for PhyRstCmplt after core reset af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progress parisc: pdc_stable: Fix memory leak in pdcs_register_pathentries net/fsl: xgmac_mdio: Fix incorrect iounmap when removing module powerpc/fsl/dts: Enable WA for erratum A-009885 on fman3l MDIO buses ext4: don't use the orphan list when migrating an inode ext4: Fix BUG_ON in ext4_bread when write quota data ext4: set csum seed in tmp inode while migrating to extents ubifs: Error path in ubifs_remount_rw() seems to wrongly free write buffers power: bq25890: Enable continuous conversion for ADC at charging scsi: sr: Don't use GFP_DMA MIPS: Octeon: Fix build errors using clang i2c: designware-pci: Fix to change data types of hcnt and lcnt parameters ALSA: seq: Set upper limit of processed events w1: Misuse of get_user()/put_user() reported by sparse i2c: mpc: Correct I2C reset procedure powerpc/smp: Move setup_profiling_timer() under CONFIG_PROFILING i2c: 
i801: Don't silently correct invalid transfer size powerpc/btext: add missing of_node_put powerpc/cell: add missing of_node_put powerpc/powernv: add missing of_node_put powerpc/6xx: add missing of_node_put parisc: Avoid calling faulthandler_disabled() twice serial: core: Keep mctrl register state and cached copy in sync serial: pl010: Drop CR register reset on set_termios dm space map common: add bounds check to sm_ll_lookup_bitmap() dm btree: add a defensive bounds check to insert_at() net: mdio: Demote probed message to debug print btrfs: remove BUG_ON(!eie) in find_parent_nodes btrfs: remove BUG_ON() in find_parent_nodes() ACPICA: Executer: Fix the REFCLASS_REFOF case in acpi_ex_opcode_1A_0T_1R() ACPICA: Utilities: Avoid deleting the same object twice in a row um: registers: Rename function names to avoid conflicts and build problems ath9k: Fix out-of-bound memcpy in ath9k_hif_usb_rx_stream usb: hub: Add delay for SuperSpeed hub resume to let links transit to U0 media: saa7146: hexium_gemini: Fix a NULL pointer dereference in hexium_attach() media: igorplugusb: receiver overflow should be reported net: bonding: debug: avoid printing debug logs when bond is not notifying peers iwlwifi: mvm: synchronize with FW after multicast commands media: m920x: don't use stack on USB reads media: saa7146: hexium_orion: Fix a NULL pointer dereference in hexium_attach() floppy: Add max size check for user space request mwifiex: Fix skb_over_panic in mwifiex_usb_recv() HSI: core: Fix return freed object in hsi_new_client media: b2c2: Add missing check in flexcop_pci_isr: usb: gadget: f_fs: Use stream_open() for endpoint files ar5523: Fix null-ptr-deref with unexpected WDCMSG_TARGET_START reply fs: dlm: filter user dlm messages for kernel locks Bluetooth: Fix debugfs entry leak in hci_register_dev() RDMA/cxgb4: Set queue pair state when being queried mips: bcm63xx: add support for clk_set_parent() mips: lantiq: add support for clk_set_parent() misc: lattice-ecp3-config: Fix task 
hung when firmware load failed ASoC: samsung: idma: Check of ioremap return value dmaengine: pxa/mmp: stop referencing config->slave_id RDMA/core: Let ib_find_gid() continue search even after empty entry char/mwave: Adjust io port register size ALSA: oss: fix compile error when OSS_DEBUG is enabled powerpc/prom_init: Fix improper check of prom_getprop() ALSA: hda: Add missing rwsem around snd_ctl_remove() calls ALSA: PCM: Add missing rwsem around snd_ctl_remove() calls ALSA: jack: Add missing rwsem around snd_ctl_remove() calls ext4: avoid trim error on fs with small groups net: mcs7830: handle usb read errors properly pcmcia: fix setting of kthread task states can: xilinx_can: xcan_probe(): check for error irq can: softing: softing_startstop(): fix set but not used variable warning spi: spi-meson-spifc: Add missing pm_runtime_disable() in meson_spifc_probe ppp: ensure minimum packet size in ppp_write() pcmcia: rsrc_nonstatic: Fix a NULL pointer dereference in nonstatic_find_mem_region() pcmcia: rsrc_nonstatic: Fix a NULL pointer dereference in __nonstatic_find_io_region() usb: ftdi-elan: fix memory leak on device disconnect media: msi001: fix possible null-ptr-deref in msi001_probe() media: saa7146: mxb: Fix a NULL pointer dereference in mxb_attach() media: dib8000: Fix a memleak in dib8000_init() floppy: Fix hang in watchdog when disk is ejected serial: amba-pl011: do not request memory region twice drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode() arm64: dts: qcom: msm8916: fix MMC controller aliases netfilter: bridge: add support for pppoe filtering tty: serial: atmel: Call dma_async_issue_pending() tty: serial: atmel: Check return code of dmaengine_submit() crypto: qce - fix uaf on qce_ahash_register_one Bluetooth: stop proccessing malicious adv data Bluetooth: cmtp: fix possible panic when cmtp_init_sockets() fails PCI: Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller can: softing_cs: softingcs_probe(): fix 
memleak on registration failure media: stk1160: fix control-message timeouts media: pvrusb2: fix control-message timeouts media: dib0700: fix undefined behavior in tuner shutdown media: em28xx: fix control-message timeouts media: mceusb: fix control-message timeouts rtc: cmos: take rtc_lock while reading from CMOS nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind() HID: uhid: Fix worker destroying device without any protection rtlwifi: rtl8192cu: Fix WARNING when calling local_irq_restore() with interrupts enabled media: uvcvideo: fix division by zero at stream start drm/i915: Avoid bitwise vs logical OR warning in snb_wm_latency_quirk() can: gs_usb: gs_can_start_xmit(): zero-initialize hf->{flags,reserved} can: gs_usb: fix use of uninitialized variable, detach device on reception of invalid USB data mfd: intel-lpss: Fix too early PM enablement in the ACPI ->probe() USB: Fix "slab-out-of-bounds Write" bug in usb_hcd_poll_rh_status USB: core: Fix bug in resuming hub's handling of wakeup requests Bluetooth: bfusb: fix division by zero in send path Linux 4.4.299 power: reset: ltc2952: Fix use of floating point literals mISDN: change function names to avoid conflicts net: udp: fix alignment problem in udp4_seq_show() ip6_vti: initialize __ip6_tnl_parm struct in vti6_siocdevprivate scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown() phonet: refcount leak in pep_sock_accep rndis_host: support Hytera digital radios xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc i40e: Fix incorrect netdev's real number of RX/TX queues mac80211: initialize variable have_higher_than_11mbit ieee802154: atusb: fix uninit value in atusb_set_extended_addr Bluetooth: btusb: Apply QCA Rome patches for some ATH3012 models bpf, test: fix ld_abs + vlan push/pop stress test Linux 4.4.298 net: fix use-after-free in tw_timer_handler Input: spaceball - fix parsing of movement 
data packets Input: appletouch - initialize work before device registration scsi: vmw_pvscsi: Set residual data length conditionally usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear. xhci: Fresco FL1100 controller should not have BROKEN_MSI quirk set. uapi: fix linux/nfc.h userspace compilation errors nfc: uapi: use kernel size_t to fix user-space builds selinux: initialize proto variable in selinux_ip_postroute_compat() recordmcount.pl: fix typo in s390 mcount regex platform/x86: apple-gmux: use resource_size() with res Linux 4.4.297 phonet/pep: refuse to enable an unbound pipe hamradio: improve the incomplete fix to avoid NPD hamradio: defer ax25 kfree after unregister_netdev ax25: NPD bug when detaching AX25 device xen/blkfront: fix bug in backported patch ARM: 9169/1: entry: fix Thumb2 bug in iWMMXt exception handling ALSA: drivers: opl3: Fix incorrect use of vp->state ALSA: jack: Check the return value of kstrdup() hwmon: (lm90) Fix usage of CONFIG2 register in detect function drivers: net: smc911x: Check for error irq bonding: fix ad_actor_system option setting to default qlcnic: potential dereference null pointer of rx_queue->page_ring IB/qib: Fix memory leak in qib_user_sdma_queue_pkts() HID: holtek: fix mouse probing can: kvaser_usb: get CAN clock frequency from device net: usb: lan78xx: add Allied Telesis AT29M2-AF Conflicts: drivers/usb/gadget/function/f_fs.c Change-Id: I54140777477cbab1b4c6b7d77558e92ca2b30e96	2022-02-09 19:41:39 +02:00
Greg Kroah-Hartman	875c0cc811	Merge 4.4.302 into android-4.4-p Changes in 4.4.302 can: bcm: fix UAF of bcm op Bluetooth: refactor malicious adv data check s390/hypfs: include z/VM guests with access control group set scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devices udf: Restore i_lenAlloc when inode expansion fails udf: Fix NULL ptr deref when converting from inline format PM: wakeup: simplify the output logic of pm_show_wakelocks() serial: stm32: fix software flow control transfer tty: n_gsm: fix SW flow control encoding/handling tty: Add support for Brainboxes UC cards. usb-storage: Add unusual-devs entry for VL817 USB-SATA bridge USB: core: Fix hang in usb_kill_urb by adding memory barriers scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put() ipv6_tunnel: Rate limit warning messages net: fix information leakage in /proc/net/ptype ipv4: avoid using shared IP generator for connected sockets net-procfs: show net devices bound packet types drm/msm: Fix wrong size calculation hwmon: (lm90) Reduce maximum conversion rate for G781 ipv4: raw: lock the socket in raw_bind() ipv4: tcp: send zero IPID in SYNACK messages Bluetooth: MGMT: Fix misplaced BT_HS check Revert "drm/radeon/ci: disable mclk switching for high refresh rates (v2)" Revert "tc358743: fix register i2c_rd/wr function fix" KVM: x86: Fix misplaced backport of "work around leak of uninitialized stack contents" Input: i8042 - Fix misplaced backport of "add ASUS Zenbook Flip to noselftest list" Linux 4.4.302 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I5191d3cb4df0fa8de60170d2fedf4a3c51380fdf	2022-02-03 10:00:04 +01:00
Jan Kara	0f28e1a57b	udf: Fix NULL ptr deref when converting from inline format commit 7fc3b7c2981bbd1047916ade327beccb90994eee upstream. udf_expand_file_adinicb() calls directly ->writepage to write data expanded into a page. This however misses to setup inode for writeback properly and so we can crash on inode->i_wb dereference when submitting page for IO like: BUG: kernel NULL pointer dereference, address: 0000000000000158 #PF: supervisor read access in kernel mode ... <TASK> __folio_start_writeback+0x2ac/0x350 __block_write_full_page+0x37d/0x490 udf_expand_file_adinicb+0x255/0x400 [udf] udf_file_write_iter+0xbe/0x1b0 [udf] new_sync_write+0x125/0x1c0 vfs_write+0x28e/0x400 Fix the problem by marking the page dirty and going through the standard writeback path to write the page. Strictly speaking we would not even have to write the page but we want to catch e.g. ENOSPC errors early. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> CC: stable@vger.kernel.org Fixes: `52ebea749a` ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-03 09:27:52 +01:00
Jan Kara	f25e032aa6	udf: Restore i_lenAlloc when inode expansion fails commit ea8569194b43f0f01f0a84c689388542c7254a1f upstream. When we fail to expand inode from inline format to a normal format, we restore inode to contain the original inline formatting but we forgot to set i_lenAlloc back. The mismatch between i_lenAlloc and i_size was then causing further problems such as warnings and lost data down the line. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> CC: stable@vger.kernel.org Fixes: `7e49b6f248` ("udf: Convert UDF to new truncate calling sequence") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-03 09:27:52 +01:00
Greg Kroah-Hartman	22784fcab2	Merge 4.4.300 into android-4.4-p Changes in 4.4.300 Bluetooth: bfusb: fix division by zero in send path USB: core: Fix bug in resuming hub's handling of wakeup requests USB: Fix "slab-out-of-bounds Write" bug in usb_hcd_poll_rh_status mfd: intel-lpss: Fix too early PM enablement in the ACPI ->probe() can: gs_usb: fix use of uninitialized variable, detach device on reception of invalid USB data can: gs_usb: gs_can_start_xmit(): zero-initialize hf->{flags,reserved} drm/i915: Avoid bitwise vs logical OR warning in snb_wm_latency_quirk() media: uvcvideo: fix division by zero at stream start rtlwifi: rtl8192cu: Fix WARNING when calling local_irq_restore() with interrupts enabled HID: uhid: Fix worker destroying device without any protection nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind() rtc: cmos: take rtc_lock while reading from CMOS media: mceusb: fix control-message timeouts media: em28xx: fix control-message timeouts media: dib0700: fix undefined behavior in tuner shutdown media: pvrusb2: fix control-message timeouts media: stk1160: fix control-message timeouts can: softing_cs: softingcs_probe(): fix memleak on registration failure PCI: Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller Bluetooth: cmtp: fix possible panic when cmtp_init_sockets() fails Bluetooth: stop proccessing malicious adv data crypto: qce - fix uaf on qce_ahash_register_one tty: serial: atmel: Check return code of dmaengine_submit() tty: serial: atmel: Call dma_async_issue_pending() netfilter: bridge: add support for pppoe filtering arm64: dts: qcom: msm8916: fix MMC controller aliases drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode() serial: amba-pl011: do not request memory region twice floppy: Fix hang in watchdog when disk is ejected media: dib8000: Fix a memleak in dib8000_init() media: saa7146: mxb: Fix a NULL pointer dereference in mxb_attach() media: msi001: fix possible 
null-ptr-deref in msi001_probe() usb: ftdi-elan: fix memory leak on device disconnect pcmcia: rsrc_nonstatic: Fix a NULL pointer dereference in __nonstatic_find_io_region() pcmcia: rsrc_nonstatic: Fix a NULL pointer dereference in nonstatic_find_mem_region() ppp: ensure minimum packet size in ppp_write() spi: spi-meson-spifc: Add missing pm_runtime_disable() in meson_spifc_probe can: softing: softing_startstop(): fix set but not used variable warning can: xilinx_can: xcan_probe(): check for error irq pcmcia: fix setting of kthread task states net: mcs7830: handle usb read errors properly ext4: avoid trim error on fs with small groups ALSA: jack: Add missing rwsem around snd_ctl_remove() calls ALSA: PCM: Add missing rwsem around snd_ctl_remove() calls ALSA: hda: Add missing rwsem around snd_ctl_remove() calls powerpc/prom_init: Fix improper check of prom_getprop() ALSA: oss: fix compile error when OSS_DEBUG is enabled char/mwave: Adjust io port register size RDMA/core: Let ib_find_gid() continue search even after empty entry dmaengine: pxa/mmp: stop referencing config->slave_id ASoC: samsung: idma: Check of ioremap return value misc: lattice-ecp3-config: Fix task hung when firmware load failed mips: lantiq: add support for clk_set_parent() mips: bcm63xx: add support for clk_set_parent() RDMA/cxgb4: Set queue pair state when being queried Bluetooth: Fix debugfs entry leak in hci_register_dev() fs: dlm: filter user dlm messages for kernel locks ar5523: Fix null-ptr-deref with unexpected WDCMSG_TARGET_START reply usb: gadget: f_fs: Use stream_open() for endpoint files media: b2c2: Add missing check in flexcop_pci_isr: HSI: core: Fix return freed object in hsi_new_client mwifiex: Fix skb_over_panic in mwifiex_usb_recv() floppy: Add max size check for user space request media: saa7146: hexium_orion: Fix a NULL pointer dereference in hexium_attach() media: m920x: don't use stack on USB reads iwlwifi: mvm: synchronize with FW after multicast commands net: bonding: debug: 
avoid printing debug logs when bond is not notifying peers media: igorplugusb: receiver overflow should be reported media: saa7146: hexium_gemini: Fix a NULL pointer dereference in hexium_attach() usb: hub: Add delay for SuperSpeed hub resume to let links transit to U0 ath9k: Fix out-of-bound memcpy in ath9k_hif_usb_rx_stream um: registers: Rename function names to avoid conflicts and build problems ACPICA: Utilities: Avoid deleting the same object twice in a row ACPICA: Executer: Fix the REFCLASS_REFOF case in acpi_ex_opcode_1A_0T_1R() btrfs: remove BUG_ON() in find_parent_nodes() btrfs: remove BUG_ON(!eie) in find_parent_nodes net: mdio: Demote probed message to debug print dm btree: add a defensive bounds check to insert_at() dm space map common: add bounds check to sm_ll_lookup_bitmap() serial: pl010: Drop CR register reset on set_termios serial: core: Keep mctrl register state and cached copy in sync parisc: Avoid calling faulthandler_disabled() twice powerpc/6xx: add missing of_node_put powerpc/powernv: add missing of_node_put powerpc/cell: add missing of_node_put powerpc/btext: add missing of_node_put i2c: i801: Don't silently correct invalid transfer size powerpc/smp: Move setup_profiling_timer() under CONFIG_PROFILING i2c: mpc: Correct I2C reset procedure w1: Misuse of get_user()/put_user() reported by sparse ALSA: seq: Set upper limit of processed events i2c: designware-pci: Fix to change data types of hcnt and lcnt parameters MIPS: Octeon: Fix build errors using clang scsi: sr: Don't use GFP_DMA power: bq25890: Enable continuous conversion for ADC at charging ubifs: Error path in ubifs_remount_rw() seems to wrongly free write buffers ext4: set csum seed in tmp inode while migrating to extents ext4: Fix BUG_ON in ext4_bread when write quota data ext4: don't use the orphan list when migrating an inode powerpc/fsl/dts: Enable WA for erratum A-009885 on fman3l MDIO buses net/fsl: xgmac_mdio: Fix incorrect iounmap when removing module parisc: pdc_stable: Fix 
memory leak in pdcs_register_pathentries af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progress net: axienet: Wait for PhyRstCmplt after core reset net: axienet: fix number of TX ring slots for available check netns: add schedule point in ops_exit_list() dmaengine: at_xdmac: Don't start transactions at tx_submit level dmaengine: at_xdmac: Print debug message after realeasing the lock dmaengine: at_xdmac: Fix lld view setting dmaengine: at_xdmac: Fix at_xdmac_lld struct definition net_sched: restore "mpu xxx" handling bcmgenet: add WOL IRQ check lib82596: Fix IRQ check in sni_82596_probe Linux 4.4.300 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic6c59dd0f4ed703fff49584b3774d39e4548af4a	2022-01-27 08:54:26 +01:00

46720 Commits