Backing files were sometimes put twice before, this fixes it so backing
files sent in response to lookups are closed exactly once always
Test: fuse_test pases, Android no longer throws a double close
Bug: 273737310
Change-Id: Ifa75ffd846185cfabfd1f5bad504078d955c99ed
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Signed-off-by: Alistair Delva <adelva@google.com>
Previously errors from the daemon in FUSE_CANONICAL_PATH were simply
ignored. In order to block inotifys, it is useful to be able to return
errors from this opcode.
Bug: 238619640
Test: inotify no longer works on /storage/emulated/0/Android/media but
does on child folders
Change-Id: I1c65814f4ad0ccef330bca9764c2db15c71bf2be
Signed-off-by: Paul Lawrence <paullawrence@google.com>
These patches extend FUSE to be able to act as a stacked filesystem. This
allows pure passthrough, where the fuse file system simply reflects the lower
filesystem, and also allows optional pre and post filtering in BPF and/or the
userspace daemon as needed. This can dramatically reduce or even eliminate
transitions to and from userspace.
See https://lwn.net/Articles/915717/
Note that this patch set has been extensively tested in
common-android13-5.10
This is a squash of these changes cherry-picked from common-android13-5.10
ANDROID: fuse-bpf: Make compile and pass test
ANDROID: fuse-bpf: set error_in to ENOENT in negative lookup
ANDROID: fuse-bpf: Add ability to run ranges of tests to fuse_test
ANDROID: fuse-bpf: Add test for lookup postfilter
ANDROID: fuse-bpf: readddir postfilter fixes
ANDROID: fix kernelci error in fs/fuse/dir.c
ANDROID: fuse-bpf: Fix RCU/reference issue
ANDROID: fuse-bpf: Always call revalidate for backing
ANDROID: fuse-bpf: Adjust backing handle funcs
ANDROID: fuse-bpf: Fix revalidate error path and backing handling
ANDROID: fuse-bpf: Fix use of get_fuse_inode
ANDROID: fuse: Don't use readdirplus w/ nodeid 0
ANDROID: fuse-bpf: Introduce readdirplus test case for fuse bpf
ANDROID: fuse-bpf: Make sure force_again flag is false by default
ANDROID: fuse-bpf: Make inodes with backing_fd reachable for regular FUSE fuse_iget
Revert "ANDROID: fuse-bpf: use target instead of parent inode to execute backing revalidate"
ANDROID: fuse-bpf: use target instead of parent inode to execute backing revalidate
ANDROID: fuse-bpf: Fix misuse of args.out_args
ANDROID: fuse-bpf: Fix non-fusebpf build
ANDROID: fuse-bpf: Use fuse_bpf_args in uapi
ANDROID: fuse-bpf: Fix read_iter
ANDROID: fuse-bpf: Use cache and refcount
ANDROID: fuse-bpf: Rename iocb_fuse to iocb_orig
ANDROID: fuse-bpf: Fix fixattr in rename
ANDROID: fuse-bpf: Fix readdir
ANDROID: fuse-bpf: Fix lseek return value for offset 0
ANDROID: fuse-bpf: fix read_iter and write_iter
ANDROID: fuse-bpf: fix special devices
ANDROID: fuse-bpf: support FUSE_LSEEK
ANDROID: fuse-bpf: Add support for FUSE_COPY_FILE_RANGE
ANDROID: fuse-bpf: Report errors to finalize
ANDROID: fuse-bpf: Avoid reusing uint64_t for file
ANDROID: fuse-bpf: Fix CONFIG_FUSE_BPF typo in FUSE_FSYNCDIR
ANDROID: fuse-bpf: Move fd operations to be synchronous
ANDROID: fuse-bpf: Invalidate if lower is unhashed
ANDROID: fuse-bpf: Move bpf earlier in fuse_permission
ANDROID: fuse-bpf: Update attributes on file write
ANDROID: fuse: allow mounting with no userspace daemon
ANDROID: fuse-bpf: Support FUSE_STATFS
ANDROID: fuse-bpf: Fix filldir
ANDROID: fuse-bpf: fix fuse_create_open_finalize
ANDROID: fuse: add bpf support for removexattr
ANDROID: fuse-bpf: Fix truncate
ANDROID: fuse-bpf: Support inotify
ANDROID: fuse-bpf: Make compile with CONFIG_FUSE but no CONFIG_FUSE_BPF
ANDROID: fuse-bpf: Fix perms on readdir
ANDROID: fuse: Fix umasking in backing
ANDROID: fs/fuse: Backing move returns EXDEV if TO not backed
ANDROID: bpf-fuse: Fix Setattr
ANDROID: fuse-bpf: Check if mkdir dentry setup
ANDROID: fuse-bpf: Close backing fds in fuse_dentry_revalidate
ANDROID: fuse-bpf: Close backing-fd on both paths
ANDROID: fuse-bpf: Partial fix for mmap'd files
ANDROID: fuse-bpf: Restore a missing const
ANDROID: Add fuse-bpf self tests
ANDROID: Add FUSE_BPF to gki_defconfig
ANDROID: fuse-bpf v1
ANDROID: fuse: Move functions in preparation for fuse-bpf
Bug: 202785178
Test: test_fuse passes on linux.
On cuttlefish,
atest android.scopedstorage.cts.host.ScopedStorageHostTest
passes with fuse-bpf enabled and disabled
Change-Id: Idb099c281f9b39ff2c46fa3ebc63e508758416ee
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
When a new inode is created, send its security context to server along with
creation request (FUSE_CREAT, FUSE_MKNOD, FUSE_MKDIR and FUSE_SYMLINK).
This gives server an opportunity to create new file and set security
context (possibly atomically). In all the configurations it might not be
possible to set context atomically.
Like nfs and ceph, use security_dentry_init_security() to dermine security
context of inode and send it with create, mkdir, mknod, and symlink
requests.
Following is the information sent to server.
fuse_sectx_header, fuse_secctx, xattr_name, security_context
- struct fuse_secctx_header
This contains total number of security contexts being sent and total
size of all the security contexts (including size of
fuse_secctx_header).
- struct fuse_secctx
This contains size of security context which follows this structure.
There is one fuse_secctx instance per security context.
- xattr name string
This string represents name of xattr which should be used while setting
security context.
- security context
This is the actual security context whose size is specified in
fuse_secctx struct.
Also add the FUSE_SECURITY_CTX flag for the `flags` field of the
fuse_init_out struct. When this flag is set the kernel will append the
security context for a newly created inode to the request (create, mkdir,
mknod, and symlink). The server is responsible for ensuring that the inode
appears atomically (preferrably) with the requested security context.
For example, If the server is using SELinux and backed by a "real" linux
file system that supports extended attributes it can write the security
context value to /proc/thread-self/attr/fscreate before making the syscall
to create the inode.
This patch is based on patch from Chirantan Ekbote <chirantan@chromium.org>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Bug: 242358627
(cherry picked from commit 3e2b6fdbdc9ab5a02d9d5676a005f30780b97553)
[Yu-Li: Resolved minor conflict in fs/fuse/inode.c, include/uapi/linux/fuse.h]
Signed-off-by: Yu-Li Lin <yulilin@google.com>
Change-Id: I9caf481354b2a2ef5fd26835e68ac40a83ffebd5
Changes in 5.15.11
reset: tegra-bpmp: Revert Handle errors in BPMP response
KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE
KVM: downgrade two BUG_ONs to WARN_ON_ONCE
x86/kvm: remove unused ack_notifier callbacks
KVM: X86: Fix tlb flush for tdp in kvm_invalidate_pcid()
mac80211: fix rate control for retransmitted frames
mac80211: fix regression in SSN handling of addba tx
mac80211: mark TX-during-stop for TX in in_reconfig
mac80211: send ADDBA requests using the tid/queue of the aggregation session
mac80211: validate extended element ID is present
firmware: arm_scpi: Fix string overflow in SCPI genpd driver
bpf: Fix kernel address leakage in atomic fetch
bpf, selftests: Add test case for atomic fetch on spilled pointer
bpf: Fix signed bounds propagation after mov32
bpf: Make 32->64 bounds propagation slightly more robust
bpf, selftests: Add test case trying to taint map value pointer
bpf: Fix kernel address leakage in atomic cmpxchg's r0 aux reg
bpf, selftests: Update test case for atomic cmpxchg on r0 with pointer
vduse: fix memory corruption in vduse_dev_ioctl()
vduse: check that offset is within bounds in get_config()
virtio_ring: Fix querying of maximum DMA mapping size for virtio device
vdpa: check that offsets are within bounds
s390/entry: fix duplicate tracking of irq nesting level
recordmcount.pl: look for jgnop instruction as well as bcrl on s390
arm64: dts: ten64: remove redundant interrupt declaration for gpio-keys
ceph: fix up non-directory creation in SGID directories
dm btree remove: fix use after free in rebalance_children()
audit: improve robustness of the audit queue handling
btrfs: convert latest_bdev type to btrfs_device and rename
btrfs: use latest_dev in btrfs_show_devname
btrfs: update latest_dev when we create a sprout device
btrfs: remove stale comment about the btrfs_show_devname
scsi: ufs: core: Retry START_STOP on UNIT_ATTENTION
drm/i915/hdmi: convert intel_hdmi_to_dev to intel_hdmi_to_i915
drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown()
pinctrl: amd: Fix wakeups when IRQ is shared with SCI
arm64: dts: rockchip: remove mmc-hs400-enhanced-strobe from rk3399-khadas-edge
arm64: dts: rockchip: fix rk3308-roc-cc vcc-sd supply
arm64: dts: rockchip: fix rk3399-leez-p710 vcc3v3-lan supply
arm64: dts: rockchip: fix audio-supply for Rock Pi 4
arm64: dts: rockchip: fix poweroff on helios64
dmaengine: idxd: add halt interrupt support
dmaengine: idxd: fix calling wq quiesce inside spinlock
mac80211: track only QoS data frames for admission control
tee: amdtee: fix an IS_ERR() vs NULL bug
ceph: fix duplicate increment of opened_inodes metric
ceph: initialize pathlen variable in reconnect_caps_cb
ARM: socfpga: dts: fix qspi node compatible
arm64: dts: imx8mq: remove interconnect property from lcdif
clk: Don't parent clks until the parent is fully registered
soc: imx: Register SoC device only on i.MX boards
iwlwifi: mvm: don't crash on invalid rate w/o STA
virtio: always enter drivers/virtio/
virtio/vsock: fix the transport to work with VMADDR_CID_ANY
vdpa: Consider device id larger than 31
Revert "drm/fb-helper: improve DRM fbdev emulation device names"
selftests: net: Correct ping6 expected rc from 2 to 1
s390/kexec_file: fix error handling when applying relocations
sch_cake: do not call cake_destroy() from cake_init()
inet_diag: fix kernel-infoleak for UDP sockets
netdevsim: don't overwrite read only ethtool parms
selftests: icmp_redirect: pass xfail=0 to log_test()
net: hns3: fix use-after-free bug in hclgevf_send_mbx_msg
net: hns3: fix race condition in debugfs
selftests: Add duplicate config only for MD5 VRF tests
selftests: Fix raw socket bind tests with VRF
selftests: Fix IPv6 address bind tests
dmaengine: idxd: fix missed completion on abort path
dmaengine: st_fdma: fix MODULE_ALIAS
drm: simpledrm: fix wrong unit with pixel clock
net/sched: sch_ets: don't remove idle classes from the round-robin list
selftests/net: toeplitz: fix udp option
net: dsa: mv88e6xxx: Unforce speed & duplex in mac_link_down()
selftest/net/forwarding: declare NETIFS p9 p10
mptcp: never allow the PM to close a listener subflow
drm/ast: potential dereference of null pointer
drm/i915/display: Fix an unsigned subtraction which can never be negative.
mac80211: agg-tx: don't schedule_and_wake_txq() under sta->lock
cfg80211: Acquire wiphy mutex on regulatory work
mac80211: fix lookup when adding AddBA extension element
net: stmmac: fix tc flower deletion for VLAN priority Rx steering
flow_offload: return EOPNOTSUPP for the unsupported mpls action type
rds: memory leak in __rds_conn_create()
ice: Use div64_u64 instead of div_u64 in adjfine
ice: Don't put stale timestamps in the skb
drm/amd/display: Set exit_optimized_pwr_state for DCN31
drm/amd/pm: fix a potential gpu_metrics_table memory leak
mptcp: remove tcp ulp setsockopt support
mptcp: clear 'kern' flag from fallback sockets
mptcp: fix deadlock in __mptcp_push_pending()
soc/tegra: fuse: Fix bitwise vs. logical OR warning
igb: Fix removal of unicast MAC filters of VFs
igbvf: fix double free in `igbvf_probe`
igc: Fix typo in i225 LTR functions
ixgbe: Document how to enable NBASE-T support
ixgbe: set X550 MDIO speed before talking to PHY
netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
net/packet: rx_owner_map depends on pg_vec
net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
sfc_ef100: potential dereference of null pointer
dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED
net: Fix double 0x prefix print in SKB dump
net/smc: Prevent smc_release() from long blocking
net: systemport: Add global locking for descriptor lifecycle
sit: do not call ipip6_dev_free() from sit_init_net()
afs: Fix mmap
arm64: kexec: Fix missing error code 'ret' warning in load_other_segments()
bpf: Fix extable fixup offset.
bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
powerpc/85xx: Fix oops when CONFIG_FSL_PMC=n
USB: gadget: bRequestType is a bitfield, not a enum
Revert "usb: early: convert to readl_poll_timeout_atomic()"
KVM: x86: Drop guest CPUID check for host initiated writes to MSR_IA32_PERF_CAPABILITIES
tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous
USB: NO_LPM quirk Lenovo USB-C to Ethernet Adapher(RTL8153-04)
usb: dwc2: fix STM ID/VBUS detection startup delay in dwc2_driver_probe
PCI/MSI: Clear PCI_MSIX_FLAGS_MASKALL on error
PCI/MSI: Mask MSI-X vectors only on success
usb: xhci-mtk: fix list_del warning when enable list debug
usb: xhci: Extend support for runtime power management for AMD's Yellow carp.
usb: cdnsp: Fix incorrect status for control request
usb: cdnsp: Fix incorrect calling of cdnsp_died function
usb: cdnsp: Fix issue in cdnsp_log_ep trace event
usb: cdnsp: Fix lack of spin_lock_irqsave/spin_lock_restore
usb: typec: tcpm: fix tcpm unregister port but leave a pending timer
usb: gadget: u_ether: fix race in setting MAC address in setup phase
USB: serial: cp210x: fix CP2105 GPIO registration
USB: serial: option: add Telit FN990 compositions
selinux: fix sleeping function called from invalid context
btrfs: fix memory leak in __add_inode_ref()
btrfs: fix double free of anon_dev after failure to create subvolume
btrfs: check WRITE_ERR when trying to read an extent buffer
btrfs: fix missing blkdev_put() call in btrfs_scan_one_device()
zonefs: add MODULE_ALIAS_FS
iocost: Fix divide-by-zero on donation from low hweight cgroup
serial: 8250_fintek: Fix garbled text for console
timekeeping: Really make sure wall_to_monotonic isn't positive
cifs: sanitize multiple delimiters in prepath
locking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner()
riscv: dts: unleashed: Add gpio card detect to mmc-spi-slot
riscv: dts: unmatched: Add gpio card detect to mmc-spi-slot
perf inject: Fix segfault due to close without open
perf inject: Fix segfault due to perf_data__fd() without open
libata: if T_LENGTH is zero, dma direction should be DMA_NONE
powerpc/module_64: Fix livepatching for RO modules
drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
drm/amdgpu: don't override default ECO_BITs setting
drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC
Revert "can: m_can: remove support for custom bit timing"
can: m_can: make custom bittiming fields const
can: m_can: pci: use custom bit timings for Elkhart Lake
ARM: dts: imx6ull-pinfunc: Fix CSI_DATA07__ESAI_TX0 pad name
xsk: Do not sleep in poll() when need_wakeup set
mptcp: add missing documented NL params
bpf, x64: Factor out emission of REX byte in more cases
bpf: Fix extable address check.
USB: core: Make do_proc_control() and do_proc_bulk() killable
media: mxl111sf: change mutex_init() location
fuse: annotate lock in fuse_reverse_inval_entry()
ovl: fix warning in ovl_create_real()
scsi: scsi_debug: Don't call kcalloc() if size arg is zero
scsi: scsi_debug: Fix type in min_t to avoid stack OOB
scsi: scsi_debug: Sanity check block descriptor length in resp_mode_select()
io-wq: remove spurious bit clear on task_work addition
io-wq: check for wq exit after adding new worker task_work
rcu: Mark accesses to rcu_state.n_force_qs
io-wq: drop wqe lock before creating new worker
bus: ti-sysc: Fix variable set but not used warning for reinit_modules
selftests/damon: test debugfs file reads/writes with huge count
Revert "xsk: Do not sleep in poll() when need_wakeup set"
xen/blkfront: harden blkfront against event channel storms
xen/netfront: harden netfront against event channel storms
xen/console: harden hvc_xen against event channel storms
xen/netback: fix rx queue stall detection
xen/netback: don't queue unlimited number of packages
Linux 5.15.11
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I20c400f64f45729c6f833c31ee18eb4b92f5ed89
Changes in 5.15.10
nfc: fix segfault in nfc_genl_dump_devices_done
hwmon: (corsair-psu) fix plain integer used as NULL pointer
RDMA: Fix use-after-free in rxe_queue_cleanup
RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow
mtd: rawnand: Fix nand_erase_op delay
mtd: rawnand: Fix nand_choose_best_timings() on unsupported interface
inet: use #ifdef CONFIG_SOCK_RX_QUEUE_MAPPING consistently
dt-bindings: media: nxp,imx7-mipi-csi2: Drop bad if/then schema
clk: qcom: sm6125-gcc: Swap ops of ice and apps on sdcc1
perf bpf_skel: Do not use typedef to avoid error on old clang
netfs: Fix lockdep warning from taking sb_writers whilst holding mmap_lock
RDMA/irdma: Fix a user-after-free in add_pble_prm
RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
RDMA/irdma: Report correct WC errors
RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
ice: fix FDIR init missing when reset VF
vmxnet3: fix minimum vectors alloc issue
i2c: virtio: fix completion handling
drm/msm: Fix null ptr access msm_ioctl_gem_submit()
drm/msm/a6xx: Fix uinitialized use of gpu_scid
drm/msm/dsi: set default num_data_lanes
drm/msm/dp: Avoid unpowered AUX xfers that caused crashes
KVM: arm64: Save PSTATE early on exit
s390/test_unwind: use raw opcode instead of invalid instruction
Revert "tty: serial: fsl_lpuart: drop earlycon entry for i.MX8QXP"
net/mlx4_en: Update reported link modes for 1/10G
loop: Use pr_warn_once() for loop_control_remove() warning
ALSA: hda: Add Intel DG2 PCI ID and HDMI codec vid
ALSA: hda/hdmi: fix HDA codec entry table order for ADL-P
parisc/agp: Annotate parisc agp init functions with __init
i2c: rk3x: Handle a spurious start completion interrupt flag
net: netlink: af_netlink: Prevent empty skb by adding a check on len.
drm/amdgpu: cancel the correct hrtimer on exit
drm/amdgpu: check atomic flag to differeniate with legacy path
drm/amd/display: Fix for the no Audio bug with Tiled Displays
drm/amdkfd: fix double free mem structure
drm/amd/display: add connector type check for CRC source set
drm/amdkfd: process_info lock not needed for svm
tracing: Fix a kmemleak false positive in tracing_map
staging: most: dim2: use device release method
fuse: make sure reclaim doesn't write the inode
perf inject: Fix itrace space allowed for new attributes
Linux 5.15.10
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I611aba3cbaa414f3dc6e3922245e140c36cbcb14
commit 5c791fe1e2a4f401f819065ea4fc0450849f1818 upstream.
In writeback cache mode mtime/ctime updates are cached, and flushed to the
server using the ->write_inode() callback.
Closing the file will result in a dirty inode being immediately written,
but in other cases the inode can remain dirty after all references are
dropped. This result in the inode being written back from reclaim, which
can deadlock on a regular allocation while the request is being served.
The usual mechanisms (GFP_NOFS/PF_MEMALLOC*) don't work for FUSE, because
serving a request involves unrelated userspace process(es).
Instead do the same as for dirty pages: make sure the inode is written
before the last reference is gone.
- fallocate(2)/copy_file_range(2): these call file_update_time() or
file_modified(), so flush the inode before returning from the call
- unlink(2), link(2) and rename(2): these call fuse_update_ctime(), so
flush the ctime directly from this helper
Reported-by: chenguanyou <chenguanyou@xiaomi.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: Ed Tsai <ed.tsai@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The page used to contain the fuse_dentry_canonical_path to be handled in
fuse_dev_do_write is allocated using __get_free_pages(GFP_KERNEL).
The returned page may contain undefined data, that by chance may be
considered as a valid path name that is not in the cache. In that case,
if the FUSE daemon mistakenly doesn't fill the canonical path buffer,
the FUSE driver may fall into two blocking
request_wait_answer(fuse_dev_write->kern_path->fuse_lookup_name)
causing a deadlock condition.
The stack is as follows:
find S 0 20511 20117 0x00000000
Call trace:
[<ffffff8008085e78>] __switch_to+0xb8/0xd4
[<ffffff8008a0cac4>] __schedule+0x458/0x714
[<ffffff8008a0ce0c>] schedule+0x8c/0xa8
[<ffffff800833865c>] request_wait_answer+0x74/0x220
[<ffffff8008339f70>] __fuse_request_send+0x8c/0xa0
[<ffffff8008339fe4>] fuse_request_send+0x60/0x6c
[<ffffff800833c1a8>] fuse_dentry_canonical_path+0xb8/0x104
[<ffffff800820b14c>] do_sys_open+0x1b4/0x260
[<ffffff800820b27c>] SyS_openat+0x3c/0x4c
[<ffffff8008083540>] el0_svc_naked+0x34/0x38
mount.ntfs-3g S 0 5845 1 0x00000000
Call trace:
[<ffffff8008085e78>] __switch_to+0xb8/0xd4
[<ffffff8008a0cac4>] __schedule+0x458/0x714
[<ffffff8008a0ce0c>] schedule+0x8c/0xa8
[<ffffff800833865c>] request_wait_answer+0x74/0x220
[<ffffff8008339f70>] __fuse_request_send+0x8c/0xa0
[<ffffff8008339fe4>] fuse_request_send+0x60/0x6c
[<ffffff800833bdb0>] fuse_simple_request+0x128/0x16c
[<ffffff800833dddc>] fuse_lookup_name+0x104/0x1b0
[<ffffff800833dee4>] fuse_lookup+0x5c/0x11c
[<ffffff800821861c>] lookup_slow+0xfc/0x174
[<ffffff800821b474>] walk_component+0xf0/0x290
[<ffffff800821bbac>] path_lookupat+0xa0/0x128
[<ffffff800821c7f4>] filename_lookup+0x84/0x124
[<ffffff800821c8d8>] kern_path+0x44/0x54
[<ffffff800833b0c8>] fuse_dev_do_write+0x828/0xa0c
[<ffffff800833b610>] fuse_dev_write+0x90/0xb4
[<ffffff800820b770>] do_iter_readv_writev+0xf4/0x13c
[<ffffff800820cc88>] do_readv_writev+0xec/0x220
[<ffffff800820d05c>] vfs_writev+0x60/0x74
[<ffffff800820d0ec>] do_writev+0x7c/0x100
[<ffffff800820e348>] SyS_writev+0x38/0x48
[<ffffff8008083540>] el0_svc_naked+0x34/0x38
Fix by ensuring that the page allocated for the canonical path is zeroed.
Bug: 194856119
Bug: 196051870
Fixes: 24ab59f6bb42 ("ANDROID: fuse: Add support for d_canonical_path")
Signed-off-by: Biao Li <libiao@allwinnertech.com>
Signed-off-by: Shuosheng Huang <huangshuosheng@allwinnertech.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I400815dc1049d90c308f5cf87ce60de97ff82131
Use invalidate_lock instead of fuse's private i_mmap_sem. The intended
purpose is exactly the same. By this conversion we fix a long standing
race between hole punching and read(2) / readahead(2) paths that can
lead to stale page cache contents.
CC: Miklos Szeredi <miklos@szeredi.hu>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Server responds to LOOKUP and other ops (READDIRPLUS/CREATE/MKNOD/...)
with ourarg containing nodeid and generation.
If a fuse inode is found in inode cache with the same nodeid but different
generation, the existing fuse inode should be unhashed and marked "bad" and
a new inode with the new generation should be hashed instead.
This can happen, for example, with passhrough fuse filesystem that returns
the real filesystem ino/generation on lookup and where real inode numbers
can get recycled due to real files being unlinked not via the fuse
passthrough filesystem.
With current code, this situation will not be detected and an old fuse
dentry that used to point to an older generation real inode, can be used to
access a completely new inode, which should be accessed only via the new
dentry.
Note that because the FORGET message carries the nodeid w/o generation, the
server should wait to get FORGET counts for the nlookup counts of the old
and reused inodes combined, before it can free the resources associated to
that nodeid.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
fc_mount() already handles the vfs_get_tree(), sb->s_umount
unlocking and vfs_create_mount() sequence. Using it greatly
simplifies fuse_dentry_automount().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
We recently fixed an infinite loop by setting the SB_BORN flag on
submounts along with the write barrier needed by super_cache_count().
This is the job of vfs_get_tree() and FUSE shouldn't have to care
about the barrier at all.
Split out some code from fuse_dentry_automount() to the dedicated
fuse_get_tree_submount() handler for submounts and call vfs_get_tree().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
We don't set the SB_BORN flag on submounts. This is wrong as these
superblocks are then considered as partially constructed or dying
in the rest of the code and can break some assumptions.
One such case is when you have a virtiofs filesystem with submounts
and you try to mount it again : virtio_fs_get_tree() tries to obtain
a superblock with sget_fc(). The logic in sget_fc() is to loop until
it has either found an existing matching superblock with SB_BORN set
or to create a brand new one. It is assumed that a superblock without
SB_BORN is transient and the loop is restarted. Forgetting to set
SB_BORN on submounts hence causes sget_fc() to retry forever.
Setting SB_BORN requires special care, i.e. a write barrier for
super_cache_count() which can check SB_BORN without taking any lock.
We should call vfs_get_tree() to deal with that but this requires
to have a proper ->get_tree() implementation for submounts, which
is a bigger piece of work. Go for a simple bug fix in the meatime.
Fixes: bf109c6404 ("fuse: implement crossmounts")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
As soon as fuse_dentry_automount() does up_write(&sb->s_umount), the
superblock can theoretically be killed. If this happens before the
submount was added to the &fc->mounts list, fuse_mount_remove() later
crashes in list_del_init() because it assumes the submount to be
already there.
Add the submount before dropping sb->s_umount to fix the inconsistency.
It is okay to nest fc->killsb under sb->s_umount, we already do this
on the ->kill_sb() path.
Signed-off-by: Greg Kurz <groug@kaod.org>
Fixes: bf109c6404 ("fuse: implement crossmounts")
Cc: stable@vger.kernel.org # v5.10+
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If fuse_fill_super_submount() returns an error, the error path
triggers a crash:
[ 26.206673] BUG: kernel NULL pointer dereference, address: 0000000000000000
[...]
[ 26.226362] RIP: 0010:__list_del_entry_valid+0x25/0x90
[...]
[ 26.247938] Call Trace:
[ 26.248300] fuse_mount_remove+0x2c/0x70 [fuse]
[ 26.248892] virtio_kill_sb+0x22/0x160 [virtiofs]
[ 26.249487] deactivate_locked_super+0x36/0xa0
[ 26.250077] fuse_dentry_automount+0x178/0x1a0 [fuse]
The crash happens because fuse_mount_remove() assumes that the FUSE
mount was already added to list under the FUSE connection, but this
only done after fuse_fill_super_submount() has returned success.
This means that until fuse_fill_super_submount() has returned success,
the FUSE mount isn't actually owned by the superblock. We should thus
reclaim ownership by clearing sb->s_fs_info, which will skip the call
to fuse_mount_remove(), and perform rollback, like virtio_fs_get_tree()
already does for the root sb.
Fixes: bf109c6404 ("fuse: implement crossmounts")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Pull fileattr conversion updates from Miklos Szeredi via Al Viro:
"This splits the handling of FS_IOC_[GS]ETFLAGS from ->ioctl() into a
separate method.
The interface is reasonably uniform across the filesystems that
support it and gives nice boilerplate removal"
* 'miklos.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (23 commits)
ovl: remove unneeded ioctls
fuse: convert to fileattr
fuse: add internal open/release helpers
fuse: unsigned open flags
fuse: move ioctl to separate source file
vfs: remove unused ioctl helpers
ubifs: convert to fileattr
reiserfs: convert to fileattr
ocfs2: convert to fileattr
nilfs2: convert to fileattr
jfs: convert to fileattr
hfsplus: convert to fileattr
efivars: convert to fileattr
xfs: convert to fileattr
orangefs: convert to fileattr
gfs2: convert to fileattr
f2fs: convert to fileattr
ext4: convert to fileattr
ext2: convert to fileattr
btrfs: convert to fileattr
...
Since fuse just passes ioctl args through to/from server, converting to the
fileattr API is more involved, than most other filesystems.
Both .fileattr_set() and .fileattr_get() need to obtain an open file to
operate on. The simplest way is with the following sequence:
FUSE_OPEN
FUSE_IOCTL
FUSE_RELEASE
If this turns out to be a performance problem, it could be optimized for
the case when there's already a file (any file) open for the inode.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
inode_wrong_type(inode, mode) returns true if setting inode->i_mode
to given value would've changed the inode type. We have enough of
those checks open-coded to make a helper worthwhile.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Steps on the way to 5.12-rc1.
Resolves conflicts in:
fs/overlayfs/inode.c
Note, incfs is broken here, will mark it as BROKEN in another patch to
make it more obvious.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I95a5938fc3cfa27d521c5d4fd18b2adfb13a6d84
Expose the FUSE_PASSTHROUGH interface to user space and declare all the
basic data structures and functions as the skeleton on top of which the
FUSE passthrough functionality will be built.
As part of this, introduce the new FUSE passthrough ioctl, which allows
the FUSE daemon to specify a direct connection between a FUSE file and a
lower file system file. Such ioctl requires user space to pass the file
descriptor of one of its opened files through the fuse_passthrough_out
data structure introduced in this patch. This structure includes extra
fields for possible future extensions.
Also, add the passthrough functions for the set-up and tear-down of the
data structures and locks that will be used both when fuse_conns and
fuse_files are created/deleted.
Bug: 168023149
Link: https://lore.kernel.org/lkml/20210125153057.3623715-4-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Change-Id: I732532581348adadda5b5048a9346c2b0868d539
Signed-off-by: Alessio Balsini <balsini@google.com>
Extend some inode methods with an additional user namespace argument. A
filesystem that is aware of idmapped mounts will receive the user
namespace the mount has been marked with. This can be used for
additional permission checking and also to enable filesystems to
translate between uids and gids if they need to. We have implemented all
relevant helpers in earlier patches.
As requested we simply extend the exisiting inode method instead of
introducing new ones. This is a little more code churn but it's mostly
mechanical and doesnt't leave us with additional inode methods.
Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
When file attributes are changed most filesystems rely on the
setattr_prepare(), setattr_copy(), and notify_change() helpers for
initialization and permission checking. Let them handle idmapped mounts.
If the inode is accessed through an idmapped mount map it into the
mount's user namespace. Afterwards the checks are identical to
non-idmapped mounts. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.
Helpers that perform checks on the ia_uid and ia_gid fields in struct
iattr assume that ia_uid and ia_gid are intended values and have already
been mapped correctly at the userspace-kernelspace boundary as we
already do today. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
The two helpers inode_permission() and generic_permission() are used by
the vfs to perform basic permission checking by verifying that the
caller is privileged over an inode. In order to handle idmapped mounts
we extend the two helpers with an additional user namespace argument.
On idmapped mounts the two helpers will make sure to map the inode
according to the mount's user namespace and then peform identical
permission checks to inode_permission() and generic_permission(). If the
initial user namespace is passed nothing changes so non-idmapped mounts
will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Steps on the way to 5.11-rc1
Resolves conflicts in:
fs/fuse/inode.c
include/uapi/linux/fuse.h
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1e09e7edd8b8d325db685c58fd9c8e52130b7c59
Jan Kara's analysis of the syzbot report (edited):
The reproducer opens a directory on FUSE filesystem, it then attaches
dnotify mark to the open directory. After that a fuse_do_getattr() call
finds that attributes returned by the server are inconsistent, and calls
make_bad_inode() which, among other things does:
inode->i_mode = S_IFREG;
This then confuses dnotify which doesn't tear down its structures
properly and eventually crashes.
Avoid calling make_bad_inode() on a live inode: switch to a private flag on
the fuse inode. Also add the test to ops which the bad_inode_ops would
have caught.
This bug goes back to the initial merge of fuse in 2.6.14...
Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
With FUSE_HANDLE_KILLPRIV_V2 support, server will need to kill suid/sgid/
security.capability on open(O_TRUNC), if server supports
FUSE_ATOMIC_O_TRUNC.
But server needs to kill suid/sgid only if caller does not have CAP_FSETID.
Given server does not have this information, client needs to send this info
to server.
So add a flag FUSE_OPEN_KILL_SUIDGID to fuse_open_in request which tells
server to kill suid/sgid (only if group execute is set).
This flag is added to the FUSE_OPEN request, as well as the FUSE_CREATE
request if the create was non-exclusive, since that might result in an
existing file being opened/truncated.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If client does a write() on a suid/sgid file, VFS will first call
fuse_setattr() with ATTR_KILL_S[UG]ID set. This requires sending setattr
to file server with ATTR_MODE set to kill suid/sgid. But to do that client
needs to know latest mode otherwise it is racy.
To reduce the race window, current code first call fuse_do_getattr() to get
latest ->i_mode and then resets suid/sgid bits and sends rest to server
with setattr(ATTR_MODE). This does not reduce the race completely but
narrows race window significantly.
With fc->handle_killpriv_v2 enabled, it should be possible to remove this
race completely. Do not kill suid/sgid with ATTR_MODE at all. It will be
killed by server when WRITE request is sent to server soon. This is
similar to fc->handle_killpriv logic. V2 is just more refined version of
protocol. Hence this patch does not send ATTR_MODE to kill suid/sgid if
fc->handle_killpriv_v2 is enabled.
This creates an issue if fc->writeback_cache is enabled. In that case
WRITE can be cached in guest and server might not see WRITE request and
hence will not kill suid/sgid. Miklos suggested that in such cases, we
should fallback to a writethrough WRITE instead and that will generate
WRITE request and kill suid/sgid. This patch implements that too.
But this relies on client seeing the suid/sgid set. If another client sets
suid/sgid and this client does not see it immideately, then we will not
fallback to writethrough WRITE. So this is one limitation with both
fc->handle_killpriv_v2 and fc->writeback_cache enabled. Both the options
are not fully compatible. But might be good enough for many use cases.
Note: This patch is not checking whether security.capability is set or not
when falling back to writethrough path. If suid/sgid is not set and
only security.capability is set, that will be taken care of by
file_remove_privs() call in ->writeback_cache path.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If fc->handle_killpriv_v2 is enabled, we expect file server to clear
suid/sgid/security.capbility upon chown/truncate/write as appropriate.
Upon truncate (ATTR_SIZE), suid/sgid are cleared only if caller does not
have CAP_FSETID. File server does not know whether caller has CAP_FSETID
or not. Hence set FATTR_KILL_SUIDGID upon truncate to let file server know
that caller does not have CAP_FSETID and it should kill suid/sgid as
appropriate.
On chown (ATTR_UID/ATTR_GID) suid/sgid need to be cleared irrespective of
capabilities of calling process, so set FATTR_KILL_SUIDGID unconditionally
in that case.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Failure to do so may result in EEXIST even if the file only exists in the
cache and not in the filesystem.
The atomic nature of O_EXCL mandates that the cached state should be
ignored and existence verified anew.
Reported-by: Ken Schalk <kschalk@nvidia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fuse mount now only ever has a refcount of one (before being freed) so the
count field is unnecessary.
Remove the refcounting and fold fuse_mount_put() into callers. The only
caller of fuse_mount_put() where fm->fc was NULL is fuse_dentry_automount()
and here the fuse_conn_put() can simply be omitted.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Allows FUSE to report to inotify that it is acting as a layered filesystem.
The userspace component returns a string representing the location of the
underlying file. If the string cannot be resolved into a path, the top
level path is returned instead.
Bug: 23904372
Bug: 171780975
Test: FileObserverTest and FileObserverTestLegacyPath on cuttlefish
Change-Id: Iabdca0bbedfbff59e9c820c58636a68ef9683d9f
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Expose the FUSE_PASSTHROUGH interface to userspace and declare all the
basic data structures and functions as the skeleton on top of which the
FUSE passthrough functionality will be built.
As part of this, introduce the new FUSE passthrough ioctl(), which
allows
the FUSE daemon to specify a direct connection between a FUSE file and a
lower file system file. Such ioctl() requires userspace to pass the file
descriptor of one of its opened files through the fuse_passthrough_out
data
structure introduced in this patch. This structure includes extra fields
for possible future extensions.
Also, add the passthrough functions for the set-up and tear-down of the
data structures and locks that will be used both when fuse_conns and
fuse_files are created/deleted.
Bug: 168023149
Link: https://lore.kernel.org/lkml/20201026125016.1905945-2-balsini@android.com/
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: I6dd150b93607e10ed53f7e7975b35b6090080fa2
FUSE servers can indicate crossmount points by setting FUSE_ATTR_SUBMOUNT
in fuse_attr.flags. The inode will then be marked as S_AUTOMOUNT, and the
.d_automount implementation creates a new submount at that location, so
that the submount gets a distinct st_dev value.
Note that all submounts get a distinct superblock and a distinct st_dev
value, so for virtio-fs, even if the same filesystem is mounted more than
once on the host, none of its mount points will have the same st_dev. We
need distinct superblocks because the superblock points to the root node,
but the different host mounts may show different trees (e.g. due to
submounts in some of them, but not in others).
Right now, this behavior is only enabled when fuse_conn.auto_submounts is
set, which is the case only for virtio-fs.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
We want to allow submounts for the same fuse_conn, but with different
superblocks so that each of the submounts has its own device ID. To do
so, we need to split all mount-specific information off of fuse_conn
into a new fuse_mount structure, so that multiple mounts can share a
single fuse_conn.
We need to take care only to perform connection-level actions once (i.e.
when the fuse_conn and thus the first fuse_mount are established, or
when the last fuse_mount and thus the fuse_conn are destroyed). For
example, fuse_sb_destroy() must invoke fuse_send_destroy() until the
last superblock is released.
To do so, we keep track of which fuse_mount is the root mount and
perform all fuse_conn-level actions only when this fuse_mount is
involved.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Currently in fuse we don't seem have any lock which can serialize fault
path with truncate/punch_hole path. With dax support I need one for
following reasons.
1. Dax requirement
DAX fault code relies on inode size being stable for the duration of
fault and want to serialize with truncate/punch_hole and they explicitly
mention it.
static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
const struct iomap_ops *ops)
/*
* Check whether offset isn't beyond end of file now. Caller is
* supposed to hold locks serializing us with truncate / punch hole so
* this is a reliable test.
*/
max_pgoff = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
2. Make sure there are no users of pages being truncated/punch_hole
get_user_pages() might take references to page and then do some DMA
to said pages. Filesystem might truncate those pages without knowing
that a DMA is in progress or some I/O is in progress. So use
dax_layout_busy_page() to make sure there are no such references
and I/O is not in progress on said pages before moving ahead with
truncation.
3. Limitation of kvm page fault error reporting
If we are truncating file on host first and then removing mappings in
guest lateter (truncate page cache etc), then this could lead to a
problem with KVM. Say a mapping is in place in guest and truncation
happens on host. Now if guest accesses that mapping, then host will
take a fault and kvm will either exit to qemu or spin infinitely.
IOW, before we do truncation on host, we need to make sure that guest
inode does not have any mapping in that region or whole file.
4. virtiofs memory range reclaim
Soon I will introduce the notion of being able to reclaim dax memory
ranges from a fuse dax inode. There also I need to make sure that
no I/O or fault is going on in the reclaimed range and nobody is using
it so that range can be reclaimed without issues.
Currently if we take inode lock, that serializes read/write. But it does
not do anything for faults. So I add another semaphore fuse_inode->i_mmap_sem
for this purpose. It can be used to serialize with faults.
As of now, I am adding taking this semaphore only in dax fault path and
not regular fault path because existing code does not have one. May
be existing code can benefit from it as well to take care of some
races, but that we can fix later if need be. For now, I am just focussing
only on DAX path which is new path.
Also added logic to take fuse_inode->i_mmap_sem in
truncate/punch_hole/open(O_TRUNC) path to make sure file truncation and
fuse dax fault are mutually exlusive and avoid all the above problems.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fuse mounts without "allow_other" are off-limits to all non-owners. Yet it
makes sense to allow querying st_dev on the root, since this value is
provided by the kernel, not the userspace filesystem.
Allow statx(2) with a zero request mask to succeed on a fuse mounts for all
users.
Reported-by: Nikolaus Rath <Nikolaus@rath.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Allow fuse to pass RENAME_WHITEOUT to fuse server. Overlayfs on top of
virtiofs uses RENAME_WHITEOUT.
Without this patch renaming a directory in overlayfs (dir is on lower)
fails with -EINVAL. With this patch it works.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
When adding a new hard link, make sure that i_nlink doesn't overflow.
Fixes: ac45d61357 ("fuse: fix nlink after unlink")
Cc: <stable@vger.kernel.org> # v3.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If a filesystem returns negative inode sizes, future reads on the file were
causing the cpu to spin on truncate_pagecache.
Create a helper to validate the attributes. This now does two things:
- check the file mode
- check if the file size fits in i_size without overflowing
Reported-by: Arijit Banerjee <arijit@rubrik.com>
Fixes: d8a5ba4545 ("[PATCH] FUSE - core")
Cc: <stable@vger.kernel.org> # v2.6.14
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If writeback cache is enabled, then writes might get reordered with
chmod/chown/utimes. The problem with this is that performing the write in
the fuse daemon might itself change some of these attributes. In such case
the following sequence of operations will result in file ending up with the
wrong mode, for example:
int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL);
write (fd, "1", 1);
fchown (fd, 0, 0);
fchmod (fd, 04755);
close (fd);
This patch fixes this by flushing pending writes before performing
chown/chmod/utimes.
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: 4d99ff8f12 ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If the FUSE_READDIRPLUS_AUTO feature is enabled, then lookups on a
directory before/during readdir are used as an indication that READDIRPLUS
should be used instead of READDIR. However if the lookup turns out to be
negative, then selecting READDIRPLUS makes no sense.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
account per-file, dentry, and inode data
blockdev/superblock and temporary per-request data was left alone, as
this usually isn't accounted
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Implements the optimization noted in commit f75fdf22b0 ("fuse: don't
use ->d_time"), as the additional memory can be significant. (In
particular, on SLAB configurations this 8-byte alloc becomes 32 bytes).
Per-dentry, this can consume significant memory.
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>