223 Commits

Author SHA1 Message Date
Ming Lei
b3fea966d1 block: move bio_alloc_pages() to bcache
bcache is the only user of bio_alloc_pages(), so move this function into
bcache, and avoid it being misused in the future.

Also rename it to bch_bio_alloc_pages() since it is bcache only.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-13 06:30:58 +00:00
Ming Lei
0d7bc789e3 block: introduce bio_for_each_bvec() and rq_for_each_bvec()
bio_for_each_bvec() is used for iterating over multi-page bvec for bio
split & merge code.

rq_for_each_bvec() can be used by drivers which can handle the
multi-page bvec directly; so far the loop driver is one perfect use case.
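
To illustrate the difference between the two iteration granularities, here is
a minimal userspace sketch. It is not the kernel implementation: struct
model_bvec, MODEL_PAGE_SIZE and both model_count_*() helpers are hypothetical
names invented for this example, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this model */

/* Hypothetical stand-in for a bvec: one contiguous byte range. */
struct model_bvec {
	size_t offset; /* byte offset of the range within its first page */
	size_t len;    /* length in bytes; may cover several pages */
};

/* Visit each range as a whole, the way a multi-page iterator would. */
static size_t model_count_bvecs(const struct model_bvec *vecs, size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++)
		if (vecs[i].len > 0)
			total++;
	return total;
}

/* Visit each range one page-sized piece at a time. */
static size_t model_count_page_segments(const struct model_bvec *vecs,
					size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++) {
		size_t off = vecs[i].offset % MODEL_PAGE_SIZE;
		size_t remaining = vecs[i].len;

		while (remaining > 0) {
			/* Bytes left in the current page. */
			size_t chunk = MODEL_PAGE_SIZE - off;

			if (chunk > remaining)
				chunk = remaining;
			remaining -= chunk;
			off = 0; /* later pieces start on a page boundary */
			total++;
		}
	}
	return total;
}
```

In this model, a driver that can take a whole range at once does fewer
iterations than one that has to split every range at page boundaries.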

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-13 06:30:56 +00:00
Christoph Hellwig
783da8c4e8 block: merge BIOVEC_SEG_BOUNDARY into biovec_phys_mergeable
These two checks should always be performed together, so merge them into
a single helper.
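
As a rough illustration of why the two checks belong together, the following
userspace sketch combines a contiguity test with a segment boundary test. It
is a model under stated assumptions, not the kernel helper: struct model_range
and model_phys_mergeable() are hypothetical names, addresses are plain
integers, and both ranges are assumed to have a non-zero length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a bvec resolved to a physical address. */
struct model_range {
	uint64_t addr; /* start address of the range */
	uint32_t len;  /* length in bytes, assumed > 0 */
};

/*
 * Return true only if b directly follows a in memory AND the merged range
 * stays inside one boundary window (boundary_mask is e.g. 0xFFFF for 64K).
 */
static bool model_phys_mergeable(uint64_t boundary_mask,
				 const struct model_range *a,
				 const struct model_range *b)
{
	/* Check 1: the two ranges must be physically contiguous. */
	if (a->addr + a->len != b->addr)
		return false;
	/* Check 2: first and last byte must share a boundary window. */
	if ((a->addr | boundary_mask) !=
	    ((b->addr + b->len - 1) | boundary_mask))
		return false;
	return true;
}
```

A caller that ran only the first check could merge two ranges that straddle a
boundary, which is the kind of mistake a single helper makes harder.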

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-13 06:30:53 +00:00
Christoph Hellwig
975b9dc809 block: simplify BIOVEC_PHYS_MERGEABLE
Turn the macro into an inline, move it to blk.h and simplify the
arch hooks a bit.

Also rename the function to biovec_phys_mergeable as there is no need
to shout.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-13 06:30:53 +00:00
Michael Callahan
afc7f3759f block: Add and use op_stat_group() for indexing disk_stat fields.
Add and use a new op_stat_group() function for indexing partition stat
fields rather than indexing them by rq_data_dir() or bio_data_dir().
This function works similarly to op_is_sync() in that it takes the
request::cmd_flags or bio::bi_opf flags and determines which stats
should get updated.

In addition, the second parameter to generic_start_io_acct() and
generic_end_io_acct() is now a REQ_OP rather than simply a read or
write bit and it uses op_stat_group() on the parameter to determine
the stat group.

Note that the partition in_flight counts are not part of the per-cpu
statistics and as such are not indexed via this function.  They are now
indexed by op_is_write().
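
The indexing scheme described above might look roughly like the userspace
sketch below. All model_* names and struct model_disk_stats are hypothetical;
the only things taken from the kernel are the convention that the read op is 0
and the write op is 1, and that op_is_write() tests bit 0. A two-group
read/write split is an assumption of this model.

```c
#include <assert.h>

/* Model of the op encoding: bit 0 set means data goes to the device. */
enum model_req_op { MODEL_REQ_OP_READ = 0, MODEL_REQ_OP_WRITE = 1 };

enum model_stat_group {
	MODEL_STAT_READ = 0,
	MODEL_STAT_WRITE = 1,
	MODEL_NR_STAT_GROUPS
};

/* Hypothetical, simplified statistics block. */
struct model_disk_stats {
	unsigned long ios[MODEL_NR_STAT_GROUPS];
	unsigned long sectors[MODEL_NR_STAT_GROUPS];
	unsigned long in_flight[2]; /* indexed by is-write, not by group */
};

static int model_op_is_write(unsigned int op)
{
	return op & 1;
}

/* Map an op to the stat group that should be updated. */
static int model_op_stat_group(unsigned int op)
{
	return model_op_is_write(op) ? MODEL_STAT_WRITE : MODEL_STAT_READ;
}

/* Account the start of one I/O using the op rather than a direction bit. */
static void model_start_io_acct(struct model_disk_stats *s, unsigned int op,
				unsigned long sectors)
{
	int sgrp = model_op_stat_group(op);

	s->ios[sgrp]++;
	s->sectors[sgrp] += sectors;
	s->in_flight[model_op_is_write(op)]++;
}
```

Keeping the mapping in one function means a later change to the grouping only
has to touch model_op_stat_group() in this sketch, not every caller.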

tj: Refreshed on top of v4.17.  Updated to pass around REQ_OP.

Signed-off-by: Michael Callahan <michaelcallahan@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Matias Bjorling <mb@lightnvm.io>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
2022-04-03 15:41:24 +00:00
Neeraj Soni
bf54c9e71d Integrate the new file encryption framework
These changes integrate the new file encryption framework to use the new V2 encryption policies.

These changes were earlier reverted in 'commit 4211691d29 ("Reverting crypto and incrementalfs changes")',
as part of android-4.14.171 merge from Android common kernel. This patch attempts to bring them back
post validation.

commit a9a5450 ANDROID: dm: prevent default-key from being enabled without needed hooks
commit e1a94e6 ANDROID: dm: add dm-default-key target for metadata encryption
commit 232fd35 ANDROID: dm: enable may_passthrough_inline_crypto on some targets
commit 53bc059 ANDROID: dm: add support for passing through inline crypto support
commit aeed6db ANDROID: block: Introduce passthrough keyslot manager
commit 4f27c8b ANDROID: ext4, f2fs: enable direct I/O with inline encryption
commit c91db46 BACKPORT: FROMLIST: scsi: ufs: add program_key() variant op
commit f9a8e4a ANDROID: block: export symbols needed for modules to use inline crypto
commit 75fea5f ANDROID: block: fix some inline crypto bugs
commit 2871f73 ANDROID: fscrypt: add support for hardware-wrapped keys
commit bb5a657 ANDROID: block: add KSM op to derive software secret from wrapped key
commit d42ba87 ANDROID: block: provide key size as input to inline crypto APIs
commit 86646eb ANDROID: ufshcd-crypto: export cap find API
commit 83bc20e ANDROID: scsi: ufs-qcom: Enable BROKEN_CRYPTO quirk flag
commit c266a13 ANDROID: scsi: ufs: Add quirk bit for controllers that don't play well with inline crypto
commit ea09b99 ANDROID: cuttlefish_defconfig: Enable blk-crypto fallback
commit e12563c BACKPORT: FROMLIST: Update Inline Encryption from v5 to v6 of patch series
commit 8e8f55d ANDROID: scsi: ufs: UFS init should not require inline crypto
commit dae9899 ANDROID: scsi: ufs: UFS crypto variant operations API
commit a69516d ANDROID: cuttlefish_defconfig: enable inline encryption
commit b8f7b23 BACKPORT: FROMLIST: ext4: add inline encryption support
commit e64327f BACKPORT: FROMLIST: f2fs: add inline encryption support
commit a0dc8da BACKPORT: FROMLIST: fscrypt: add inline encryption support
commit 19c3c62 BACKPORT: FROMLIST: scsi: ufs: Add inline encryption support to UFS
commit f858a99 BACKPORT: FROMLIST: scsi: ufs: UFS crypto API
commit 011b834 BACKPORT: FROMLIST: scsi: ufs: UFS driver v2.1 spec crypto additions
commit ec0b569 BACKPORT: FROMLIST: block: blk-crypto for Inline Encryption
commit 760b328 ANDROID: block: Fix bio_crypt_should_process WARN_ON
commit 138adbb BACKPORT: FROMLIST: block: Add encryption context to struct bio
commit 66b5609 BACKPORT: FROMLIST: block: Keyslot Manager for Inline Encryption

Git-repo: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.14-stable
Git-commit: a9a545067a
Git-commit: e1a94e6b17
Git-commit: 232fd353e4
Git-commit: 53bc059bc6
Git-commit: aeed6db424
Git-commit: 4f27c8b90b
Git-commit: c91db466b5
Git-commit: f9a8e4a5c5
Git-commit: 75fea5f605
Git-commit: 2871f73194
Git-commit: bb5a65771a
Git-commit: d42ba87e29
Git-commit: 86646ebb17
Git-commit: 83bc20ed4b
Git-commit: c266a1311e
Git-commit: ea09b9954c
Git-commit: e12563c18d
Git-commit: 8e8f55d1a7
Git-commit: dae9899044
Git-commit: a69516d091
Git-commit: b8f7b23674
Git-commit: e64327f571
Git-commit: a0dc8da519
Git-commit: 19c3c62836
Git-commit: f858a9981a
Git-commit: 011b8344c3
Git-commit: ec0b569b5c
Git-commit: 760b3283e8
Git-commit: 138adbbe5e
Git-commit: 66b5609826

Change-Id: I171d90de41185824e0c7515f3a3b43ab88f4e058
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
2020-08-18 04:58:02 -07:00
Neeraj Soni
1924eafba6 Remove Per File Key based hardware crypto framework
Remove the Per File Key logic based inline crypto support
for file encryption framework.

Change-Id: I90071562ba5c41b9db470363edac35c9fe5e4efa
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
2020-08-18 04:50:20 -07:00
Blagovest Kolenichev
dbc4aced9e Merge android-4.14.132 (0dcd8eb) into msm-4.14
* refs/heads/tmp-0dcd8eb:
  Linux 4.14.132
  arm64: insn: Fix ldadd instruction encoding
  tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb
  futex: Update comments and docs about return values of arch futex code
  bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd
  arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg()
  bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err
  bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro
  bonding: Always enable vlan tx offload
  team: Always enable vlan tx offload
  tun: wake up waitqueues after IFF_UP is set
  tipc: check msg->req data len in tipc_nl_compat_bearer_disable
  tipc: change to use register_pernet_device
  sctp: change to hold sk after auth shkey is created successfully
  net: stmmac: fixed new system time seconds value calculation
  net: remove duplicate fetch in sock_getsockopt
  net/packet: fix memory leak in packet_set_ring()
  ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop
  af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
  eeprom: at24: fix unexpected timeout under high load
  cpu/speculation: Warn on unsupported mitigations= parameter
  NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
  x86/microcode: Fix the microcode load on CPU hotplug for real
  x86/speculation: Allow guests to use SSBD even if host does not
  scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()
  dm log writes: make sure super sector log updates are written in order
  mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
  fs/binfmt_flat.c: make load_flat_shared_library() work
  mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
  fs/proc/array.c: allow reporting eip/esp for all coredumping threads
  Revert "compiler.h: update definition of unreachable()"
  qmi_wwan: Fix out-of-bounds read
  net/9p: include trans_common.h to fix missing prototype warning.
  9p: p9dirent_read: check network-provided name length
  9p/rdma: remove useless check in cm_event_handler
  9p: acl: fix uninitialized iattr access
  9p/rdma: do not disconnect on down_interruptible EAGAIN
  9p/xen: fix check for xenbus_read error in front_probe
  block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs
  block: add a lower-level bio_add_page interface
  IB/hfi1: Close PSM sdma_progress sleep window
  Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP"
  perf header: Fix unchecked usage of strncpy()
  perf help: Remove needless use of strncpy()
  perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul

Change-Id: I253fc7ffebfad129b8c2165dd2d5aa5af221fd4b
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-07-23 11:01:07 -07:00
Blagovest Kolenichev
7e722ce705 Merge android-4.14.123 (acd501f) into msm-4.14
* refs/heads/tmp-acd501f:
  Revert "arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable"
  Linux 4.14.123
  NFS: Fix a double unlock from nfs_match,get_client
  vfio-ccw: Prevent quiesce function going into an infinite loop
  drm: Wake up next in drm_read() chain if we are forced to putback the event
  drm/drv: Hold ref on parent device during drm_device lifetime
  ASoC: davinci-mcasp: Fix clang warning without CONFIG_PM
  spi: Fix zero length xfer bug
  spi: rspi: Fix sequencer reset during initialization
  spi : spi-topcliff-pch: Fix to handle empty DMA buffers
  scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices
  media: saa7146: avoid high stack usage with clang
  scsi: lpfc: Fix fc4type information for FDMI
  scsi: lpfc: Fix FDMI manufacturer attribute value
  media: vimc: zero the media_device on probe
  media: go7007: avoid clang frame overflow warning with KASAN
  media: vimc: stream: fix thread state before sleep
  media: m88ds3103: serialize reset messages in m88ds3103_set_frontend
  thunderbolt: Fix to check for kmemdup failure
  hwrng: omap - Set default quality
  dmaengine: tegra210-adma: use devm_clk_*() helpers
  batman-adv: allow updating DAT entry timeouts on incoming ARP Replies
  scsi: qla4xxx: avoid freeing unallocated dma memory
  usb: core: Add PM runtime calls to usb_hcd_platform_shutdown
  rcuperf: Fix cleanup path for invalid perf_type strings
  rcutorture: Fix cleanup path for invalid torture_type strings
  x86/mce: Fix machine_check_poll() tests for error types
  tty: ipwireless: fix missing checks for ioremap
  virtio_console: initialize vtermno value for ports
  scsi: qedf: Add missing return in qedf_post_io_req() in the fcport offload check
  media: wl128x: prevent two potential buffer overflows
  media: video-mux: fix null pointer dereferences
  kobject: Don't trigger kobject_uevent(KOBJ_REMOVE) twice.
  spi: tegra114: reset controller on probe
  HID: logitech-hidpp: change low battery level threshold from 31 to 30 percent
  cxgb3/l2t: Fix undefined behaviour
  ASoC: fsl_utils: fix a leaked reference by adding missing of_node_put
  ASoC: eukrea-tlv320: fix a leaked reference by adding missing of_node_put
  HID: core: move Usage Page concatenation to Main item
  RDMA/hns: Fix bad endianess of port_pd variable
  chardev: add additional check for minor range overlap
  x86/ia32: Fix ia32_restore_sigcontext() AC leak
  x86/uaccess, signal: Fix AC=1 bloat
  x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP
  arm64: cpu_ops: fix a leaked reference by adding missing of_node_put
  scsi: ufs: Avoid configuring regulator with undefined voltage range
  scsi: ufs: Fix regulator load and icc-level configuration
  rtlwifi: fix potential NULL pointer dereference
  rtc: xgene: fix possible race condition
  brcmfmac: fix Oops when bringing up interface during USB disconnect
  brcmfmac: fix race during disconnect when USB completion is in progress
  brcmfmac: fix WARNING during USB disconnect in case of unempty psq
  brcmfmac: convert dev_init_lock mutex to completion
  b43: shut up clang -Wuninitialized variable warning
  brcmfmac: fix missing checks for kmemdup
  mwifiex: Fix mem leak in mwifiex_tm_cmd
  rtlwifi: fix a potential NULL pointer dereference
  iio: common: ssp_sensors: Initialize calculated_time in ssp_common_process_data
  iio: hmc5843: fix potential NULL pointer dereferences
  iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion
  x86/build: Keep local relocations with ld.lld
  block: sed-opal: fix IOC_OPAL_ENABLE_DISABLE_MBR
  cpufreq: kirkwood: fix possible object reference leak
  cpufreq: pmac32: fix possible object reference leak
  cpufreq/pasemi: fix possible object reference leak
  cpufreq: ppc_cbe: fix possible object reference leak
  s390: cio: fix cio_irb declaration
  x86/microcode: Fix the ancient deprecated microcode loading method
  s390: zcrypt: initialize variables before_use
  clk: rockchip: Make rkpwm a critical clock on rk3288
  extcon: arizona: Disable mic detect if running when driver is removed
  clk: rockchip: Fix video codec clocks on rk3288
  PM / core: Propagate dev->power.wakeup_path when no callbacks
  drm/amdgpu: fix old fence check in amdgpu_fence_emit
  mmc: sdhci-of-esdhc: add erratum eSDHC-A001 and A-008358 support
  mmc: sdhci-of-esdhc: add erratum A-009204 support
  mmc: sdhci-of-esdhc: add erratum eSDHC5 support
  mmc_spi: add a status check for spi_sync_locked
  mmc: core: make pwrseq_emmc (partially) support sleepy GPIO controllers
  scsi: libsas: Do discovery on empty PHY to update PHY info
  hwmon: (f71805f) Use request_muxed_region for Super-IO accesses
  hwmon: (pc87427) Use request_muxed_region for Super-IO accesses
  hwmon: (smsc47b397) Use request_muxed_region for Super-IO accesses
  hwmon: (smsc47m1) Use request_muxed_region for Super-IO accesses
  hwmon: (vt1211) Use request_muxed_region for Super-IO accesses
  RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure
  arm64: vdso: Fix clock_getres() for CLOCK_REALTIME
  i40e: don't allow changes to HW VLAN stripping on active port VLANs
  i40e: Able to add up to 16 MAC filters on an untrusted VF
  phy: sun4i-usb: Make sure to disable PHY0 passby for peripheral mode
  x86/irq/64: Limit IST stack overflow check to #DB stack
  USB: core: Don't unbind interfaces following device reset failure
  drm/msm: a5xx: fix possible object reference leak
  sched/core: Handle overflow in cpu_shares_write_u64
  sched/rt: Check integer overflow at usec to nsec conversion
  sched/core: Check quota and period overflow at usec to nsec conversion
  cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock
  random: add a spinlock_t to struct batched_entropy
  powerpc/64: Fix booting large kernels with STRICT_KERNEL_RWX
  powerpc/numa: improve control of topology updates
  media: pvrusb2: Prevent a buffer overflow
  media: au0828: Fix NULL pointer dereference in au0828_analog_stream_enable()
  media: stm32-dcmi: fix crash when subdev do not expose any formats
  audit: fix a memory leak bug
  media: ov2659: make S_FMT succeed even if requested format doesn't match
  media: au0828: stop video streaming only when last user stops
  media: ov6650: Move v4l2_clk_get() to ov6650_video_probe() helper
  media: coda: clear error return value before picture run
  dmaengine: at_xdmac: remove BUG_ON macro in tasklet
  clk: rockchip: undo several noc and special clocks as critical on rk3288
  pinctrl: samsung: fix leaked of_node references
  pinctrl: pistachio: fix leaked of_node references
  HID: logitech-hidpp: use RAP instead of FAP to get the protocol version
  mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
  x86/mm: Remove in_nmi() warning from 64-bit implementation of vmalloc_fault()
  smpboot: Place the __percpu annotation correctly
  x86/build: Move _etext to actual end of .text
  vfio-ccw: Release any channel program when releasing/removing vfio-ccw mdev
  vfio-ccw: Do not call flush_workqueue while holding the spinlock
  bcache: avoid clang -Wunintialized warning
  bcache: add failure check to run_cache_set() for journal replay
  bcache: fix failure in journal relplay
  bcache: return error immediately in bch_journal_replay()
  crypto: sun4i-ss - Fix invalid calculation of hash end
  net: cw1200: fix a NULL pointer dereference
  mwifiex: prevent an array overflow
  ASoC: fsl_sai: Update is_slave_mode with correct value
  libbpf: fix samples/bpf build failure due to undefined UINT32_MAX
  mac80211/cfg80211: update bss channel on channel switch
  dmaengine: pl330: _stop: clear interrupt status
  w1: fix the resume command API
  scsi: qedi: Abort ep termination if offload not scheduled
  rtc: 88pm860x: prevent use-after-free on device remove
  iwlwifi: pcie: don't crash on invalid RX interrupt
  btrfs: Don't panic when we can't find a root key
  btrfs: fix panic during relocation after ENOSPC before writeback happens
  Btrfs: fix data bytes_may_use underflow with fallocate due to failed quota reserve
  scsi: qla2xxx: Avoid that lockdep complains about unsafe locking in tcm_qla2xxx_close_session()
  scsi: qla2xxx: Fix abort handling in tcm_qla2xxx_write_pending()
  scsi: qla2xxx: Fix a qla24xx_enable_msix() error path
  sched/cpufreq: Fix kobject memleak
  arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable
  ARM: vdso: Remove dependency with the arch_timer driver internals
  ACPI / property: fix handling of data_nodes in acpi_get_next_subnode()
  brcm80211: potential NULL dereference in brcmf_cfg80211_vndr_cmds_dcmd_handler()
  spi: pxa2xx: fix SCR (divisor) calculation
  ASoC: imx: fix fiq dependencies
  powerpc/boot: Fix missing check of lseek() return value
  powerpc/perf: Return accordingly on invalid chip-id in
  ASoC: hdmi-codec: unlock the device on startup errors
  pinctrl: zte: fix leaked of_node references
  net: ena: gcc 8: fix compilation warning
  dmaengine: tegra210-dma: free dma controller in remove()
  tools/bpf: fix perf build error with uClibc (seen on ARC)
  mmc: core: Verify SD bus width
  gfs2: Fix occasional glock use-after-free
  IB/hfi1: Fix WQ_MEM_RECLAIM warning
  NFS: make nfs_match_client killable
  cxgb4: Fix error path in cxgb4_init_module
  gfs2: Fix lru_count going negative
  Revert "btrfs: Honour FITRIM range constraints during free space trim"
  net: erspan: fix use-after-free
  at76c50x-usb: Don't register led_trigger if usb_register_driver failed
  batman-adv: mcast: fix multicast tt/tvlv worker locking
  bpf: devmap: fix use-after-free Read in __dev_map_entry_free
  ssb: Fix possible NULL pointer dereference in ssb_host_pcmcia_exit
  media: vivid: use vfree() instead of kfree() for dev->bitmap_cap
  media: serial_ir: Fix use-after-free in serial_ir_init_module
  media: cpia2: Fix use-after-free in cpia2_exit
  fbdev: fix WARNING in __alloc_pages_nodemask bug
  btrfs: honor path->skip_locking in backref code
  brcmfmac: add subtype check for event handling in data path
  brcmfmac: assure SSID length from firmware is limited
  hugetlb: use same fault hash key for shared and private mappings
  fbdev: fix divide error in fb_var_to_videomode
  btrfs: sysfs: don't leak memory when failing add fsid
  btrfs: sysfs: Fix error path kobject memory leak
  Btrfs: fix race between ranged fsync and writeback of adjacent ranges
  Btrfs: avoid fallback to transaction commit during fsync of files with holes
  Btrfs: do not abort transaction at btrfs_update_root() after failure to COW path
  gfs2: Fix sign extension bug in gfs2_update_stats
  arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
  libnvdimm/namespace: Fix label tracking error
  libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY overhead
  kvm: svm/avic: fix off-by-one in checking host APIC ID
  mmc: sdhci-iproc: Set NO_HISPD bit to fix HS50 data hold time problem
  mmc: sdhci-iproc: cygnus: Set NO_HISPD bit to fix HS50 data hold time problem
  crypto: vmx - CTR: always increment IV as quadword
  Revert "scsi: sd: Keep disk read-only when re-reading partition"
  sbitmap: fix improper use of smp_mb__before_atomic()
  bio: fix improper use of smp_mb__before_atomic()
  KVM: x86: fix return value for reserved EFER
  f2fs: Fix use of number of devices
  ext4: do not delete unlinked inode from orphan list on failed truncate
  x86: Hide the int3_emulate_call/jmp functions from UML
  x86: Hide the int3_emulate_call/jmp functions from UML
  Linux 4.14.122
  fbdev: sm712fb: fix memory frequency by avoiding a switch/case fallthrough
  btrfs: Honour FITRIM range constraints during free space trim
  bpf, lru: avoid messing with eviction heuristics upon syscall lookup
  bpf: add map_lookup_elem_sys_only for lookups from syscall side
  driver core: Postpone DMA tear-down until after devres release for probe failure
  md/raid: raid5 preserve the writeback action after the parity check
  Revert "Don't jump to compute_result state from check_result state"
  perf bench numa: Add define for RUSAGE_THREAD if not present
  ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
  x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
  sched/cpufreq: Fix kobject memleak
  iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb()
  power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
  KVM: arm/arm64: Ensure vcpu target is unset on reset failure
  mac80211: Fix kernel panic due to use of txq after free
  apparmorfs: fix use-after-free on symlink traversal
  securityfs: fix use-after-free on symlink traversal
  power: supply: cpcap-battery: Fix division by zero
  xfrm4: Fix uninitialized memory read in _decode_session4
  esp4: add length check for UDP encapsulation
  vti4: ipip tunnel deregistration fixes.
  xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
  xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
  dm delay: fix a crash when invalid device is specified
  dm zoned: Fix zone report handling
  dm cache metadata: Fix loading discard bitset
  PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratum
  PCI: Factor out pcie_retrain_link() function
  PCI: Mark Atheros AR9462 to avoid bus reset
  PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken
  fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting
  fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled display
  fbdev: sm712fb: fix support for 1024x768-16 mode
  fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping VRAM
  fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGA
  fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3F
  fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75
  fbdev: sm712fb: fix brightness control on reboot, don't set SR30
  objtool: Allow AR to be overridden with HOSTAR
  perf intel-pt: Fix sample timestamp wrt non-taken branches
  perf intel-pt: Fix improved sample timestamp
  perf intel-pt: Fix instructions sampling rate
  memory: tegra: Fix integer overflow on tick value calculation
  tracing: Fix partial reading of trace event's id file
  ftrace/x86_64: Emulate call function while updating in breakpoint handler
  x86_64: Allow breakpoints to emulate call instructions
  x86_64: Add gap to int3 to allow for call emulation
  ceph: flush dirty inodes before proceeding with remount
  iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114
  fuse: honor RLIMIT_FSIZE in fuse_file_fallocate
  fuse: fix writepages on 32bit
  clk: rockchip: fix wrong clock definitions for rk3328
  clk: tegra: Fix PLLM programming on Tegra124+ when PMC overrides divider
  clk: hi3660: Mark clk_gate_ufs_subsys as critical
  PNFS fallback to MDS if no deviceid found
  NFS4: Fix v4.0 client state corruption when mount
  Revert "cifs: fix memory leak in SMB2_read"
  media: ov6650: Fix sensor possibly not detected on probe
  cifs: fix strcat buffer overflow and reduce raciness in smb21_set_oplock_level()
  of: fix clang -Wunsequenced for be32_to_cpu()
  p54: drop device reference count if fails to enable device
  intel_th: msu: Fix single mode with IOMMU
  md: add mddev->pers to avoid potential NULL pointer dereference
  stm class: Fix channel free in stm output free path
  parisc: Rename LEVEL to PA_ASM_LEVEL to avoid name clash with DRBD code
  parisc: Use PA_ASM_LEVEL in boot code
  parisc: Skip registering LED when running in QEMU
  parisc: Export running_on_qemu symbol for modules
  net: Always descend into dsa/
  vsock/virtio: Initialize core virtio vsock before registering the driver
  tipc: fix modprobe tipc failed after switch order of device registration
  vsock/virtio: free packets during the socket release
  tipc: switch order of device registration to fix a crash
  ppp: deflate: Fix possible crash in deflate_init
  net: usb: qmi_wwan: add Telit 0x1260 and 0x1261 compositions
  net: test nouarg before dereferencing zerocopy pointers
  net/mlx4_core: Change the error print to info print
  net: avoid weird emergency message
  f2fs: link f2fs quota ops for sysfile
  Enable CONFIG_ION_SYSTEM_HEAP
  BACKPORT: gcov: clang support
  UPSTREAM: gcov: docs: add a note on GCC vs Clang differences
  UPSTREAM: gcov: clang: move common GCC code into gcc_base.c
  UPSTREAM: module: add stubs for within_module functions
  UPSTREAM: gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT
  BACKPORT: kbuild: gcov: enable -fno-tree-loop-im if supported
  fs: sdcardfs: Add missing option to show_options

Conflicts:
	Makefile
	arch/arm64/include/asm/pgtable.h
	drivers/scsi/ufs/ufshcd.c

Change-Id: I0c79879b0989383949ff5a292a9923b668e4514f
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-07-23 11:00:08 -07:00
Christoph Hellwig
515e2f3e9f block: add a lower-level bio_add_page interface
[ Upstream commit 0aa69fd32a5f766e997ca8ab4723c5a1146efa8b ]

For the upcoming removal of buffer heads in XFS we need to keep track of
the number of outstanding writeback requests per page.  For this we need
to know if bio_add_page merged a region with the previous bvec or not.
Instead of adding additional arguments this refactors bio_add_page to
be implemented using three lower level helpers which users like XFS can
use directly if they care about the merge decisions.
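
The split between "try to merge" and "append" can be sketched in userspace as
follows. This is a simplified model, not the code added by this commit: every
model_* name is hypothetical, pages are represented by plain integer ids, and
the merge rule (same page, directly adjacent offset) is an assumption of the
model.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_VECS 4 /* arbitrary capacity for this model */

struct model_bvec {
	unsigned long page; /* page id, standing in for a struct page pointer */
	unsigned int offset;
	unsigned int len;
};

struct model_bio {
	struct model_bvec vecs[MODEL_MAX_VECS];
	unsigned int vcnt; /* number of vecs in use */
	unsigned int size; /* total bytes queued */
};

/* Lower-level step 1: extend the last vec if the new data continues it. */
static bool model_try_merge_page(struct model_bio *bio, unsigned long page,
				 unsigned int len, unsigned int off)
{
	if (bio->vcnt > 0) {
		struct model_bvec *bv = &bio->vecs[bio->vcnt - 1];

		if (bv->page == page && bv->offset + bv->len == off) {
			bv->len += len;
			bio->size += len;
			return true;
		}
	}
	return false;
}

/* Lower-level step 2: append a new vec; the caller checked for space. */
static void model_append_page(struct model_bio *bio, unsigned long page,
			      unsigned int len, unsigned int off)
{
	struct model_bvec *bv;

	assert(bio->vcnt < MODEL_MAX_VECS);
	bv = &bio->vecs[bio->vcnt++];
	bv->page = page;
	bv->offset = off;
	bv->len = len;
	bio->size += len;
}

/* High-level wrapper: returns bytes added, or 0 if the bio is full. */
static unsigned int model_add_page(struct model_bio *bio, unsigned long page,
				   unsigned int len, unsigned int off)
{
	if (!model_try_merge_page(bio, page, len, off)) {
		if (bio->vcnt >= MODEL_MAX_VECS)
			return 0;
		model_append_page(bio, page, len, off);
	}
	return len;
}
```

A caller that needs per-page accounting could call model_try_merge_page()
itself and bump its counter only when that returns false, which the wrapper
alone cannot tell it.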

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-03 13:15:58 +02:00
Andrea Parri
f0b882dd9e bio: fix improper use of smp_mb__before_atomic()
commit f381c6a4bd0ae0fde2d6340f1b9bb0f58d915de6 upstream.

This barrier only applies to the read-modify-write operations; in
particular, it does not apply to the atomic_set() primitive.

Replace the barrier with an smp_mb().

Fixes: dac56212e8 ("bio: skip atomic inc/dec of ->bi_cnt for most use cases")
Cc: stable@vger.kernel.org
Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-31 06:47:10 -07:00
Shivaprasad Hongal
367c46b11c Enable hardware based FBE on f2fs and adapt ext4 fs
Hardware File Based Encryption (FBE) uses inline crypto
engine to encrypt the user data.
1. security/pfk: changes to support per file
   encryption for f2fs using hardware crypto engine.
2. fs/ext4: adapted crypto APIs for generic crypto layer.
3. fs/f2fs: support hardware crypto engine based per file
   encryption.
4. fs/crypto: export APIs to support hardware crypto
   engine based per file encryption.
5. security/pfe: added wrapped key support based on
   upstream changes.
Other changes made to provide support framework for per
file encryption.

Reverting commit e02a4e21f6 ("ext4: Add HW File Based
Encryption on ext4 file system") and adding changes to
have FBE in sync with upstream implementation of FBE.

Change-Id: I17f9909c43ba744eb874f6d237745fbf88a2b848
Signed-off-by: Shivaprasad Hongal <shongal@codeaurora.org>
2018-08-22 10:56:07 -07:00
Jiufei Xue
3c84b5aaf7 block: display the correct diskname for bio
[ Upstream commit 9c0fb1e313aaf4e8edec22433c8b22dd308e466c ]

bio_devname() uses __bdevname() to display the device name, so it can
only show the major and minor numbers of part0.
Fix this by using disk_name() to display the correct name.

Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:09 +02:00
Shaohua Li
3ef1c33f98 block-throttle: avoid double charge
commit 111be883981748acc9a56e855c8336404a8e787c upstream.

If a bio is throttled and split after throttling, the bio could be
resubmitted and enter the throttling again. This will cause part of the
bio to be charged multiple times. If the cgroup has an IO limit, the
double charge will significantly harm the performance. Bio splits
became quite common after the arbitrary bio size change.

To fix this, we always set the BIO_THROTTLED flag if a bio is throttled.
If the bio is cloned/split, we copy the flag to the new bio too to avoid a
double charge. However, a cloned bio could be directed to a new disk, and
keeping the flag would be a problem. The observation is we always set a new
disk for the bio in this case, so we can clear the flag in bio_set_dev().
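
The flag handling described above can be modelled in userspace roughly as
below. This is a sketch under stated assumptions, not the kernel code: all
model_* names are hypothetical, a disk is just an integer id, and "charging"
is reduced to adding sectors to a counter.

```c
#include <assert.h>

#define MODEL_BIO_THROTTLED 0x1u /* hypothetical flag bit for this model */

struct model_bio {
	int disk;             /* id of the target disk */
	unsigned int flags;
	unsigned int sectors;
};

struct model_cgroup {
	unsigned long charged; /* sectors charged against the IO limit */
};

/* Charge the bio once; a bio that already carries the flag is skipped. */
static void model_throttle(struct model_cgroup *cg, struct model_bio *bio)
{
	if (bio->flags & MODEL_BIO_THROTTLED)
		return;
	cg->charged += bio->sectors;
	bio->flags |= MODEL_BIO_THROTTLED;
}

/* Split off the first 'sectors' of a bio; the child inherits the flags. */
static struct model_bio model_split(struct model_bio *bio,
				    unsigned int sectors)
{
	struct model_bio child = *bio;

	assert(sectors < bio->sectors);
	child.sectors = sectors;
	bio->sectors -= sectors;
	return child;
}

/* Redirect a bio; moving to a different disk clears the flag. */
static void model_set_dev(struct model_bio *bio, int disk)
{
	if (bio->disk != disk)
		bio->flags &= ~MODEL_BIO_THROTTLED;
	bio->disk = disk;
}
```

In this model a resubmitted split is not charged a second time, while a clone
redirected to another disk is charged again there.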

This issue has existed for a long time; the arbitrary bio size change just
makes it worse, so this should go into stable at least since v4.2.

V1-> V2: Not add extra field in bio based on discussion with Tejun

Cc: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-29 17:53:47 +01:00
Linus Torvalds
a0725ab0c7 Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
 "This is the first pull request for 4.14, containing most of the code
  changes. It's a quiet series this round, which I think we needed after
  the churn of the last few series. This contains:

   - Fix for a registration race in loop, from Anton Volkov.

   - Overflow complaint fix from Arnd for DAC960.

   - Series of drbd changes from the usual suspects.

   - Conversion of the stec/skd driver to blk-mq. From Bart.

   - A few BFQ improvements/fixes from Paolo.

   - CFQ improvement from Ritesh, allowing idling for group idle.

   - A few fixes found by Dan's smatch, courtesy of Dan.

   - A warning fixup for a race between changing the IO scheduler and
     device removal. From David Jeffery.

   - A few nbd fixes from Josef.

   - Support for cgroup info in blktrace, from Shaohua.

   - Also from Shaohua, new features in the null_blk driver to allow it
     to actually hold data, among other things.

   - Various corner cases and error handling fixes from Weiping Zhang.

   - Improvements to the IO stats tracking for blk-mq from me. Can
     drastically improve performance for fast devices and/or big
     machines.

   - Series from Christoph removing bi_bdev as being needed for IO
     submission, in preparation for nvme multipathing code.

   - Series from Bart, including various cleanups and fixes for switch
     fall through case complaints"

* 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits)
  kernfs: checking for IS_ERR() instead of NULL
  drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set
  drbd: Fix allyesconfig build, fix recent commit
  drbd: switch from kmalloc() to kmalloc_array()
  drbd: abort drbd_start_resync if there is no connection
  drbd: move global variables to drbd namespace and make some static
  drbd: rename "usermode_helper" to "drbd_usermode_helper"
  drbd: fix race between handshake and admin disconnect/down
  drbd: fix potential deadlock when trying to detach during handshake
  drbd: A single dot should be put into a sequence.
  drbd: fix rmmod cleanup, remove _all_ debugfs entries
  drbd: Use setup_timer() instead of init_timer() to simplify the code.
  drbd: fix potential get_ldev/put_ldev refcount imbalance during attach
  drbd: new disk-option disable-write-same
  drbd: Fix resource role for newly created resources in events2
  drbd: mark symbols static where possible
  drbd: Send P_NEG_ACK upon write error in protocol != C
  drbd: add explicit plugging when submitting batches
  drbd: change list_for_each_safe to while(list_first_entry_or_null)
  drbd: introduce drbd_recv_header_maybe_unplug
  ...
2017-09-07 11:59:42 -07:00
Huang Ying
225311a464 mm: test code to write THP to swap device as a whole
To support delaying the split of a THP (Transparent Huge Page) after it is
swapped out, we need to enhance the swap writing code to support writing a
THP as a whole.  This will improve swap write IO performance.

As Ming Lei <ming.lei@redhat.com> pointed out, this should be based on
multipage bvec support, which hasn't been merged yet.  So this patch is
only for testing the functionality of the other patches in the series,
and will be reimplemented after multipage bvec support is merged.

Link: http://lkml.kernel.org/r/20170724051840.2309-7-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ross Zwisler <ross.zwisler@intel.com> [for brd.c, zram_drv.c, pmem.c]
Cc: Shaohua Li <shli@kernel.org>
Cc: Vishal L Verma <vishal.l.verma@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:28 -07:00
Christoph Hellwig
74d46992e0 block: replace bi_bdev with a gendisk pointer and partitions index
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23 12:49:55 -06:00
Jens Axboe
d62e26b3ff block: pass in queue to inflight accounting
No functional change in this patch, just in preparation for
basing the inflight mechanism on the queue in question.

Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-09 13:09:16 -06:00
Christoph Hellwig
7c20f11680 bio-integrity: stop abusing bi_end_io
And instead call directly into the integrity code from bio_end_io.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 17:00:59 -06:00
Dmitry Monakhov
f9df1cd99e bio: add bvec_iter rewind API
Some ->bi_end_io handlers (for example, pi_verify or decrypt handlers)
need to know the original data vector, but after a bio traverses the I/O
stack it may have been advanced, split, and relocated many times, so it is
hard to guess the original iterator. Let's add a 'bi_done' counter which
accounts for the number of bytes the iterator was advanced during its
evolution. Later, an end_io handler may easily restore the original iterator
by rewinding the iterator by iter->bi_done.

Note: this change makes sizeof(struct bvec_iter) a multiple of 8.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
[hch: switched to true/false return]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:28 -06:00
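
The rewind idea in f9df1cd99e can be sketched in userspace as follows. This is an illustrative model only: `struct seg`, `struct iter_model`, `iter_advance()` and `iter_rewind()` are hypothetical names, and the `done` field plays the role that the commit message gives to 'bi_done'.

```c
#include <stdbool.h>
#include <stddef.h>

struct seg { size_t len; };          /* one segment of the data vector */

struct iter_model {
    size_t size;      /* bytes remaining */
    size_t idx;       /* current segment index */
    size_t seg_done;  /* bytes consumed in the current segment */
    size_t done;      /* total bytes advanced so far (the 'bi_done' idea) */
};

/* Move forward by 'bytes'; refuse to go past the end. */
static bool iter_advance(const struct seg *segs, struct iter_model *it,
                         size_t bytes)
{
    if (bytes > it->size)
        return false;
    it->size -= bytes;
    it->done += bytes;
    while (bytes) {
        size_t room = segs[it->idx].len - it->seg_done;
        size_t step = bytes < room ? bytes : room;

        bytes -= step;
        it->seg_done += step;
        if (it->seg_done == segs[it->idx].len) {
            it->idx++;               /* segment finished, go to the next */
            it->seg_done = 0;
        }
    }
    return true;
}

/* Move backward by 'bytes'; refuse to go before the original start. */
static bool iter_rewind(const struct seg *segs, struct iter_model *it,
                        size_t bytes)
{
    if (bytes > it->done)
        return false;
    it->size += bytes;
    it->done -= bytes;
    while (bytes) {
        size_t step;

        if (it->seg_done == 0) {     /* step back into the previous segment */
            it->idx--;
            it->seg_done = segs[it->idx].len;
        }
        step = bytes < it->seg_done ? bytes : it->seg_done;
        bytes -= step;
        it->seg_done -= step;
    }
    return true;
}
```

Calling `iter_rewind(segs, &it, it.done)` restores the starting position no matter how many times the iterator was advanced in between, which is the property an end_io handler would rely on.
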
Dmitry Monakhov
b1fb2c52b2 block: guard bvec iteration logic
Currently, if someone tries to advance a bvec beyond its size, we simply
dump a WARN_ONCE and continue to iterate beyond the bvec array boundaries.
This simply means that we end up dereferencing/corrupting a random memory
region.

A sane reaction would be to propagate the error back to the calling context,
but bvec_iter_advance's calling context is not always good for error
handling. For safety reasons, let's truncate the iterator size to zero, which
will break the external iteration loop and prevent unpredictable
memory range corruption. And even if the caller ignores the error, it will
corrupt its own bvecs, not others.

This patch does:
- Return error back to the caller, in the hope that it will react to it
- Truncate iterator size

The code was added a long time ago in 4550dd6c; luckily no one hit it
in real life :)

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
[hch: switch to true/false returns instead of errno values]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:26 -06:00
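
A minimal userspace sketch of the guard described in b1fb2c52b2, assuming a flat byte-range iterator rather than the real bvec array. `struct flat_iter`, `guarded_advance()` and `walk_ignoring_errors()` are hypothetical names used for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

struct flat_iter {
    size_t pos;   /* bytes consumed so far */
    size_t size;  /* bytes remaining */
};

/* Advance by 'bytes'. On an attempt to run past the end, report failure
 * and truncate the remaining size to zero so that loops stop. */
static bool guarded_advance(struct flat_iter *it, size_t bytes)
{
    if (bytes > it->size) {
        it->size = 0;
        return false;
    }
    it->pos += bytes;
    it->size -= bytes;
    return true;
}

/* A careless caller that ignores the return value: the loop still
 * terminates because a failed advance zeroes the remaining size.
 * Returns the number of loop iterations performed. */
static size_t walk_ignoring_errors(struct flat_iter it, size_t chunk)
{
    size_t iterations = 0;

    if (chunk == 0)
        return 0;                    /* avoid a loop that never progresses */
    while (it.size) {
        (void)guarded_advance(&it, chunk);
        iterations++;
    }
    return iterations;
}
```

With 10 bytes remaining and a chunk of 4, the third call overshoots, fails, and zeroes the size, so the walk stops without touching anything past the end.
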
Dmitry Monakhov
e23947bd76 bio-integrity: fold bio_integrity_enabled to bio_integrity_prep
Currently all integrity prep hooks are open-coded, and if prepare fails
we ignore its error code and fail the bio with EIO. Let's return the real
error to the upper layer, so the caller may later react accordingly.

In fact no one wants to use bio_integrity_prep() without
bio_integrity_enabled(), so it is reasonable to fold them into one function.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
[hch: merged with the latest block tree,
	return bool from bio_integrity_prep]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:24 -06:00
Dmitry Monakhov
fbd08e7673 bio-integrity: fix interface for bio_integrity_trim
bio_integrity_trim inherited its interface from bio_trim and accepts
offset and size, but this API is error-prone because the data offset
must always be in sync with the bio's data offset. That is why we have
an integrity update hook in bio_advance().

So the only meaningful values are: offset == 0, sectors == bio_sectors(bio).
Let's just remove them completely.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-03 16:56:22 -06:00
Linus Torvalds
c6b1e36c8f Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block
Pull core block/IO updates from Jens Axboe:
 "This is the main pull request for the block layer for 4.13. Not a huge
  round in terms of features, but there's a lot of churn related to some
  core cleanups.

  Note this depends on the UUID tree pull request, that Christoph
  already sent out.

  This pull request contains:

   - A series from Christoph, unifying the error/stats codes in the
     block layer. We now use blk_status_t everywhere, instead of using
     different schemes for different places.

   - Also from Christoph, some cleanups around request allocation and IO
     scheduler interactions in blk-mq.

   - And yet another series from Christoph, cleaning up how we handle
     and do bounce buffering in the block layer.

   - A blk-mq debugfs series from Bart, further improving on the support
     we have for exporting internal information to aid debugging IO
     hangs or stalls.

   - Also from Bart, a series that cleans up the request initialization
     differences across types of devices.

   - A series from Goldwyn Rodrigues, allowing the block layer to return
     failure if we will block and the user asked for non-blocking.

   - Patch from Hannes for supporting setting loop devices block size to
     that of the underlying device.

   - Two series of patches from Javier, fixing various issues with
     lightnvm, particularly around pblk.

   - A series from me, adding support for write hints. This comes with
     NVMe support as well, so applications can help guide data placement
     on flash to improve performance, latencies, and write
     amplification.

   - A series from Ming, improving and hardening blk-mq support for
     stopping/starting and quiescing hardware queues.

   - Two pull requests for NVMe updates. Nothing major on the feature
     side, but lots of cleanups and bug fixes. From the usual crew.

   - A series from Neil Brown, greatly improving the bio rescue set
     support. Most notably, this kills the bio rescue work queues, if we
     don't really need them.

   - Lots of other little bug fixes that are all over the place"

* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
  lightnvm: pblk: set line bitmap check under debug
  lightnvm: pblk: verify that cache read is still valid
  lightnvm: pblk: add initialization check
  lightnvm: pblk: remove target using async. I/Os
  lightnvm: pblk: use vmalloc for GC data buffer
  lightnvm: pblk: use right metadata buffer for recovery
  lightnvm: pblk: schedule if data is not ready
  lightnvm: pblk: remove unused return variable
  lightnvm: pblk: fix double-free on pblk init
  lightnvm: pblk: fix bad le64 assignations
  nvme: Makefile: remove dead build rule
  blk-mq: map all HWQ also in hyperthreaded system
  nvmet-rdma: register ib_client to not deadlock in device removal
  nvme_fc: fix error recovery on link down.
  nvmet_fc: fix crashes on bad opcodes
  nvme_fc: Fix crash when nvme controller connection fails.
  nvme_fc: replace ioabort msleep loop with completion
  nvme_fc: fix double calls to nvme_cleanup_cmd()
  nvme-fabrics: verify that a controller returns the correct NQN
  nvme: simplify nvme_dev_attrs_are_visible
  ...
2017-07-03 10:34:51 -07:00
Jens Axboe
9ae3b3f52c block: provide bio_uninit() free freeing integrity/task associations
Wen reports significant memory leaks with DIF and O_DIRECT:

"With nvme device + T10 enabled, On a system it has 256GB and started
logging /proc/meminfo & /proc/slabinfo for every minute and in an hour
it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute
leaking.

/proc/meminfo | grep SUnreclaim...

SUnreclaim:      6752128 kB
SUnreclaim:      6874880 kB
SUnreclaim:      7238080 kB
....
SUnreclaim:     22307264 kB
SUnreclaim:     22485888 kB
SUnreclaim:     22720256 kB

When testcases with T10 enabled call into __blkdev_direct_IO_simple,
code doesn't free memory allocated by bio_integrity_alloc. The patch
fixes the issue. HTX has been run with +60 hours without failure."

Since __blkdev_direct_IO_simple() allocates the bio on the stack, it
doesn't go through the regular bio free. This means that any ancillary
data allocated with the bio through the stack is not freed. Hence, we
can leak the integrity data associated with the bio, if the device is
using DIF/DIX.

Fix this by providing a bio_uninit() and export it, so that we can use
it to free this data. Note that this is a minimal fix for this issue.
Any current user of bio's that are allocated outside of
bio_alloc_bioset() suffers from this issue, most notably some drivers.
We will fix those in a more comprehensive patch for 4.13. This also
means that the commit marked as being fixed by this isn't the real
culprit, it's just the most obvious one out there.

Fixes: 542ff7bf18 ("block: new direct I/O implementation")
Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28 15:30:13 -06:00
Christoph Hellwig
80ab6af432 block: remove the unused bio_to_phys macro
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-20 19:21:46 -06:00
Goldwyn Rodrigues
03a07c92a9 block: return on congested block device
A new bio operation flag REQ_NOWAIT is introduced to identify bios
originating from an iocb with IOCB_NOWAIT. This flag indicates
that we should return immediately if a request cannot be made, instead
of retrying.

Stacked devices such as md (the ones with make_request_fn hooks)
are currently not supported because they may block for housekeeping.
For example, an md can have a part of the device suspended.
For this reason, only request based devices are supported.
In the future, this feature will be expanded to stacked devices
by teaching them how to handle the REQ_NOWAIT flags.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-20 07:12:03 -06:00
NeilBrown
9b10f6a9c2 block: remove bio_clone() and all references.
bio_clone() is no longer used.
Only bio_clone_bioset() or bio_clone_fast() remain in use.
This is for the best, as bio_clone() used fs_bio_set,
and filesystems are unlikely to want to use bio_clone().

So remove bio_clone() and all references.
This includes a fix to some incorrect documentation.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18 12:40:59 -06:00
NeilBrown
47e0fb461f blk: make the bioset rescue_workqueue optional.
This patch converts bioset_create() to not create a workqueue by
default, so allocations will never trigger punt_bios_to_rescuer().  It
also introduces a new flag BIOSET_NEED_RESCUER which tells
bioset_create() to preserve the old behavior.

All callers of bioset_create() that are inside block device drivers,
are given the BIOSET_NEED_RESCUER flag.

biosets used by filesystems or other top-level users do not
need rescuing as the bio can never be queued behind other
bios.  This includes fs_bio_set, blkdev_dio_pool,
btrfs_bioset, xfs_ioend_bioset, and one allocated by
target_core_iblock.c.

biosets used by md/raid do not need rescuing as
their usage was recently audited and revised to never
risk deadlock.

It is hoped that most, if not all, of the remaining biosets
can end up being the non-rescued version.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Credit-to: Ming Lei <ming.lei@redhat.com> (minor fixes)
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18 12:40:59 -06:00
NeilBrown
011067b056 blk: replace bioset_create_nobvec() with a flags arg to bioset_create()
"flags" arguments are often seen as good API design as they allow
easy extensibility.
bioset_create_nobvec() is implemented internally as a variation in
flags passed to __bioset_create().

To support future extension, make the internal structure part of the
API.
i.e. add a 'flags' argument to bioset_create() and discard
bioset_create_nobvec().

Note that the bio_split allocations in drivers/md/raid* do not need
the bvec mempool - they should have used bioset_create_nobvec().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18 12:40:59 -06:00
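
The two bioset commits above (47e0fb461f and 011067b056) turn optional resources into opt-in flag bits. A minimal userspace sketch of that API shape follows. Everything here is a hypothetical model: `bioset_model_create()`, `MODEL_NEED_BVECS` and `MODEL_NEED_RESCUER` only mirror the idea of a 'flags' argument and are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag bits; each optional resource is opt-in. */
enum {
    MODEL_NEED_BVECS   = 1 << 0,   /* caller wants a bvec pool */
    MODEL_NEED_RESCUER = 1 << 1,   /* caller wants a rescuer workqueue */
};

struct bioset_model {
    bool has_bvec_pool;
    bool has_rescuer;
};

/* One constructor with a flags argument replaces separate
 * "create" and "create without bvecs" entry points. */
static struct bioset_model *bioset_model_create(int flags)
{
    struct bioset_model *bs = calloc(1, sizeof(*bs));

    if (!bs)
        return NULL;
    bs->has_bvec_pool = (flags & MODEL_NEED_BVECS) != 0;
    bs->has_rescuer   = (flags & MODEL_NEED_RESCUER) != 0;
    return bs;
}
```

A new optional behavior then costs one more bit rather than one more function, which is the extensibility argument made in 011067b056.
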
Christoph Hellwig
4e4cbee93d block: switch bios to blk_status_t
Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep around at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09 09:27:32 -06:00
Shaohua Li
e265eb3a30 Merge branch 'md-next' into md-linus
2017-05-01 14:09:21 -07:00
NeilBrown
50512625da Revert "block: introduce bio_copy_data_partial"
This reverts commit 6f8802852f.
bio_copy_data_partial() is no longer needed.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-04-11 10:09:03 -07:00
Shaohua Li
f45958756f block: remove bio_clone_bioset_partial()
commit c18a1e0 ("block: introduce bio_clone_bioset_partial()") introduced
bio_clone_bioset_partial() for raid1 write-behind IO. Now that write-behind
has been rewritten by Ming, we don't need the API any more, so revert the commit.

Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-25 09:18:37 -07:00
Ming Lei
6f8802852f block: introduce bio_copy_data_partial
Turns out we can use bio_copy_data in raid1's write behind,
and we can make alloc_behind_pages() more clean/efficient,
but we need a partial version of bio_copy_data().

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-24 10:41:37 -07:00
Dan Carpenter
7a88fa1919 block: make nr_iovecs unsigned in bio_alloc_bioset()
There isn't a bug here, but Smatch is not smart enough to know that
"nr_iovecs" can't be negative so it complains about underflows.
Really, it's slightly cleaner to make this parameter unsigned.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-23 08:16:11 -06:00
Ming Lei
c18a1e0900 block: introduce bio_clone_bioset_partial()
md still needs bio clone (not the fast version) for write-behind,
and it is more efficient to use bio_clone_bioset_partial().

The idea is simple: just copy the bvec range specified by the
parameters.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-02-15 11:22:05 -08:00
Christoph Hellwig
f9d03f96b9 block: improve handling of the magic discard payload
Instead of allocating a single unused biovec for discard requests, send
them down without any payload.  Instead we allow the driver to add a
"special" payload using a biovec embedded into struct request (unioned
over other fields never used while in the driver), and overloading
the number of segments for this case.

This has a couple of advantages:

 - we don't have to allocate the bio_vec
 - the amount of special casing for discard requests in the block
   layer is significantly reduced
 - using this same scheme for other request types is trivial,
   which will be important for implementing the new WRITE_ZEROES
   op on devices where it actually requires a payload (e.g. SCSI)
 - we can get rid of playing games with the request length, as
   we'll never touch it and completions will work just fine
 - it will allow us to support ranged discard operations in the
   future by merging non-contiguous discard bios into a single
   request
 - last but not least it removes a lot of code

This patch is the common base for my WIP series for ranged discards and to
remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES,
so it would be good to get it in quickly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-09 08:30:51 -07:00
Chaitanya Kulkarni
a6f0788ec2 block: add support for REQ_OP_WRITE_ZEROES
This adds a new block layer operation to zero out a range of
LBAs. This allows implementing zeroing for devices that don't use
either discard with a predictable zero pattern or WRITE SAME of zeroes.
The prominent example of that is NVMe with the Write Zeroes command,
but in the future, this should also help with improving the way
zeroing discards work. For this operation, a suitable entry is exported in
sysfs which indicates the maximum number of bytes allowed in one
write zeroes operation by the device.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-01 07:58:40 -07:00
Ming Lei
3a83f46775 block: bio: pass bvec table to bio_init()
Some drivers often use an external bvec table, so introduce
this helper for this case. It is always safe to access the
bio->bi_io_vec in this way for this case.

After converting to this usage, it will become a bit easier
to evaluate the remaining direct access to bio->bi_io_vec,
so it can help to prepare for the following multipage bvec
support.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Fixed up the new O_DIRECT cases.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-22 08:57:21 -07:00
Kent Overstreet
2cefe4dbaa block: add bio_iov_iter_get_pages()
This is a helper that pins down a range from an iov_iter and adds it to
a bio without requiring a separate memory allocation for the page array.
It will be used for upcoming direct I/O implementations for block devices
and iomap based file systems.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[hch: ported to the iov_iter interface, renamed and added comments.
      All blame should be directed to me and all fame should go to Kent
      after this!]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-02 10:50:18 -06:00
Christoph Hellwig
1e3914d4cf block, fs: move submit_bio to bio.h
This is where all the other bio operations live, so users must include
bio.h anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-01 09:43:26 -06:00
Christoph Hellwig
d38499530e fs: decouple READ and WRITE from the block layer ops
Move READ and WRITE to kernel.h and don't define them in terms of block
layer ops; they are our generic data direction indicators these days
and have no more resemblance with the block layer ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-01 09:43:26 -06:00
Christoph Hellwig
c4aebd0332 block: remove bio_is_rw
With the addition of the zoned operations the tests in this function
became incorrect.  But I think it's much better to just open code the
allowed operations in the only caller anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-28 08:45:17 -06:00
Guoqing Jiang
491221f88d block: export bio_free_pages to other modules
bio_free_pages was introduced in commit 1dfa0f68c0
("block: add a helper to free bio bounce buffer pages");
we can reuse the function in other modules once it is
exported.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-22 07:48:03 -06:00
Christoph Hellwig
fc95db3ede bio.h: remove a very outdated comment
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14 09:18:08 -06:00
Adrian Hunter
7afafc8a44 block: Fix secure erase
Commit 288dab8a35 ("block: add a separate operation type for secure
erase") split REQ_OP_SECURE_ERASE from REQ_OP_DISCARD without considering
all the places REQ_OP_DISCARD was being used to mean either. Fix those.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 288dab8a35 ("block: add a separate operation type for secure erase")
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-16 09:16:51 -06:00
Jens Axboe
1eff9d322a block: rename bio bi_rw to bi_opf
Since commit 63a4cc2486, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokenness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.

No intended functional changes in this commit.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-07 14:41:02 -06:00
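
To see why 1eff9d322a wants old assignments to break at compile time, consider a small model of a field that packs the op code into the high bits and the flags into the low bits. The widths and names below (`MODEL_OP_BITS`, `opf_pack()`, `opf_op()`, `opf_flags()`) are assumptions made for illustration; the real layout is defined in the kernel headers and has changed between releases.

```c
#include <stdint.h>

/* Illustrative layout only: 3 high bits for the op, the rest for flags. */
#define MODEL_OP_BITS   3u
#define MODEL_OP_SHIFT  (32u - MODEL_OP_BITS)
#define MODEL_OP_MASK   ((UINT32_C(1) << MODEL_OP_BITS) - 1)
#define MODEL_FLAG_MASK ((UINT32_C(1) << MODEL_OP_SHIFT) - 1)

/* Build a combined value from an op code and a set of flag bits. */
static uint32_t opf_pack(uint32_t op, uint32_t flags)
{
    return ((op & MODEL_OP_MASK) << MODEL_OP_SHIFT) | (flags & MODEL_FLAG_MASK);
}

/* Extract the op code from the high bits. */
static uint32_t opf_op(uint32_t opf)
{
    return opf >> MODEL_OP_SHIFT;
}

/* Extract the flags from the low bits. */
static uint32_t opf_flags(uint32_t opf)
{
    return opf & MODEL_FLAG_MASK;
}
```

In this model, code that stores a bare op number such as `1` straight into the field ends up with op 0 and a stray flag bit set, which is the kind of silent runtime breakage the rename is meant to surface.
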
Paolo Valente
20bd723ec6 block: add missing group association in bio-cloning functions
When a bio is cloned, the newly created bio must be associated with
the same blkcg as the original bio (if BLK_CGROUP is enabled). If
this operation is not performed, then the new bio is not associated
with any group, and the group of the current task is returned when
the group of the bio is requested.

Depending on the cloning frequency, this may cause a large
percentage of the bios belonging to a given group to be treated
as if belonging to other groups (in most cases as if belonging to
the root group). The expected group isolation may thereby be broken.

This commit adds the missing association in bio-cloning functions.

Fixes: da2f0f74cf ("Btrfs: add support for blkio controllers")
Cc: stable@vger.kernel.org # v4.3+

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-04 14:19:16 -06:00
Linus Torvalds
3fc9d69093 Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "This branch also contains core changes.  I've come to the conclusion
  that from 4.9 and forward, I'll be doing just a single branch.  We
  often have dependencies between core and drivers, and it's hard to
  always split them up appropriately without pulling core into drivers
  when that happens.

  That said, this contains:

   - separate secure erase type for the core block layer, from
     Christoph.

   - set of discard fixes, from Christoph.

   - bio shrinking fixes from Christoph, as a follow-up to the
     op/flags change in the core branch.

   - map and append request fixes from Christoph.

   - NVMeF (NVMe over Fabrics) code from Christoph.  This is pretty
     exciting!

   - nvme-loop fixes from Arnd.

   - removal of ->driverfs_dev from Dan, after providing a
     device_add_disk() helper.

   - bcache fixes from Bhaktipriya and Yijing.

   - cdrom subchannel read fix from Vchannaiah.

   - set of lightnvm updates from Wenwei, Matias, Johannes, and Javier.

   - set of drbd updates and fixes from Fabian, Lars, and Philipp.

   - mg_disk error path fix from Bart.

   - user notification for failed device add for loop, from Minfei.

   - NVMe in general:
        + NVMe delay quirk from Guilherme.
        + SR-IOV support and command retry limits from Keith.
        + fix for memory-less NUMA node from Masayoshi.
        + use UINT_MAX for discard sectors, from Minfei.
        + cancel IO fixes from Ming.
        + don't allocate unused major, from Neil.
        + error code fixup from Dan.
        + use constants for PSDT/FUSE from James.
        + variable init fix from Jay.
        + fabrics fixes from Ming, Sagi, and Wei.
        + various fixes"

* 'for-4.8/drivers' of git://git.kernel.dk/linux-block: (115 commits)
  nvme/pci: Provide SR-IOV support
  nvme: initialize variable before logical OR'ing it
  block: unexport various bio mapping helpers
  scsi/osd: open code blk_make_request
  target: stop using blk_make_request
  block: simplify and export blk_rq_append_bio
  block: ensure bios return from blk_get_request are properly initialized
  virtio_blk: use blk_rq_map_kern
  memstick: don't allow REQ_TYPE_BLOCK_PC requests
  block: shrink bio size again
  block: simplify and cleanup bvec pool handling
  block: get rid of bio_rw and READA
  block: don't ignore -EOPNOTSUPP blkdev_issue_write_same
  block: introduce BLKDEV_DISCARD_ZERO to fix zeroout
  NVMe: don't allocate unused nvme_major
  nvme: avoid crashes when node 0 is memoryless node.
  nvme: Limit command retries
  loop: Make user notify for adding loop device failed
  nvme-loop: fix nvme-loop Kconfig dependencies
  nvmet: fix return value check in nvmet_subsys_alloc()
  ...
2016-07-26 15:37:51 -07:00