Commit Graph

248 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
2d76dea417 Merge 4.19.323 into android-4.19-stable
Changes in 4.19.323
	staging: iio: frequency: ad9833: Get frequency value statically
	staging: iio: frequency: ad9833: Load clock using clock framework
	staging: iio: frequency: ad9834: Validate frequency parameter value
	usbnet: ipheth: fix carrier detection in modes 1 and 4
	net: ethernet: use ip_hdrlen() instead of bit shift
	net: phy: vitesse: repair vsc73xx autonegotiation
	scripts: kconfig: merge_config: config files: add a trailing newline
	arm64: dts: rockchip: override BIOS_DISABLE signal via GPIO hog on RK3399 Puma
	net/mlx5: Update the list of the PCI supported devices
	net: ftgmac100: Enable TX interrupt to avoid TX timeout
	net: dpaa: Pad packets to ETH_ZLEN
	soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps"
	selftests/vm: remove call to ksft_set_plan()
	selftests/kcmp: remove call to ksft_set_plan()
	ASoC: allow module autoloading for table db1200_pids
	pinctrl: at91: make it work with current gpiolib
	microblaze: don't treat zero reserved memory regions as error
	net: ftgmac100: Ensure tx descriptor updates are visible
	wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room()
	wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead
	ASoC: tda7419: fix module autoloading
	spi: bcm63xx: Enable module autoloading
	x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
	ocfs2: add bounds checking to ocfs2_xattr_find_entry()
	ocfs2: strict bound check before memcmp in ocfs2_xattr_find_entry()
	gpio: prevent potential speculation leaks in gpio_device_get_desc()
	USB: serial: pl2303: add device id for Macrosilicon MS3020
	ACPI: PMIC: Remove unneeded check in tps68470_pmic_opregion_probe()
	wifi: ath9k: fix parameter check in ath9k_init_debug()
	wifi: ath9k: Remove error checks when creating debugfs entries
	netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
	wifi: cfg80211: fix UBSAN noise in cfg80211_wext_siwscan()
	wifi: cfg80211: fix two more possible UBSAN-detected off-by-one errors
	wifi: mac80211: use two-phase skb reclamation in ieee80211_do_stop()
	can: bcm: Clear bo->bcm_proc_read after remove_proc_entry().
	Bluetooth: btusb: Fix not handling ZPL/short-transfer
	block, bfq: fix possible UAF for bfqq->bic with merge chain
	block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()
	block, bfq: don't break merge chain in bfq_split_bfqq()
	spi: ppc4xx: handle irq_of_parse_and_map() errors
	spi: ppc4xx: Avoid returning 0 when failed to parse and map IRQ
	ARM: versatile: fix OF node leak in CPUs prepare
	reset: berlin: fix OF node leak in probe() error path
	clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
	hwmon: (max16065) Fix overflows seen when writing limits
	mtd: slram: insert break after errors in parsing the map
	hwmon: (ntc_thermistor) fix module autoloading
	power: supply: max17042_battery: Fix SOC threshold calc w/ no current sense
	fbdev: hpfb: Fix an error handling path in hpfb_dio_probe()
	drm/stm: Fix an error handling path in stm_drm_platform_probe()
	drm/amd: fix typo
	drm/amdgpu: Replace one-element array with flexible-array member
	drm/amdgpu: properly handle vbios fake edid sizing
	drm/radeon: Replace one-element array with flexible-array member
	drm/radeon: properly handle vbios fake edid sizing
	drm/rockchip: vop: Allow 4096px width scaling
	drm/radeon/evergreen_cs: fix int overflow errors in cs track offsets
	jfs: fix out-of-bounds in dbNextAG() and diAlloc()
	drm/msm/a5xx: properly clear preemption records on resume
	drm/msm/a5xx: fix races in preemption evaluation stage
	ipmi: docs: don't advertise deprecated sysfs entries
	drm/msm: fix %s null argument error
	xen: use correct end address of kernel for conflict checking
	xen/swiotlb: simplify range_straddles_page_boundary()
	xen/swiotlb: add alignment check for dma buffers
	selftests/bpf: Fix error compiling test_lru_map.c
	xz: cleanup CRC32 edits from 2018
	kthread: add kthread_work tracepoints
	kthread: fix task state in kthread worker if being frozen
	jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers()
	ext4: clear EXT4_GROUP_INFO_WAS_TRIMMED_BIT even mount with discard
	smackfs: Use rcu_assign_pointer() to ensure safe assignment in smk_set_cipso
	ext4: avoid negative min_clusters in find_group_orlov()
	ext4: return error on ext4_find_inline_entry
	ext4: avoid OOB when system.data xattr changes underneath the filesystem
	nilfs2: fix potential null-ptr-deref in nilfs_btree_insert()
	nilfs2: determine empty node blocks as corrupted
	nilfs2: fix potential oob read in nilfs_btree_check_delete()
	perf sched timehist: Fix missing free of session in perf_sched__timehist()
	perf sched timehist: Fixed timestamp error when unable to confirm event sched_in time
	perf time-utils: Fix 32-bit nsec parsing
	clk: rockchip: Set parent rate for DCLK_VOP clock on RK3228
	drivers: media: dvb-frontends/rtl2832: fix an out-of-bounds write error
	drivers: media: dvb-frontends/rtl2830: fix an out-of-bounds write error
	PCI: xilinx-nwl: Fix register misspelling
	RDMA/iwcm: Fix WARNING:at_kernel/workqueue.c:#check_flush_dependency
	pinctrl: single: fix missing error code in pcs_probe()
	clk: ti: dra7-atl: Fix leak of of_nodes
	pinctrl: mvebu: Fix devinit_dove_pinctrl_probe function
	RDMA/cxgb4: Added NULL check for lookup_atid
	ntb: intel: Fix the NULL vs IS_ERR() bug for debugfs_create_dir()
	nfsd: call cache_put if xdr_reserve_space returns NULL
	f2fs: enhance to update i_mode and acl atomically in f2fs_setattr()
	f2fs: fix typo
	f2fs: fix to update i_ctime in __f2fs_setxattr()
	f2fs: remove unneeded check condition in __f2fs_setxattr()
	f2fs: reduce expensive checkpoint trigger frequency
	coresight: tmc: sg: Do not leak sg_table
	netfilter: nf_reject_ipv6: fix nf_reject_ip6_tcphdr_put()
	net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race Condition
	tcp: introduce tcp_skb_timestamp_us() helper
	tcp: check skb is non-NULL in tcp_rto_delta_us()
	net: qrtr: Update packets cloning when broadcasting
	netfilter: ctnetlink: compile ctnetlink_label_size with CONFIG_NF_CONNTRACK_EVENTS
	crypto: aead,cipher - zeroize key buffer after use
	Remove *.orig pattern from .gitignore
	soc: versatile: integrator: fix OF node leak in probe() error path
	USB: appledisplay: close race between probe and completion handler
	USB: misc: cypress_cy7c63: check for short transfer
	firmware_loader: Block path traversal
	tty: rp2: Fix reset with non forgiving PCIe host bridges
	drbd: Fix atomicity violation in drbd_uuid_set_bm()
	drbd: Add NULL check for net_conf to prevent dereference in state validation
	ACPI: sysfs: validate return type of _STR method
	f2fs: prevent possible int overflow in dir_block_index()
	f2fs: avoid potential int overflow in sanity_check_area_boundary()
	vfs: fix race between evice_inodes() and find_inode()&iput()
	fs: Fix file_set_fowner LSM hook inconsistencies
	nfs: fix memory leak in error path of nfs4_do_reclaim
	PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
	PCI: xilinx-nwl: Fix off-by-one in INTx IRQ handler
	soc: versatile: realview: fix memory leak during device remove
	soc: versatile: realview: fix soc_dev leak during device remove
	usb: yurex: Replace snprintf() with the safer scnprintf() variant
	USB: misc: yurex: fix race between read and write
	pps: remove usage of the deprecated ida_simple_xx() API
	pps: add an error check in parport_attach
	i2c: aspeed: Update the stop sw state when the bus recovery occurs
	i2c: isch: Add missed 'else'
	usb: yurex: Fix inconsistent locking bug in yurex_read()
	mailbox: rockchip: fix a typo in module autoloading
	mailbox: bcm2835: Fix timeout during suspend mode
	ceph: remove the incorrect Fw reference check when dirtying pages
	netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
	netfilter: nf_tables: prevent nf_skb_duplicated corruption
	r8152: Factor out OOB link list waits
	net: ethernet: lantiq_etop: fix memory disclosure
	net: avoid potential underflow in qdisc_pkt_len_init() with UFO
	net: add more sanity checks to qdisc_pkt_len_init()
	ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
	sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start
	ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
	ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
	f2fs: Require FMODE_WRITE for atomic write ioctls
	wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats()
	wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit
	net: hisilicon: hip04: fix OF node leak in probe()
	net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
	net: hisilicon: hns_mdio: fix OF node leak in probe()
	ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
	ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
	ACPI: EC: Do not release locks during operation region accesses
	ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
	tipc: guard against string buffer overrun
	net: mvpp2: Increase size of queue_name buffer
	ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
	ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
	tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
	ACPICA: iasl: handle empty connection_node
	wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext()
	signal: Replace BUG_ON()s
	ALSA: asihpi: Fix potential OOB array access
	ALSA: hdsp: Break infinite MIDI input flush loop
	fbdev: pxafb: Fix possible use after free in pxafb_task()
	power: reset: brcmstb: Do not go into infinite loop if reset fails
	ata: sata_sil: Rename sil_blacklist to sil_quirks
	jfs: UBSAN: shift-out-of-bounds in dbFindBits
	jfs: Fix uaf in dbFreeBits
	jfs: check if leafidx greater than num leaves per dmap tree
	jfs: Fix uninit-value access of new_ea in ea_buffer
	drm/amd/display: Check stream before comparing them
	drm/amd/display: Fix index out of bounds in degamma hardware format translation
	drm/printer: Allow NULL data in devcoredump printer
	scsi: aacraid: Rearrange order of struct aac_srb_unit
	drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
	of/irq: Refer to actual buffer size in of_irq_parse_one()
	ext4: ext4_search_dir should return a proper error
	ext4: fix i_data_sem unlock order in ext4_ind_migrate()
	spi: s3c64xx: fix timeout counters in flush_fifo
	selftests: breakpoints: use remaining time to check if suspend succeed
	selftests: vDSO: fix vDSO symbols lookup for powerpc64
	i2c: xiic: Wait for TX empty to avoid missed TX NAKs
	spi: bcm63xx: Fix module autoloading
	perf/core: Fix small negative period being ignored
	parisc: Fix itlb miss handler for 64-bit programs
	ALSA: core: add isascii() check to card ID generator
	ext4: no need to continue when the number of entries is 1
	ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
	ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
	ext4: aovid use-after-free in ext4_ext_insert_extent()
	ext4: fix double brelse() the buffer of the extents path
	ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
	parisc: Fix 64-bit userspace syscall path
	of/irq: Support #msi-cells=<0> in of_msi_get_domain
	jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
	ocfs2: fix the la space leak when unmounting an ocfs2 volume
	ocfs2: fix uninit-value in ocfs2_get_block()
	ocfs2: reserve space for inline xattr before attaching reflink tree
	ocfs2: cancel dqi_sync_work before freeing oinfo
	ocfs2: remove unreasonable unlock in ocfs2_read_blocks
	ocfs2: fix null-ptr-deref when journal load failed.
	ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
	riscv: define ILLEGAL_POINTER_VALUE for 64bit
	aoe: fix the potential use-after-free problem in more places
	clk: rockchip: fix error for unknown clocks
	media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
	media: venus: fix use after free bug in venus_remove due to race condition
	iio: magnetometer: ak8975: Fix reading for ak099xx sensors
	tomoyo: fallback to realpath if symlink's pathname does not exist
	Input: adp5589-keys - fix adp5589_gpio_get_value()
	btrfs: wait for fixup workers before stopping cleaner kthread during umount
	gpio: davinci: fix lazy disable
	ext4: avoid ext4_error()'s caused by ENOMEM in the truncate path
	ext4: fix slab-use-after-free in ext4_split_extent_at()
	ext4: update orig_path in ext4_find_extent()
	arm64: Add Cortex-715 CPU part definition
	arm64: cputype: Add Neoverse-N3 definitions
	arm64: errata: Expand speculative SSBS workaround once more
	uprobes: fix kernel info leak via "[uprobes]" vma
	nfsd: use ktime_get_seconds() for timestamps
	nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
	rtc: at91sam9: drop platform_data support
	rtc: at91sam9: fix OF node leak in probe() error path
	ACPI: battery: Simplify battery hook locking
	ACPI: battery: Fix possible crash when unregistering a battery hook
	ext4: fix inode tree inconsistency caused by ENOMEM
	net: ethernet: cortina: Drop TSO support
	tracing: Remove precision vsnprintf() check from print event
	drm: Move drm_mode_setcrtc() local re-init to failure path
	drm/crtc: fix uninitialized variable use even harder
	virtio_console: fix misc probe bugs
	Input: synaptics-rmi4 - fix UAF of IRQ domain on driver removal
	bpf: Check percpu map value size first
	s390/facility: Disable compile time optimization for decompressor code
	s390/mm: Add cond_resched() to cmm_alloc/free_pages()
	ext4: nested locking for xattr inode
	s390/cpum_sf: Remove WARN_ON_ONCE statements
	ktest.pl: Avoid false positives with grub2 skip regex
	clk: bcm: bcm53573: fix OF node leak in init
	i2c: i801: Use a different adapter-name for IDF adapters
	PCI: Mark Creative Labs EMU20k2 INTx masking as broken
	media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put()
	usb: chipidea: udc: enable suspend interrupt after usb reset
	tools/iio: Add memory allocation failure check for trigger_name
	driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute
	fbdev: sisfb: Fix strbuf array overflow
	NFS: Remove print_overflow_msg()
	SUNRPC: Fix integer overflow in decode_rc_list()
	tcp: fix tcp_enter_recovery() to zero retrans_stamp when it's safe
	netfilter: br_netfilter: fix panic with metadata_dst skb
	Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
	gpio: aspeed: Add the flush write to ensure the write complete.
	clk: Add (devm_)clk_get_optional() functions
	clk: generalize devm_clk_get() a bit
	clk: Provide new devm_clk helpers for prepared and enabled clocks
	gpio: aspeed: Use devm_clk api to manage clock source
	igb: Do not bring the device up after non-fatal error
	net: ibm: emac: mal: fix wrong goto
	ppp: fix ppp_async_encode() illegal access
	net: ipv6: ensure we call ipv6_mc_down() at most once
	CDC-NCM: avoid overflow in sanity checking
	HID: plantronics: Workaround for an unexcepted opposite volume key
	Revert "usb: yurex: Replace snprintf() with the safer scnprintf() variant"
	usb: xhci: Fix problem with xhci resume from suspend
	usb: storage: ignore bogus device raised by JieLi BR21 USB sound chip
	net: Fix an unsafe loop on the list
	posix-clock: Fix missing timespec64 check in pc_clock_settime()
	arm64: probes: Remove broken LDR (literal) uprobe support
	arm64: probes: Fix simulate_ldr*_literal()
	PCI: Add function 0 DMA alias quirk for Glenfly Arise chip
	fat: fix uninitialized variable
	KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()
	net: dsa: mv88e6xxx: Fix out-of-bound access
	s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
	KVM: s390: Change virtual to physical address access in diag 0x258 handler
	x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
	drm/vmwgfx: Handle surface check failure correctly
	iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
	iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
	iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
	iio: light: opt3001: add missing full-scale range value
	Bluetooth: Remove debugfs directory on module init failure
	Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
	xhci: Fix incorrect stream context type macro
	USB: serial: option: add support for Quectel EG916Q-GL
	USB: serial: option: add Telit FN920C04 MBIM compositions
	parport: Proper fix for array out-of-bounds access
	x86/apic: Always explicitly disarm TSC-deadline timer
	nilfs2: propagate directory read errors from nilfs_find_entry()
	clk: Fix pointer casting to prevent oops in devm_clk_release()
	clk: Fix slab-out-of-bounds error in devm_clk_release()
	RDMA/bnxt_re: Fix incorrect AVID type in WQE structure
	RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP
	RDMA/bnxt_re: Return more meaningful error
	drm/msm/dsi: fix 32-bit signed integer extension in pclk_rate calculation
	macsec: don't increment counters for an unrelated SA
	net: ethernet: aeroflex: fix potential memory leak in greth_start_xmit_gbit()
	net: systemport: fix potential memory leak in bcm_sysport_xmit()
	usb: typec: altmode should keep reference to parent
	Bluetooth: bnep: fix wild-memory-access in proto_unregister
	arm64:uprobe fix the uprobe SWBP_INSN in big-endian
	arm64: probes: Fix uprobes for big-endian kernels
	KVM: s390: gaccess: Refactor gpa and length calculation
	KVM: s390: gaccess: Refactor access address range check
	KVM: s390: gaccess: Cleanup access to guest pages
	KVM: s390: gaccess: Check if guest address is in memslot
	udf: fix uninit-value use in udf_get_fileshortad
	jfs: Fix sanity check in dbMount
	net/sun3_82586: fix potential memory leak in sun3_82586_send_packet()
	be2net: fix potential memory leak in be_xmit()
	net: usb: usbnet: fix name regression
	posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime()
	ALSA: hda/realtek: Update default depop procedure
	drm/amd: Guard against bad data for ATIF ACPI method
	ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue
	nilfs2: fix kernel bug due to missing clearing of buffer delay flag
	hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event
	selinux: improve error checking in sel_write_load()
	arm64/uprobes: change the uprobe_opcode_t typedef to fix the sparse warning
	xfrm: validate new SA's prefixlen using SA family when sel.family is unset
	usb: dwc3: remove generic PHY calibrate() calls
	usb: dwc3: Add splitdisable quirk for Hisilicon Kirin Soc
	usb: dwc3: core: Stop processing of pending events if controller is halted
	cgroup: Fix potential overflow issue when checking max_depth
	wifi: mac80211: skip non-uploaded keys in ieee80211_iter_keys
	gtp: simplify error handling code in 'gtp_encap_enable()'
	gtp: allow -1 to be specified as file description from userspace
	net/sched: stop qdisc_tree_reduce_backlog on TC_H_ROOT
	bpf: Fix out-of-bounds write in trie_get_next_key()
	net: support ip generic csum processing in skb_csum_hwoffload_help
	net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension
	netfilter: nft_payload: sanitize offset and length before calling skb_checksum()
	firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
	net: amd: mvme147: Fix probe banner message
	misc: sgi-gru: Don't disable preemption in GRU driver
	usbip: tools: Fix detach_port() invalid port error path
	usb: phy: Fix API devm_usb_put_phy() can not release the phy
	xhci: Fix Link TRB DMA in command ring stopped completion event
	Revert "driver core: Fix uevent_show() vs driver detach race"
	wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower
	wifi: ath10k: Fix memory leak in management tx
	wifi: iwlegacy: Clear stale interrupts before resuming device
	nilfs2: fix potential deadlock with newly created symlinks
	ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow
	nilfs2: fix kernel bug due to missing clearing of checked flag
	mm: shmem: fix data-race in shmem_getattr()
	vt: prevent kernel-infoleak in con_font_get()
	Linux 4.19.323

Change-Id: I2348f834187153067ab46b3b48b8fe7da9cee1f1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-11-09 11:24:17 +00:00
Chun-Yi Lee
12f7b89dd7 aoe: fix the potential use-after-free problem in more places
commit 6d6e54fc71ad1ab0a87047fd9c211e75d86084a3 upstream.

For fixing CVE-2023-6270, f98364e92662 ("aoe: fix the potential
use-after-free problem in aoecmd_cfg_pkts") makes tx() calling dev_put()
instead of doing in aoecmd_cfg_pkts(). It avoids that the tx() runs
into use-after-free.

Then Nicolai Stange found more places in aoe have potential use-after-free
problem with tx(). e.g. revalidate(), aoecmd_ata_rw(), resend(), probe()
and aoecmd_cfg_rsp(). Those functions also use aoenet_xmit() to push
packet to tx queue. So they should also use dev_hold() to increase the
refcnt of skb->dev.

On the other hand, moving dev_put() to tx() causes that the refcnt of
skb->dev be reduced to a negative value, because corresponding
dev_hold() are not called in revalidate(), aoecmd_ata_rw(), resend(),
probe(), and aoecmd_cfg_rsp(). This patch fixed this issue.

Cc: stable@vger.kernel.org
Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270
Fixes: f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts")
Reported-by: Nicolai Stange <nstange@suse.com>
Signed-off-by: Chun-Yi Lee <jlee@suse.com>
Link: https://lore.kernel.org/stable/20240624064418.27043-1-jlee%40suse.com
Link: https://lore.kernel.org/r/20241002035458.24401-1-jlee@suse.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-08 16:19:14 +01:00
Greg Kroah-Hartman
32e0a5db51 Merge 4.19.311 into android-4.19-stable
Changes in 4.19.311
	ASoC: rt5645: Make LattePanda board DMI match more precise
	x86/xen: Add some null pointer checking to smp.c
	MIPS: Clear Cause.BD in instruction_pointer_set
	net/iucv: fix the allocation size of iucv_path_table array
	block: sed-opal: handle empty atoms when parsing response
	dm-verity, dm-crypt: align "struct bvec_iter" correctly
	scsi: mpt3sas: Prevent sending diag_reset when the controller is ready
	Bluetooth: rfcomm: Fix null-ptr-deref in rfcomm_check_security
	firewire: core: use long bus reset on gap count error
	ASoC: Intel: bytcr_rt5640: Add an extra entry for the Chuwi Vi8 tablet
	Input: gpio_keys_polled - suppress deferred probe error for gpio
	ASoC: wm8962: Enable oscillator if selecting WM8962_FLL_OSC
	ASoC: wm8962: Enable both SPKOUTR_ENA and SPKOUTL_ENA in mono mode
	ASoC: wm8962: Fix up incorrect error message in wm8962_set_fll
	crypto: algif_aead - fix uninitialized ctx->init
	crypto: af_alg - make some functions static
	crypto: algif_aead - Only wake up when ctx->more is zero
	do_sys_name_to_handle(): use kzalloc() to fix kernel-infoleak
	fs/select: rework stack allocation hack for clang
	md: switch to ->check_events for media change notifications
	block: add a new set_read_only method
	md: implement ->set_read_only to hook into BLKROSET processing
	md: Don't clear MD_CLOSING when the raid is about to stop
	aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts
	timekeeping: Fix cross-timestamp interpolation on counter wrap
	timekeeping: Fix cross-timestamp interpolation corner case decision
	timekeeping: Fix cross-timestamp interpolation for non-x86
	wifi: ath10k: fix NULL pointer dereference in ath10k_wmi_tlv_op_pull_mgmt_tx_compl_ev()
	b43: dma: Fix use true/false for bool type variable
	wifi: b43: Stop/wake correct queue in DMA Tx path when QoS is disabled
	wifi: b43: Stop/wake correct queue in PIO Tx path when QoS is disabled
	b43: main: Fix use true/false for bool type
	wifi: b43: Stop correct queue in DMA worker when QoS is disabled
	wifi: b43: Disable QoS for bcm4331
	wifi: mwifiex: debugfs: Drop unnecessary error check for debugfs_create_dir()
	sock_diag: annotate data-races around sock_diag_handlers[family]
	af_unix: Annotate data-race of gc_in_progress in wait_for_unix_gc().
	wifi: libertas: fix some memleaks in lbs_allocate_cmd_buffer()
	ACPI: processor_idle: Fix memory leak in acpi_processor_power_exit()
	bus: tegra-aconnect: Update dependency to ARCH_TEGRA
	iommu/amd: Mark interrupt as managed
	wifi: brcmsmac: avoid function pointer casts
	ARM: dts: arm: realview: Fix development chip ROM compatible value
	ACPI: scan: Fix device check notification handling
	x86, relocs: Ignore relocations in .notes section
	SUNRPC: fix some memleaks in gssx_dec_option_array
	mmc: wmt-sdmmc: remove an incorrect release_mem_region() call in the .remove function
	igb: move PEROUT and EXTTS isr logic to separate functions
	igb: Fix missing time sync events
	Bluetooth: Remove superfluous call to hci_conn_check_pending()
	Bluetooth: hci_core: Fix possible buffer overflow
	sr9800: Add check for usbnet_get_endpoints
	bpf: Fix hashtab overflow check on 32-bit arches
	bpf: Fix stackmap overflow check on 32-bit arches
	ipv6: fib6_rules: flush route cache when rule is changed
	tcp: fix incorrect parameter validation in the do_tcp_getsockopt() function
	l2tp: fix incorrect parameter validation in the pppol2tp_getsockopt() function
	udp: fix incorrect parameter validation in the udp_lib_getsockopt() function
	net: kcm: fix incorrect parameter validation in the kcm_getsockopt) function
	net/x25: fix incorrect parameter validation in the x25_getsockopt() function
	nfp: flower: handle acti_netdevs allocation failure
	dm raid: fix false positive for requeue needed during reshape
	dm: call the resume method on internal suspend
	drm/tegra: dsi: Add missing check for of_find_device_by_node
	gpu: host1x: mipi: Update tegra_mipi_request() to be node based
	drm/tegra: dsi: Make use of the helper function dev_err_probe()
	drm/tegra: dsi: Fix some error handling paths in tegra_dsi_probe()
	drm/tegra: dsi: Fix missing pm_runtime_disable() in the error handling path of tegra_dsi_probe()
	drm/rockchip: inno_hdmi: Fix video timing
	drm: Don't treat 0 as -1 in drm_fixp2int_ceil
	drm/rockchip: lvds: do not overwrite error code
	drm/rockchip: lvds: do not print scary message when probing defer
	media: tc358743: register v4l2 async device only after successful setup
	perf evsel: Fix duplicate initialization of data->id in evsel__parse_sample()
	ABI: sysfs-bus-pci-devices-aer_stats uses an invalid tag
	media: em28xx: annotate unchecked call to media_device_register()
	media: v4l2-tpg: fix some memleaks in tpg_alloc
	media: v4l2-mem2mem: fix a memleak in v4l2_m2m_register_entity
	media: dvbdev: remove double-unlock
	media: media/dvb: Use kmemdup rather than duplicating its implementation
	media: dvbdev: Fix memleak in dvb_register_device
	media: dvbdev: fix error logic at dvb_register_device()
	media: dvb-core: Fix use-after-free due to race at dvb_register_device()
	media: edia: dvbdev: fix a use-after-free
	clk: qcom: reset: Allow specifying custom reset delay
	clk: qcom: reset: support resetting multiple bits
	clk: qcom: reset: Commonize the de/assert functions
	clk: qcom: reset: Ensure write completion on reset de/assertion
	quota: code cleanup for __dquot_alloc_space()
	fs/quota: erase unused but set variable warning
	quota: check time limit when back out space/inode change
	quota: simplify drop_dquot_ref()
	quota: Fix potential NULL pointer dereference
	quota: Fix rcu annotations of inode dquot pointers
	perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str()
	drm/radeon/ni: Fix wrong firmware size logging in ni_init_microcode()
	ALSA: seq: fix function cast warnings
	media: go7007: add check of return value of go7007_read_addr()
	media: pvrusb2: fix pvr2_stream_callback casts
	firmware: qcom: scm: Add WLAN VMID for Qualcomm SCM interface
	clk: qcom: dispcc-sdm845: Adjust internal GDSC wait times
	drm/mediatek: dsi: Fix DSI RGB666 formats and definitions
	PCI: Mark 3ware-9650SE Root Port Extended Tags as broken
	clk: hisilicon: hi3519: Release the correct number of gates in hi3519_clk_unregister()
	drm/tegra: put drm_gem_object ref on error in tegra_fb_create
	mfd: syscon: Call of_node_put() only when of_parse_phandle() takes a ref
	crypto: arm - Rename functions to avoid conflict with crypto/sha256.h
	crypto: arm/sha - fix function cast warnings
	mtd: rawnand: lpc32xx_mlc: fix irq handler prototype
	ASoC: meson: axg-tdm-interface: fix mclk setup without mclk-fs
	drm/amdgpu: Fix missing break in ATOM_ARG_IMM Case of atom_get_src_int()
	media: pvrusb2: fix uaf in pvr2_context_set_notify
	media: dvb-frontends: avoid stack overflow warnings with clang
	media: go7007: fix a memleak in go7007_load_encoder
	drm/mediatek: Fix a null pointer crash in mtk_drm_crtc_finish_page_flip
	powerpc/hv-gpci: Fix the H_GET_PERF_COUNTER_INFO hcall return value checks
	powerpc/embedded6xx: Fix no previous prototype for avr_uart_send() etc.
	backlight: lm3630a: Initialize backlight_properties on init
	backlight: lm3630a: Don't set bl->props.brightness in get_brightness
	backlight: da9052: Fully initialize backlight_properties during probe
	backlight: lm3639: Fully initialize backlight_properties during probe
	backlight: lp8788: Fully initialize backlight_properties during probe
	sparc32: Fix section mismatch in leon_pci_grpci
	ALSA: usb-audio: Stop parsing channels bits when all channels are found.
	scsi: csiostor: Avoid function pointer casts
	scsi: bfa: Fix function pointer type mismatch for hcb_qe->cbfn
	net: sunrpc: Fix an off by one in rpc_sockaddr2uaddr()
	NFS: Fix an off by one in root_nfs_cat()
	clk: qcom: gdsc: Add support to update GDSC transition delay
	serial: max310x: fix syntax error in IRQ error message
	tty: serial: samsung: fix tx_empty() to return TIOCSER_TEMT
	kconfig: fix infinite loop when expanding a macro at the end of file
	rtc: mt6397: select IRQ_DOMAIN instead of depending on it
	serial: 8250_exar: Don't remove GPIO device on suspend
	staging: greybus: fix get_channel_from_mode() failure path
	usb: gadget: net2272: Use irqflags in the call to net2272_probe_fin
	net: hsr: fix placement of logical operator in a multi-line statement
	hsr: Fix uninit-value access in hsr_get_node()
	rds: introduce acquire/release ordering in acquire/release_in_xmit()
	hsr: Handle failures in module init
	net/bnx2x: Prevent access to a freed page in page_pool
	spi: spi-mt65xx: Fix NULL pointer access in interrupt handler
	crypto: af_alg - Fix regression on empty requests
	crypto: af_alg - Work around empty control messages without MSG_MORE
	Linux 4.19.311

Change-Id: I034e9a44b6dec1a7b5c600b3cd77aabc401044d7
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-04-16 10:07:50 +00:00
Chun-Yi Lee
ad80c34944 aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts
[ Upstream commit f98364e926626c678fb4b9004b75cacf92ff0662 ]

This patch is against CVE-2023-6270. The description of cve is:

  A flaw was found in the ATA over Ethernet (AoE) driver in the Linux
  kernel. The aoecmd_cfg_pkts() function improperly updates the refcnt on
  `struct net_device`, and a use-after-free can be triggered by racing
  between the free on the struct and the access through the `skbtxq`
  global queue. This could lead to a denial of service condition or
  potential code execution.

In aoecmd_cfg_pkts(), it always calls dev_put(ifp) when skb initial
code is finished. But the net_device ifp will still be used in
later tx()->dev_queue_xmit() in kthread. Which means that the
dev_put(ifp) should NOT be called in the success path of skb
initial code in aoecmd_cfg_pkts(). Otherwise tx() may run into
use-after-free because the net_device is freed.

This patch removed the dev_put(ifp) in the success path in
aoecmd_cfg_pkts(), and added dev_put() after skb xmit in tx().

Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270
Fixes: 7562f876cd ("[NET]: Rework dev_base via list_head (v3)")
Signed-off-by: Chun-Yi Lee <jlee@suse.com>
Link: https://lore.kernel.org/r/20240305082048.25526-1-jlee@suse.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26 18:22:34 -04:00
Greg Kroah-Hartman
512a245be4 Revert "aoe: register default groups with device_add_disk()"
This reverts commit 4fdc94b476.

The patch series submitted for the 4.19-stable branch:
	https://lore.kernel.org/r/20210223092859.17033-1-jefflexu@linux.alibaba.com
should be reverted from the android-4.19-stable branch as it breaks the
ABI for device_add_disk() and the issue the series is resolving does not
affect Android devices.  So revert the whole thing.

Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I483477f07ae5359975f7c11bf90bdb29269f1b03
2021-03-11 17:13:48 +01:00
Hannes Reinecke
4fdc94b476 aoe: register default groups with device_add_disk()
commit 95cf7809bf9169fec4e4b7bb24b8069d8f354f96 upstream.

Register default sysfs groups during device_add_disk() to avoid a
race condition with udev during startup.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ed L. Cachin <ed.cashin@acm.org>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-11 14:05:00 +01:00
zhong jiang
69daf897d7 drivers/block/aoe/aoedev: NULL check is not needed for mempool_destroy
mempool_destroy has taken the null pointer into account. So it is safe
to remove the null check.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-08 09:17:20 -06:00
Gustavo A. R. Silva
99972f171b aoe: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114722 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-02 09:59:00 -06:00
Joe Perches
5657a819a8 block drivers/block: Use octal not symbolic permissions
Convert the S_<FOO> symbolic permissions to their octal equivalents as
using octal and not symbolic permissions is preferred by many as more
readable.

see: https://lkml.org/lkml/2016/8/2/1945

Done with automated conversion via:
$ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace <files...>

Miscellanea:

o Wrapped modified multi-line calls to a single line where appropriate
o Realign modified multi-line calls to open parenthesis

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-24 13:38:59 -06:00
Christoph Hellwig
ad180f6f71 aoe: handle highmem pages
Use kmap_atomic when copying out of a bio_vec.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-11 15:08:00 -06:00
Tina Ruchandani
85cf955df8 aoe: use ktime_t instead of timeval
'struct frame' uses two variables to store the sent timestamp - 'struct
timeval' and jiffies. jiffies is used to avoid discrepancies caused by
updates to system time. 'struct timeval' is deprecated because it uses
32-bit representation for seconds which will overflow in year 2038.

This patch does the following:
- Replace the use of 'struct timeval' and jiffies with ktime_t, which
  is the recommended type for timestamping
- ktime_t provides both long range (like jiffies) and high resolution
  (like timeval). Using ktime_get (monotonic time) instead of wall-clock
  time prevents any discprepancies caused by updates to system time.

[updates by Arnd below]
The original patch from Tina never went anywhere as we discussed how
to keep the impact on performance minimal. I've started over now but
arrived at basically the same patch that she had originally, except for
an slightly improved tsince_hr() function. I'm making it more robust
against overflows, and also optimize explicitly for the common case
in which a frame is less than 4.2 seconds old, using only a 32-bit
division in that case.

This should make the new version more efficient than the old code,
since we replace the existing two 32-bit division in do_gettimeofday()
plus one multiplication with a single single 32-bit division in
tsince_hr() and drop the double bookkeeping. It's also more efficient
than the ktime_get_us() API we discussed before, since that would
also rely on multiple divisions.

Link: https://lists.linaro.org/pipermail/y2038/2015-May/000276.html
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Cc: Ed Cashin <ed.cashin@acm.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17 08:41:07 -07:00
Kees Cook
841b86f328 treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
With all callbacks converted, and the timer callback prototype
switched over, the TIMER_FUNC_TYPE cast is no longer needed,
so remove it. Conversion was done with the following scripts:

    perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
        $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)

    perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
        $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)

The now unused macros are also dropped from include/linux/timer.h.

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 16:35:54 -08:00
Kees Cook
0e0cc9df86 block/aoe: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Ed L. Cashin" <ed.cashin@acm.org>
Cc: linux-block@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14 20:11:57 -07:00
Kees Cook
5ea22086ed block/aoe: discover_timer: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

This refactors the discover_timer to remove the needless locking and
state machine used for synchronizing timer death. Using del_timer_sync()
will already do the right thing.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Ed L. Cashin" <ed.cashin@acm.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-06 12:50:09 -08:00
Christoph Hellwig
8fc450443e block: don't set bounce limit in blk_init_queue
Instead move it to the callers.  Those that either don't use bio_data() or
page_address() or are specific to architectures that do not support highmem
are skipped.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-27 12:13:45 -06:00
Christoph Hellwig
4e4cbee93d block: switch bios to blk_status_t
Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep arround at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09 09:27:32 -06:00
Christoph Hellwig
2a842acab1 block: introduce new block status code type
Currently we use nornal Linux errno values in the block layer, and while
we accept any error a few have overloaded magic meanings.  This patch
instead introduces a new  blk_status_t value that holds block layer specific
status codes and explicitly explains their meaning.  Helpers to convert from
and to the previous special meanings are provided for now, but I suspect
we want to get rid of them in the long run - those drivers that have a
errno input (e.g. networking) usually get errnos that don't know about
the special block layer overloads, and similarly returning them to userspace
will usually return somethings that strictly speaking isn't correct
for file system operations, but that's left as an exercise for later.

For now the set of errors is a very limited set that closely corresponds
to the previous overloaded errno values, but there is some low hanging
fruite to improve it.

blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
typechecking, so that we can easily catch places passing the wrong values.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09 09:27:32 -06:00
Jan Kara
dc3b17cc8b block: Use pointer to backing_dev_info from request_queue
We will want to have struct backing_dev_info allocated separately from
struct request_queue. As the first step add pointer to backing_dev_info
to request_queue and convert all users touching it. No functional
changes in this patch.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-02 08:20:48 -07:00
Jens Axboe
0cbc72a178 aoe: fix crash in page count manipulation
aoeblk contains some mysterious code, that wants to elevate the bio
vec page counts while it's under IO. That is not needed, it's
fragile, and it's causing kernel oopses for some.

Reported-by: Tested-by: Don Koch <kochd@us.ibm.com>
Tested-by: Tested-by: Don Koch <kochd@us.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-12 08:27:07 -07:00
Michal Hocko
32d6bd9059 tree wide: get rid of __GFP_REPEAT for order-0 allocations part I
This is the third version of the patchset previously sent [1].  I have
basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get
rid of superfluous gfp flags" which went through dm tree.  I am sending
it now because it is tree wide and chances for conflicts are reduced
considerably when we want to target rc2.  I plan to send the next step
and rename the flag and move to a better semantic later during this
release cycle so we will have a new semantic ready for 4.8 merge window
hopefully.

Motivation:

While working on something unrelated I've checked the current usage of
__GFP_REPEAT in the tree.  It seems that a majority of the usage is and
always has been bogus because __GFP_REPEAT has always been about costly
high order allocations while we are using it for order-0 or very small
orders very often.  It seems that a big pile of them is just a
copy&paste when a code has been adopted from one arch to another.

I think it makes some sense to get rid of them because they are just
making the semantic more unclear.  Please note that GFP_REPEAT is
documented as

* __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt

* _might_ fail.  This depends upon the particular VM implementation.
  while !costly requests have basically nofail semantic.  So one could
  reasonably expect that order-0 request with __GFP_REPEAT will not loop
  for ever.  This is not implemented right now though.

I would like to move on with __GFP_REPEAT and define a better semantic
for it.

  $ git grep __GFP_REPEAT origin/master | wc -l
  111
  $ git grep __GFP_REPEAT | wc -l
  36

So we are down to the third after this patch series.  The remaining
places really seem to be relying on __GFP_REPEAT due to large allocation
requests.  This still needs some double checking which I will do later
after all the simple ones are sorted out.

I am touching a lot of arch specific code here and I hope I got it right
but as a matter of fact I even didn't compile test for some archs as I
do not have cross compiler for them.  Patches should be quite trivial to
review for stupid compile mistakes though.  The tricky parts are usually
hidden by macro definitions and thats where I would appreciate help from
arch maintainers.

[1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org

This patch (of 19):

__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.  Yet we
have the full kernel tree with its usage for apparently order-0
allocations.  This is really confusing because __GFP_REPEAT is
explicitly documented to allow allocation failures which is a weaker
semantic than the current order-0 has (basically nofail).

Let's simply drop __GFP_REPEAT from those places.  This would allow to
identify place which really need allocator to retry harder and formulate
a more specific semantic for what the flag is supposed to do actually.

Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: John Crispin <blogic@openwrt.org>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-24 17:23:52 -07:00
Joonsoo Kim
0139aa7b7f mm: rename _count, field of the struct page, to _refcount
Many developers already know that field for reference count of the
struct page is _count and atomic type.  They would try to handle it
directly and this could break the purpose of page reference count
tracepoint.  To prevent direct _count modification, this patch rename it
to _refcount and add warning message on the code.  After that, developer
who need to handle reference count will find that field should not be
accessed directly.

[akpm@linux-foundation.org: fix comments, per Vlastimil]
[akpm@linux-foundation.org: Documentation/vm/transhuge.txt too]
[sfr@canb.auug.org.au: sync ethernet driver changes]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Manish Chopra <manish.chopra@qlogic.com>
Cc: Yuval Mintz <yuval.mintz@qlogic.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Kirill A. Shutemov
09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Joonsoo Kim
fe896d1878 mm: introduce page reference manipulation functions
The success of CMA allocation largely depends on the success of
migration and key factor of it is page reference count.  Until now, page
reference is manipulated by direct calling atomic functions so we cannot
follow up who and where manipulate it.  Then, it is hard to find actual
reason of CMA allocation failure.  CMA allocation should be guaranteed
to succeed so finding offending place is really important.

In this patch, call sites where page reference is manipulated are
converted to introduced wrapper function.  This is preparation step to
add tracepoint to each page reference manipulation function.  With this
facility, we can easily find reason of CMA allocation failure.  There is
no functional change in this patch.

In addition, this patch also converts reference read sites.  It will
help a second step that renames page._count to something else and
prevents later attempt to direct access to it (Suggested by Andrew).

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Al Viro
5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Jeff Moyer
30e2bc08b2 Revert "block: remove artifical max_hw_sectors cap"
This reverts commit 34b48db66e.
That commit caused performance regressions for streaming I/O
workloads on a number of different storage devices, from
SATA disks to external RAID arrays.  It also managed to
trip up some buggy firmware in at least one drive, causing
data corruption.

The next patch will bump the default max_sectors_kb value to
1280, which will accommodate a 10-data-disk stripe write
with chunk size 128k.  In the testing I've done using iozone,
fio, and aio-stress, a value of 1280 does not show a big
performance difference from 512.  This will hopefully still
help the software RAID setup that Christoph saw the original
performance gains with while still not regressing other
storage configurations.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-18 13:21:13 -07:00
Christoph Hellwig
4246a0b63b block: add a bi_error field to struct bio
Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-29 08:55:15 -06:00
Christoph Hellwig
34b48db66e block: remove artifical max_hw_sectors cap
Set max_sectors to the value the drivers provides as hardware limit by
default.  Linux had proper I/O throttling for a long time and doesn't
rely on a artifically small maximum I/O size anymore.  By not limiting
the I/O size by default we remove an annoying tuning step required for
most Linux installation.

Note that both the user, and if absolutely required the driver can still
impose a limit for FS requests below max_hw_sectors_kb.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-21 14:02:54 -06:00
David Rientjes
668f9abbd4 mm: close PageTail race
Commit bf6bddf192 ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).

This results in a very rare NULL pointer dereference on the
aforementioned page_count(page).  Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.

This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation.  This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling.  The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.

This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.

Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-04 07:55:47 -08:00
Kent Overstreet
feb261e2ee aoe: Convert to immutable biovecs
Now that we've got a mechanism for immutable biovecs -
bi_iter.bi_bvec_done - we need to convert drivers to use primitives that
respect it instead of using the bvec array directly.

The aoe code no longer has to manually iterate over partial bvecs, so
some struct members go away - other struct members are effectively
renamed:

buf->resid	-> buf->iter.bi_size
buf->sector	-> buf->iter.bi_sector

f->bcnt		-> f->iter.bi_size
f->lba		-> f->iter.bi_sector

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
2013-11-23 22:33:52 -08:00
Kent Overstreet
7988613b0e block: Convert bio_for_each_segment() to bvec_iter
More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly, they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: support@lsi.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-m68k@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: DL-MPTFusionLinux@lsi.com
Cc: linux-scsi@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: linux-fsdevel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: linux-mm@kvack.org
Acked-by: Geoff Levand <geoff@infradead.org>
2013-11-23 22:33:49 -08:00
Kent Overstreet
a4ad39b1d1 block: Convert bio_iovec() to bvec_iter
For immutable biovecs, we'll be introducing a new bio_iovec() that uses
our new bvec iterator to construct a biovec, taking into account
bvec_iter->bi_bvec_done - this patch updates existing users for the new
usage.

Some of the existing users really do need a pointer into the bvec array
- those uses are all going to be removed, but we'll need the
functionality from immutable to remove them - so for now rename the
existing bio_iovec() -> __bio_iovec(), and it'll be removed in a couple
patches.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
2013-11-23 22:33:49 -08:00
Kent Overstreet
4f024f3797 block: Abstract out bvec iterator
Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6
2013-11-23 22:33:47 -08:00
Ed Cashin
fea1b13973 aoe: do not BUG if memory pressure prevented debugfs file creation
If the system has trouble allocating memory for the creation of the aoe
debugfs directory or of a file inside it, the debugfs member of an aoedev
can be NULL.

Do not treat a NULL debugfs pointer as a BUG on aoedev shutdown, avoiding
the user impact of an unecessary panic.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:28 -07:00
Andy Shevchenko
e0ec360597 aoe: suppress compiler warnings
This patch fixes following compiler warnings:

  drivers/block/aoe/aoecmd.c: In function `aoecmd_ata_rw':
  drivers/block/aoe/aoecmd.c:383:17: warning: variable `t' set but not used [-Wunused-but-set-variable]
    struct aoetgt *t;
                   ^
  drivers/block/aoe/aoecmd.c: In function `resend':
  drivers/block/aoe/aoecmd.c:488:21: warning: variable `ah' set but not used [-Wunused-but-set-variable]
    struct aoe_atahdr *ah;
                       ^

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:27 -07:00
Andy Shevchenko
a88c1f0cac aoe: remove custom implementation of kbasename()
In the kernel we have a nice helper that may be used here. This patch
substitutes the custom implementation by the native function call.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:26 -07:00
Ed Cashin
896dcd9a64 aoe: update internal version number to 85
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:26 -07:00
Ed Cashin
ec345120c5 aoe: update copyright date
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:25 -07:00
Ed Cashin
2256c1c51e aoe: fill in per-AoE-target information for debugfs file
This information is presented in a compact format that has evolved for
easy routine scanning by expert humans, mostly developers and support
technicians helping to troubleshoot or test AoE-based systems.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:25 -07:00
Ed Cashin
1cf94797c2 aoe: provide file operations for debugfs files
The place holder in the file contents is filled out in the following
patch.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:24 -07:00
Ed Cashin
e8866cf2b9 aoe: add AoE-target files to debugfs
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:23 -07:00
Ed Cashin
190519cd30 aoe: create and destroy debugfs directory for aoe
This series adds the debugging information that the coraid.com-distributed
aoe driver exports via sysfs, but instead of sysfs, it uses debugfs.

With these patches applied, even without AoE targets on the network, KEDR
reports new possible memory leaks, but these are from callers outside the
aoe driver that have used aoe_devnode to get the name of the character
devices through the aoe_class->devnode callback, and I believe they're
responsible for freeing that memory.

This patch:

Create and destroy the debugfs directory.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:22 -07:00
Ed Cashin
fb32975d1b aoe: adjust ref of head for compound page tails
Fix a BUG which can trigger when direct-IO is used with AOE.

As discussed previously, the fact that some users of the block layer
provide bios that point to pages with a zero _count means that it is not
OK for the network layer to do a put_page on the skb frags during an
skb_linearize, so the aoe driver gets a reference to pages in bios and
puts the reference before ending the bio.  And because it cannot use
get_page on a page with a zero _count, it manipulates the value
directly.

It is not OK to increment the _count of a compound page tail, though,
since the VM layer will VM_BUG_ON a non-zero _count.  Block users that
do direct I/O can result in the aoe driver seeing compound page tails in
bios.  In that case, the same logic works as long as the head of the
compound page is used instead of the tails.  This patch handles compound
pages and does not BUG.

It relies on the block layer user leaving the relationship between the
page tail and its head alone for the duration between the submission of
the bio and its completion, whether successful or not.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:48 -07:00
Ed Cashin
94ac11833f aoe: update internal version number to v83
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:08:05 -07:00
Ed Cashin
ca47bbd93c aoe: update copyright date
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:08:05 -07:00
Ed Cashin
8030d34397 aoe: perform I/O completions in parallel
Some users have a large AoE target while others like to use many AoE
targets at the same time.  In the latter case, there is an opportunity to
greatly improve aggregate throughput by allowing different threads to
complete the I/O associated with each target.  For 36 targets, 4 KiB read
throughput roughly doubles, for example, with these changes in place.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:08:05 -07:00
Kees Cook
f170168b9a drivers: avoid parsing names as kthread_run() format strings
Calling kthread_run with a single name parameter causes it to be handled
as a format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:41 -07:00
Linus Torvalds
ebb3727779 Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "It might look big in volume, but when categorized, not a lot of
  drivers are touched.  The pull request contains:

   - mtip32xx fixes from Micron.

   - A slew of drbd updates, this time in a nicer series.

   - bcache, a flash/ssd caching framework from Kent.

   - Fixes for cciss"

* 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits)
  bcache: Use bd_link_disk_holder()
  bcache: Allocator cleanup/fixes
  cciss: bug fix to prevent cciss from loading in kdump crash kernel
  cciss: add cciss_allow_hpsa module parameter
  drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions
  mtip32xx: Workaround for unaligned writes
  bcache: Make sure blocksize isn't smaller than device blocksize
  bcache: Fix merge_bvec_fn usage for when it modifies the bvm
  bcache: Correctly check against BIO_MAX_PAGES
  bcache: Hack around stuff that clones up to bi_max_vecs
  bcache: Set ra_pages based on backing device's ra_pages
  bcache: Take data offset from the bdev superblock.
  mtip32xx: mtip32xx: Disable TRIM support
  mtip32xx: fix a smatch warning
  bcache: Disable broken btree fuzz tester
  bcache: Fix a format string overflow
  bcache: Fix a minor memory leak on device teardown
  bcache: Documentation updates
  bcache: Use WARN_ONCE() instead of __WARN()
  bcache: Add missing #include <linux/prefetch.h>
  ...
2013-05-08 11:51:05 -07:00
Linus Torvalds
4de13d7aa8 Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block
Pull block core updates from Jens Axboe:

 - Major bit is Kents prep work for immutable bio vecs.

 - Stable candidate fix for a scheduling-while-atomic in the queue
   bypass operation.

 - Fix for the hang on exceeded rq->datalen 32-bit unsigned when merging
   discard bios.

 - Tejuns changes to convert the writeback thread pool to the generic
   workqueue mechanism.

 - Runtime PM framework, SCSI patches exists on top of these in James'
   tree.

 - A few random fixes.

* 'for-3.10/core' of git://git.kernel.dk/linux-block: (40 commits)
  relay: move remove_buf_file inside relay_close_buf
  partitions/efi.c: replace useless kzalloc's by kmalloc's
  fs/block_dev.c: fix iov_shorten() criteria in blkdev_aio_read()
  block: fix max discard sectors limit
  blkcg: fix "scheduling while atomic" in blk_queue_bypass_start
  Documentation: cfq-iosched: update documentation help for cfq tunables
  writeback: expose the bdi_wq workqueue
  writeback: replace custom worker pool implementation with unbound workqueue
  writeback: remove unused bdi_pending_list
  aoe: Fix unitialized var usage
  bio-integrity: Add explicit field for owner of bip_buf
  block: Add an explicit bio flag for bios that own their bvec
  block: Add bio_alloc_pages()
  block: Convert some code to bio_for_each_segment_all()
  block: Add bio_for_each_segment_all()
  bounce: Refactor __blk_queue_bounce to not use bi_io_vec
  raid1: use bio_copy_data()
  pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage
  pktcdvd: use bio_copy_data()
  block: Add bio_copy_data()
  ...
2013-05-08 10:13:35 -07:00
Al Viro
db2a144bed block_device_operations->release() should return void
The value passed is 0 in all but "it can never happen" cases (and those
only in a couple of drivers) *and* it would've been lost on the way
out anyway, even if something tried to pass something meaningful.
Just don't bother.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-07 02:16:21 -04:00
Mihnea Dobrescu-Balaur
60abc786dd aoe: replace kmalloc and then memcpy with kmemdup
Signed-off-by: Mihnea Dobrescu-Balaur <mihneadb@gmail.com>
Cc: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:08 -07:00