Commit Graph

120 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
817780c598 Merge 5.15.56 into android13-5.15-lts
Changes in 5.15.56
	ALSA: hda - Add fixup for Dell Latitidue E5430
	ALSA: hda/conexant: Apply quirk for another HP ProDesk 600 G3 model
	ALSA: hda/realtek: Fix headset mic for Acer SF313-51
	ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
	ALSA: hda/realtek: fix mute/micmute LEDs for HP machines
	ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221
	ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
	xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
	fix race between exit_itimers() and /proc/pid/timers
	mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
	mm: split huge PUD on wp_huge_pud fallback
	tracing/histograms: Fix memory leak problem
	net: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer
	ip: fix dflt addr selection for connected nexthop
	ARM: 9213/1: Print message about disabled Spectre workarounds only once
	ARM: 9214/1: alignment: advance IT state after emulating Thumb instruction
	wifi: mac80211: fix queue selection for mesh/OCB interfaces
	cgroup: Use separate src/dst nodes when preloading css_sets for migration
	btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents
	drm/panfrost: Put mapping instead of shmem obj on panfrost_mmu_map_fault_addr() error
	drm/panfrost: Fix shrinker list corruption by madvise IOCTL
	fs/remap: constrain dedupe of EOF blocks
	nilfs2: fix incorrect masking of permission flags for symlinks
	sh: convert nommu io{re,un}map() to static inline functions
	Revert "evm: Fix memleak in init_desc"
	xfs: only run COW extent recovery when there are no live extents
	xfs: don't include bnobt blocks when reserving free block pool
	xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks
	xfs: drop async cache flushes from CIL commits.
	reset: Fix devm bulk optional exclusive control getter
	ARM: dts: imx6qdl-ts7970: Fix ngpio typo and count
	spi: amd: Limit max transfer and message size
	ARM: 9209/1: Spectre-BHB: avoid pr_info() every time a CPU comes out of idle
	ARM: 9210/1: Mark the FDT_FIXED sections as shareable
	net/mlx5e: kTLS, Fix build time constant test in TX
	net/mlx5e: kTLS, Fix build time constant test in RX
	net/mlx5e: Fix enabling sriov while tc nic rules are offloaded
	net/mlx5e: Fix capability check for updating vnic env counters
	net/mlx5e: Ring the TX doorbell on DMA errors
	drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
	ima: Fix a potential integer overflow in ima_appraise_measurement
	ASoC: sgtl5000: Fix noise on shutdown/remove
	ASoC: tas2764: Add post reset delays
	ASoC: tas2764: Fix and extend FSYNC polarity handling
	ASoC: tas2764: Correct playback volume range
	ASoC: tas2764: Fix amp gain register offset & default
	ASoC: Intel: Skylake: Correct the ssp rate discovery in skl_get_ssp_clks()
	ASoC: Intel: Skylake: Correct the handling of fmt_config flexible array
	net: stmmac: dwc-qos: Disable split header for Tegra194
	net: ethernet: ti: am65-cpsw: Fix devlink port register sequence
	sysctl: Fix data races in proc_dointvec().
	sysctl: Fix data races in proc_douintvec().
	sysctl: Fix data races in proc_dointvec_minmax().
	sysctl: Fix data races in proc_douintvec_minmax().
	sysctl: Fix data races in proc_doulongvec_minmax().
	sysctl: Fix data races in proc_dointvec_jiffies().
	tcp: Fix a data-race around sysctl_tcp_max_orphans.
	inetpeer: Fix data-races around sysctl.
	net: Fix data-races around sysctl_mem.
	cipso: Fix data-races around sysctl.
	icmp: Fix data-races around sysctl.
	ipv4: Fix a data-race around sysctl_fib_sync_mem.
	ARM: dts: at91: sama5d2: Fix typo in i2s1 node
	ARM: dts: sunxi: Fix SPI NOR campatible on Orange Pi Zero
	arm64: dts: broadcom: bcm4908: Fix timer node for BCM4906 SoC
	arm64: dts: broadcom: bcm4908: Fix cpu node for smp boot
	netfilter: nf_log: incorrect offset to network header
	netfilter: nf_tables: replace BUG_ON by element length check
	drm/i915/gvt: IS_ERR() vs NULL bug in intel_gvt_update_reg_whitelist()
	xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
	lockd: set fl_owner when unlocking files
	lockd: fix nlm_close_files
	tracing: Fix sleeping while atomic in kdb ftdump
	drm/i915/selftests: fix a couple IS_ERR() vs NULL tests
	drm/i915/dg2: Add Wa_22011100796
	drm/i915/gt: Serialize GRDOM access between multiple engine resets
	drm/i915/gt: Serialize TLB invalidates with GT resets
	drm/i915/uc: correctly track uc_fw init failure
	drm/i915: Require the vm mutex for i915_vma_bind()
	bnxt_en: Fix bnxt_reinit_after_abort() code path
	bnxt_en: Fix bnxt_refclk_read()
	sysctl: Fix data-races in proc_dou8vec_minmax().
	sysctl: Fix data-races in proc_dointvec_ms_jiffies().
	icmp: Fix data-races around sysctl_icmp_echo_enable_probe.
	icmp: Fix a data-race around sysctl_icmp_ignore_bogus_error_responses.
	icmp: Fix a data-race around sysctl_icmp_errors_use_inbound_ifaddr.
	icmp: Fix a data-race around sysctl_icmp_ratelimit.
	icmp: Fix a data-race around sysctl_icmp_ratemask.
	raw: Fix a data-race around sysctl_raw_l3mdev_accept.
	tcp: Fix a data-race around sysctl_tcp_ecn_fallback.
	ipv4: Fix data-races around sysctl_ip_dynaddr.
	nexthop: Fix data-races around nexthop_compat_mode.
	net: ftgmac100: Hold reference returned by of_get_child_by_name()
	net: stmmac: fix leaks in probe
	ima: force signature verification when CONFIG_KEXEC_SIG is configured
	ima: Fix potential memory leak in ima_init_crypto()
	drm/amd/display: Only use depth 36 bpp linebuffers on DCN display engines.
	drm/amd/pm: Prevent divide by zero
	sfc: fix use after free when disabling sriov
	ceph: switch netfs read ops to use rreq->inode instead of rreq->mapping->host
	seg6: fix skb checksum evaluation in SRH encapsulation/insertion
	seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
	seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
	sfc: fix kernel panic when creating VF
	net: atlantic: remove deep parameter on suspend/resume functions
	net: atlantic: remove aq_nic_deinit() when resume
	KVM: x86: Fully initialize 'struct kvm_lapic_irq' in kvm_pv_kick_cpu_op()
	net/tls: Check for errors in tls_device_init
	ACPI: video: Fix acpi_video_handles_brightness_key_presses()
	mm: sysctl: fix missing numa_stat when !CONFIG_HUGETLB_PAGE
	btrfs: rename btrfs_bio to btrfs_io_context
	btrfs: zoned: fix a leaked bioc in read_zone_info
	ksmbd: use SOCK_NONBLOCK type for kernel_accept()
	powerpc/xive/spapr: correct bitmap allocation size
	vdpa/mlx5: Initialize CVQ vringh only once
	vduse: Tie vduse mgmtdev and its device
	virtio_mmio: Add missing PM calls to freeze/restore
	virtio_mmio: Restore guest page size on resume
	netfilter: br_netfilter: do not skip all hooks with 0 priority
	scsi: hisi_sas: Limit max hw sectors for v3 HW
	cpufreq: pmac32-cpufreq: Fix refcount leak bug
	platform/x86: hp-wmi: Ignore Sanitization Mode event
	firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
	firmware: sysfb: Add sysfb_disable() helper function
	fbdev: Disable sysfb device registration when removing conflicting FBs
	net: tipc: fix possible refcount leak in tipc_sk_create()
	NFC: nxp-nci: don't print header length mismatch on i2c error
	nvme-tcp: always fail a request when sending it failed
	nvme: fix regression when disconnect a recovering ctrl
	net: sfp: fix memory leak in sfp_probe()
	ASoC: ops: Fix off by one in range control validation
	pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
	ASoC: Realtek/Maxim SoundWire codecs: disable pm_runtime on remove
	ASoC: rt711-sdca-sdw: fix calibrate mutex initialization
	ASoC: Intel: sof_sdw: handle errors on card registration
	ASoC: rt711: fix calibrate mutex initialization
	ASoC: rt7*-sdw: harden jack_detect_handler
	ASoC: codecs: rt700/rt711/rt711-sdca: initialize workqueues in probe
	ASoC: SOF: Intel: hda-loader: Clarify the cl_dsp_init() flow
	ASoC: wcd938x: Fix event generation for some controls
	ASoC: Intel: bytcr_wm5102: Fix GPIO related probe-ordering problem
	ASoC: wm5110: Fix DRE control
	ASoC: rt711-sdca: fix kernel NULL pointer dereference when IO error
	ASoC: dapm: Initialise kcontrol data for mux/demux controls
	ASoC: cs47l15: Fix event generation for low power mux control
	ASoC: madera: Fix event generation for OUT1 demux
	ASoC: madera: Fix event generation for rate controls
	irqchip: or1k-pic: Undefine mask_ack for level triggered hardware
	x86: Clear .brk area at early boot
	soc: ixp4xx/npe: Fix unused match warning
	ARM: dts: stm32: use the correct clock source for CEC on stm32mp151
	Revert "can: xilinx_can: Limit CANFD brp to 2"
	ALSA: usb-audio: Add quirks for MacroSilicon MS2100/MS2106 devices
	ALSA: usb-audio: Add quirk for Fiero SC-01
	ALSA: usb-audio: Add quirk for Fiero SC-01 (fw v1.0.0)
	nvme-pci: phison e16 has bogus namespace ids
	signal handling: don't use BUG_ON() for debugging
	USB: serial: ftdi_sio: add Belimo device ids
	usb: typec: add missing uevent when partner support PD
	usb: dwc3: gadget: Fix event pending check
	tty: serial: samsung_tty: set dma burst_size to 1
	vt: fix memory overlapping when deleting chars in the buffer
	serial: 8250: fix return error code in serial8250_request_std_resource()
	serial: stm32: Clear prev values before setting RTS delays
	serial: pl011: UPSTAT_AUTORTS requires .throttle/unthrottle
	serial: 8250: Fix PM usage_count for console handover
	x86/pat: Fix x86_has_pat_wp()
	drm/aperture: Run fbdev removal before internal helpers
	Linux 5.15.56

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ib44efbedc8fea205005bd3ec2806ebb8cb19710d
2022-08-10 16:54:10 +02:00
Tariq Toukan
724ec407f9 net/tls: Check for errors in tls_device_init
[ Upstream commit 3d8c51b25a235e283e37750943bbf356ef187230 ]

Add missing error checks in tls_device_init.

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220714070754.1428-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-21 21:24:31 +02:00
Greg Kroah-Hartman
bd3b9b13ef ANDROID: GKI: networking: add Android ABI padding to a lot of networking structures
Try to mitigate potential future driver core api changes by adding a
padding to a lot of different networking structures:
	struct ipv6_devconf
	struct proto_ops
	struct header_ops
	struct napi_struct
	struct netdev_queue
	struct netdev_rx_queue
	struct xfrmdev_ops
	struct net_device_ops
	struct net_device
	struct packet_type
	struct sk_buff
	struct tlsdev_ops

Based on a change made to the RHEL/CENTOS 8 kernel.

Bug: 151154716
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I590f004754dbc8beafa40e71cac70a0938c38b4a
2022-06-18 18:44:14 +00:00
Greg Kroah-Hartman
521f2e62a3 ANDROID: add kabi padding for structures for the android13 release
There are a lot of different structures that need to have a "frozen" abi
for the next 5+ years.  Add padding to a lot of them in order to be able
to handle any future changes that might be needed due to LTS and
security fixes that might come up.

It's a best guess, based on what has happened in the past from the
5.10.0..5.10.110 release (1 1/2 years).  Yes, past changes do not mean
that future changes will also be needed in the same area, but that is a
hint that those areas are both well maintained and looked after, and
there have been previous problems found in them.

Also the list of structures that are being required based on OEM usage
in the android/ symbol lists were consulted as that's a larger list than
what has been changed in the past.

Hopefully we caught everything we need to worry about, only time will
tell...

Bug: 151154716
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I880bbcda0628a7459988eeb49d18655522697664
2022-05-04 13:39:14 -07:00
Daniel Jordan
da353fac65 net/tls: Fix flipped sign in tls_err_abort() calls
sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,

    [kworker]
    tls_encrypt_done(..., err=<negative error from crypto request>)
      tls_err_abort(.., err)
        sk->sk_err = err;

    [task]
    splice_from_pipe_feed
      ...
        tls_sw_do_sendpage
          if (sk->sk_err) {
            ret = -sk->sk_err;  // ret is positive

    splice_from_pipe_feed (continued)
      ret = actor(...)  // ret is still positive and interpreted as bytes
                        // written, resulting in underflow of buf->len and
                        // sd->len, leading to huge buf->offset and bogus
                        // addresses computed in later calls to actor()

Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.

Cc: stable@vger.kernel.org
Fixes: c46234ebb4 ("tls: RX path for ktls")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 14:41:20 +01:00
Cong Wang
7b50ecfcc6 net: Rename ->stream_memory_read to ->sock_is_readable
The proto ops ->stream_memory_read() is currently only used
by TCP to check whether psock queue is empty or not. We need
to rename it before reusing it for non-TCP protocols, and
adjust the exsiting users accordingly.

Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com
2021-10-26 12:29:33 -07:00
Alexander Aring
e3ae2365ef net: sock: introduce sk_error_report
This patch introduces a function wrapper to call the sk_error_report
callback. That will prepare to add additional handling whenever
sk_error_report is called, for example to trace socket errors.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29 11:28:21 -07:00
Kuniyuki Iwashima
10ed7ce42b net/tls: Remove the __TLS_DEC_STATS() macro.
The commit d26b698dd3 ("net/tls: add skeleton of MIB statistics")
introduced __TLS_DEC_STATS(), but it is not used and __SNMP_DEC_STATS() is
not defined also. Let's remove it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23 13:47:55 -07:00
Maxim Mikityanskiy
c55dcdd435 net/tls: Fix use-after-free after the TLS device goes down and up
When a netdev with active TLS offload goes down, tls_device_down is
called to stop the offload and tear down the TLS context. However, the
socket stays alive, and it still points to the TLS context, which is now
deallocated. If a netdev goes up, while the connection is still active,
and the data flow resumes after a number of TCP retransmissions, it will
lead to a use-after-free of the TLS context.

This commit addresses this bug by keeping the context alive until its
normal destruction, and implements the necessary fallbacks, so that the
connection can resume in software (non-offloaded) kTLS mode.

On the TX side tls_sw_fallback is used to encrypt all packets. The RX
side already has all the necessary fallbacks, because receiving
non-decrypted packets is supported. The thing needed on the RX side is
to block resync requests, which are normally produced after receiving
non-decrypted packets.

The necessary synchronization is implemented for a graceful teardown:
first the fallbacks are deployed, then the driver resources are released
(it used to be possible to have a tls_dev_resync after tls_dev_del).

A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
mode. It's used to skip the RX resync logic completely, as it becomes
useless, and some objects may be released (for example, resync_async,
which is allocated and freed by the driver).

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:58:05 -07:00
Maxim Mikityanskiy
05fc8b6cbd net/tls: Replace TLS_RX_SYNC_RUNNING with RCU
RCU synchronization is guaranteed to finish in finite time, unlike a
busy loop that polls a flag. This patch is a preparation for the bugfix
in the next patch, where the same synchronize_net() call will also be
used to sync with the TX datapath.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:58:05 -07:00
Jakub Kicinski
5c39f26e67 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in CAN, keep the net-next + the byteswap wrapper.

Conflicts:
	drivers/net/can/usb/gs_usb.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 18:25:27 -08:00
Vadim Fedorenko
a6acbe6235 net/tls: add CHACHA20-POLY1305 specific behavior
RFC 7905 defines special behavior for ChaCha-Poly TLS sessions.
The differences are in the calculation of nonce and the absence
of explicit IV. This behavior is like TLSv1.3 partly.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Vadim Fedorenko
923c40c465 net/tls: add CHACHA20-POLY1305 specific defines and structures
To provide support for ChaCha-Poly cipher we need to define
specific constants and structures.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Vadim Fedorenko
6942a284fb net/tls: make inline helpers protocol-aware
Inline functions defined in tls.h have a lot of AES-specific
constants. Remove these constants and change argument to struct
tls_prot_info to have an access to cipher type in later patches

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Maxim Mikityanskiy
025cc2fb6a net/tls: Protect from calling tls_dev_del for TLS RX twice
tls_device_offload_cleanup_rx doesn't clear tls_ctx->netdev after
calling tls_dev_del if TLX TX offload is also enabled. Clearing
tls_ctx->netdev gets postponed until tls_device_gc_task. It leaves a
time frame when tls_device_down may get called and call tls_dev_del for
RX one extra time, confusing the driver, which may lead to a crash.

This patch corrects this racy behavior by adding a flag to prevent
tls_device_down from calling tls_dev_del the second time.

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20201125221810.69870-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:31:06 -08:00
Tariq Toukan
138559b9f9 net/tls: Fix wrong record sn in async mode of device resync
In async_resync mode, we log the TCP seq of records until the async request
is completed.  Later, in case one of the logged seqs matches the resync
request, we return it, together with its record serial number.  Before this
fix, we mistakenly returned the serial number of the current record
instead.

Fixes: ed9b7646b0 ("net/tls: Add asynchronous resync")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Link: https://lore.kernel.org/r/20201115131448.2702-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 14:41:20 -08:00
Randy Dunlap
923527dcb4 net/tls: remove a duplicate function prototype
Remove one of the two instances of the function prototype for
tls_validate_xmit_skb().

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Boris Pismenny <borisp@nvidia.com>
Cc: Aviad Yehezkel <aviadye@nvidia.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-09 16:49:57 -07:00
Colin Ian King
a6ed3ebca4 net/tls: fix sign extension issue when left shifting u16 value
Left shifting the u16 value promotes it to a int and then it
gets sign extended to a u64.  If len << 16 is greater than 0x7fffffff
then the upper bits get set to 1 because of the implicit sign extension.
Fix this by casting len to u64 before shifting it.

Addresses-Coverity: ("integer handling issues")
Fixes: ed9b7646b0 ("net/tls: Add asynchronous resync")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 13:36:56 -07:00
Boris Pismenny
ed9b7646b0 net/tls: Add asynchronous resync
This patch adds support for asynchronous resynchronization in tls_device.
Async resync follows two distinct stages:

1. The NIC driver indicates that it would like to resync on some TLS
record within the received packet (P), but the driver does not
know (yet) which of the TLS records within the packet.
At this stage, the NIC driver will query the device to find the exact
TCP sequence for resync (tcpsn), however, the driver does not wait
for the device to provide the response.

2. Eventually, the device responds, and the driver provides the tcpsn
within the resync packet to KTLS. Now, KTLS can check the tcpsn against
any processed TLS records within packet P, and also against any record
that is processed in the future within packet P.

The asynchronous resync path simplifies the device driver, as it can
save bits on the packet completion (32-bit TCP sequence), and pass this
information on an asynchronous command instead.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:22 -07:00
Boris Pismenny
acb5a07aaf Revert "net/tls: Add force_resync for driver resync"
This reverts commit b3ae2459f8.
Revert the force resync API.
Not in use. To be replaced by a better async resync API downstream.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-27 14:00:21 -07:00
John Fastabend
e91de6afa8 bpf: Fix running sk_skb program types with ktls
KTLS uses a stream parser to collect TLS messages and send them to
the upper layer tls receive handler. This ensures the tls receiver
has a full TLS header to parse when it is run. However, when a
socket has BPF_SK_SKB_STREAM_VERDICT program attached before KTLS
is enabled we end up with two stream parsers running on the same
socket.

The result is both try to run on the same socket. First the KTLS
stream parser runs and calls read_sock() which will tcp_read_sock
which in turn calls tcp_rcv_skb(). This dequeues the skb from the
sk_receive_queue. When this is done KTLS code then data_ready()
callback which because we stacked KTLS on top of the bpf stream
verdict program has been replaced with sk_psock_start_strp(). This
will in turn kick the stream parser again and eventually do the
same thing KTLS did above calling into tcp_rcv_skb() and dequeuing
a skb from the sk_receive_queue.

At this point the data stream is broke. Part of the stream was
handled by the KTLS side some other bytes may have been handled
by the BPF side. Generally this results in either missing data
or more likely a "Bad Message" complaint from the kTLS receive
handler as the BPF program steals some bytes meant to be in a
TLS header and/or the TLS header length is no longer correct.

We've already broke the idealized model where we can stack ULPs
in any order with generic callbacks on the TX side to handle this.
So in this patch we do the same thing but for RX side. We add
a sk_psock_strp_enabled() helper so TLS can learn a BPF verdict
program is running and add a tls_sw_has_ctx_rx() helper so BPF
side can learn there is a TLS ULP on the socket.

Then on BPF side we omit calling our stream parser to avoid
breaking the data stream for the KTLS receiver. Then on the
KTLS side we call BPF_SK_SKB_STREAM_VERDICT once the KTLS
receiver is done with the packet but before it posts the
msg to userspace. This gives us symmetry between the TX and
RX halfs and IMO makes it usable again. On the TX side we
process packets in this order BPF -> TLS -> TCP and on
the receive side in the reverse order TCP -> TLS -> BPF.

Discovered while testing OpenSSL 3.0 Alpha2.0 release.

Fixes: d829e9c411 ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159079361946.5745.605854335665044485.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:48:32 -07:00
David S. Miller
1806c13dc2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_umem.c had overlapping changes between the 64-bit math fix
for the calculation of npgs and the removal of the zerocopy
memory type which got rid of the chunk_size_nohdr member.

The mlx5 Kconfig conflict is a case where we just take the
net-next copy of the Kconfig entry dependency as it takes on
the ESWITCH dependency by one level of indirection which is
what the 'net' conflicting change is trying to ensure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-31 17:48:46 -07:00
Tariq Toukan
b3ae2459f8 net/tls: Add force_resync for driver resync
This patch adds a field to the tls rx offload context which enables
drivers to force a send_resync call.

This field can be used by drivers to request a resync at the next
possible tls record. It is beneficial for hardware that provides the
resync sequence number asynchronously. In such cases, the packet that
triggered the resync does not contain the information required for a
resync. Instead, the driver requests resync for all the following
TLS record until the asynchronous notification with the resync request
TCP sequence arrives.

A following series for mlx5e ConnectX-6DX TLS RX offload support will
use this mechanism.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 15:07:13 -07:00
Vinay Kumar Yadav
0cada33241 net/tls: fix race condition causing kernel panic
tls_sw_recvmsg() and tls_decrypt_done() can be run concurrently.
// tls_sw_recvmsg()
	if (atomic_read(&ctx->decrypt_pending))
		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
	else
		reinit_completion(&ctx->async_wait.completion);

//tls_decrypt_done()
  	pending = atomic_dec_return(&ctx->decrypt_pending);

  	if (!pending && READ_ONCE(ctx->async_notify))
  		complete(&ctx->async_wait.completion);

Consider the scenario tls_decrypt_done() is about to run complete()

	if (!pending && READ_ONCE(ctx->async_notify))

and tls_sw_recvmsg() reads decrypt_pending == 0, does reinit_completion(),
then tls_decrypt_done() runs complete(). This sequence of execution
results in wrong completion. Consequently, for next decrypt request,
it will not wait for completion, eventually on connection close, crypto
resources freed, there is no way to handle pending decrypt response.

This race condition can be avoided by having atomic_read() mutually
exclusive with atomic_dec_return(),complete().Intoduced spin lock to
ensure the mutual exclution.

Addressed similar problem in tx direction.

v1->v2:
- More readable commit message.
- Corrected the lock to fix new race scenario.
- Removed barrier which is not needed now.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-25 17:41:40 -07:00
Jakub Kicinski
8d5a49e9e3 net/tls: add helper for testing if socket is RX offloaded
There is currently no way for driver to reliably check that
the socket it has looked up is in fact RX offloaded. Add
a helper. This allows drivers to catch misbehaving firmware.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19 17:46:51 -08:00
Jakub Kicinski
c5daa6cccd net/tls: use sg_next() to walk sg entries
Partially sent record cleanup path increments an SG entry
directly instead of using sg_next(). This should not be a
problem today, as encrypted messages should be always
allocated as arrays. But given this is a cleanup path it's
easy to miss was this ever to change. Use sg_next(), and
simplify the code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28 22:40:29 -08:00
Jakub Kicinski
9e5ffed37d net/tls: remove the dead inplace_crypto code
Looks like when BPF support was added by commit d3b18ad31f
("tls: add bpf support to sk_msg handling") and
commit d829e9c411 ("tls: convert to generic sk_msg interface")
it broke/removed the support for in-place crypto as added by
commit 4e6d47206c ("tls: Add support for inplace records
encryption").

The inplace_crypto member of struct tls_rec is dead, inited
to zero, and sometimes set to zero again. It used to be
set to 1 when record was allocated, but the skmsg code doesn't
seem to have been written with the idea of in-place crypto
in mind.

Since non trivial effort is required to bring the feature back
and we don't really have the HW to measure the benefit just
remove the left over support for now to avoid confusing readers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28 22:40:29 -08:00
Jakub Kicinski
a9f852e92e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock
from commit c8183f5489 ("s390/qeth: fix potential deadlock on
workqueue flush"), removed the code which was removed by commit
9897d583b0 ("s390/qeth: consolidate some duplicated HW cmd code").

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22 16:27:24 -08:00
Willem de Bruijn
d4ffb02dee net/tls: enable sk_msg redirect to tls socket egress
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket
with TLS_TX takes the following path:

  tcp_bpf_sendmsg_redir
    tcp_bpf_push_locked
      tcp_bpf_push
        kernel_sendpage_locked
          sock->ops->sendpage_locked

Also update the flags test in tls_sw_sendpage_locked to allow flag
MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.

Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
Link: https://github.com/wdebruij/kerneltools/commits/icept.2
Fixes: 0608c69c9a ("bpf: sk_msg, sock{map|hash} redirect through ULP")
Fixes: f3de19af0f ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19 15:03:02 -08:00
David S. Miller
14684b9301 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
One conflict in the BPF samples Makefile, some fixes in 'net' whilst
we were converting over to Makefile.target rules in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09 11:04:37 -08:00
Jakub Kicinski
79ffe6087e net/tls: add a TX lock
TLS TX needs to release and re-acquire the socket lock if send buffer
fills up.

TLS SW TX path currently depends on only allowing one thread to enter
the function by the abuse of sk_write_pending. If another writer is
already waiting for memory no new ones are allowed in.

This has two problems:
 - writers don't wake other threads up when they leave the kernel;
   meaning that this scheme works for single extra thread (second
   application thread or delayed work) because memory becoming
   available will send a wake up request, but as Mallesham and
   Pooja report with larger number of threads it leads to threads
   being put to sleep indefinitely;
 - the delayed work does not get _scheduled_ but it may _run_ when
   other writers are present leading to crashes as writers don't
   expect state to change under their feet (same records get pushed
   and freed multiple times); it's hard to reliably bail from the
   work, however, because the mere presence of a writer does not
   guarantee that the writer will push pending records before exiting.

Ensuring wakeups always happen will make the code basically open
code a mutex. Just use a mutex.

The TLS HW TX path does not have any locking (not even the
sk_write_pending hack), yet it uses a per-socket sg_tx_data
array to push records.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Mallesham  Jatharakonda <mallesh537@gmail.com>
Reported-by: Pooja Trivedi <poojatrivedi@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Jakub Kicinski
bc76e5bb12 net/tls: store decrypted on a single bit
Use a single bit instead of boolean to remember if packet
was already decrypted.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-07 09:58:28 -04:00
Jakub Kicinski
5c5458ec9d net/tls: store async_capable on a single bit
Store async_capable on a single bit instead of a full integer
to save space.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-07 09:58:27 -04:00
Jakub Kicinski
4de30a8d58 net/tls: pass context to tls_device_decrypted()
Avoid unnecessary pointer chasing and calculations, callers already
have most of the state tls_device_decrypted() needs.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-07 09:58:27 -04:00
Jakub Kicinski
d26b698dd3 net/tls: add skeleton of MIB statistics
Add a skeleton structure for adding TLS statistics.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-05 16:29:00 -07:00
Jakub Kicinski
8538d29cea net/tls: add tracing for device/offload events
Add tracing of device-related interaction to aid performance
analysis, especially around resync:

 tls:tls_device_offload_set
 tls:tls_device_rx_resync_send
 tls:tls_device_rx_resync_nh_schedule
 tls:tls_device_rx_resync_nh_delay
 tls:tls_device_tx_resync_req
 tls:tls_device_tx_resync_send

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-05 16:29:00 -07:00
Jakub Kicinski
08700dab81 net/tls: move TOE-related code to a separate file
Move tls_hw_* functions to a new, separate source file
to avoid confusion with normal, non-TOE offload.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04 14:07:07 -07:00
Jakub Kicinski
25a3cd8189 net/tls: move TOE-related structures to a separate header
Move tls_device structure and register/unregister functions
to a new header to avoid confusion with normal, non-TOE offload.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04 14:07:06 -07:00
Jakub Kicinski
be2fbc155f net/tls: clean up the number of #ifdefs for CONFIG_TLS_DEVICE
TLS code has a number of #ifdefs which make the code a little
harder to follow. Recent fixes removed the ifdef around the
TLS_HW define, so we can switch to the often used pattern
of defining tls_device functions as empty static inlines
in the header when CONFIG_TLS_DEVICE=n.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:49:49 +02:00
Jakub Kicinski
be7bbea114 net/tls: use the full sk_proto pointer
Since we already have the pointer to the full original sk_proto
stored use that instead of storing all individual callback
pointers as well.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:49:49 +02:00
Davide Caratti
26811cc9f5 net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diag
When an application configures kernel TLS on top of a TCP socket, it's
now possible for inet_diag_handler() to collect information regarding the
protocol version, the cipher type and TX / RX configuration, in case
INET_DIAG_INFO is requested.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 23:44:28 -07:00
Jakub Kicinski
15a7dea750 net/tls: use RCU protection on icsk->icsk_ulp_data
We need to make sure context does not get freed while diag
code is interrogating it. Free struct tls_context with
kfree_rcu().

We add the __rcu annotation directly in icsk, and cast it
away in the datapath accessor. Presumably all ULPs will
do a similar thing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 23:44:28 -07:00
Jakub Kicinski
5d92e631b8 net/tls: partially revert fix transition through disconnect with close
Looks like we were slightly overzealous with the shutdown()
cleanup. Even though the sock->sk_state can reach CLOSED again,
socket->state will not got back to SS_UNCONNECTED once
connections is ESTABLISHED. Meaning we will see EISCONN if
we try to reconnect, and EINVAL if we try to listen.

Only listen sockets can be shutdown() and reused, but since
ESTABLISHED sockets can never be re-connected() or used for
listen() we don't need to try to clean up the ULP state early.

Fixes: 32857cf57f ("net/tls: fix transition through disconnect with close")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-05 13:15:30 -07:00
John Fastabend
32857cf57f net/tls: fix transition through disconnect with close
It is possible (via shutdown()) for TCP socks to go through TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close which
would then call the tls close callback. Because of this a user could
disconnect a socket then put it in a LISTEN state which would break
our assumptions about sockets always being ESTABLISHED state.

More directly because close() can call unhash() and unhash is
implemented by sockmap if a sockmap socket has TLS enabled we can
incorrectly destroy the psock from unhash() and then call its close
handler again. But because the psock (sockmap socket representation)
is already destroyed we call close handler in sk->prot. However,
in some cases (TLS BASE/BASE case) this will still point at the
sockmap close handler resulting in a circular call and crash reported
by syzbot.

To fix both above issues implement the unhash() routine for TLS.

v4:
 - add note about tls offload still needing the fix;
 - move sk_proto to the cold cache line;
 - split TX context free into "release" and "free",
   otherwise the GC work itself is in already freed
   memory;
 - more TX before RX for consistency;
 - reuse tls_ctx_free();
 - schedule the GC work after we're done with context
   to avoid UAF;
 - don't set the unhash in all modes, all modes "inherit"
   TLS_BASE's callbacks anyway;
 - disable the unhash hook for TLS_HW.

Fixes: 3c4d755915 ("tls: kernel TLS support")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
John Fastabend
313ab00480 net/tls: remove sock unlock/lock around strp_done()
The tls close() callback currently drops the sock lock to call
strp_done(). Split up the RX cleanup into stopping the strparser
and releasing most resources, syncing strparser and finally
freeing the context.

To avoid the need for a strp_done() call on the cleanup path
of device offload make sure we don't arm the strparser until
we are sure init will be successful.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:16 +02:00
John Fastabend
f87e62d45e net/tls: remove close callback sock unlock/lock around TX work flush
The tls close() callback currently drops the sock lock, makes a
cancel_delayed_work_sync() call, and then relocks the sock.

By restructuring the code we can avoid droping lock and then
reclaiming it. To simplify this we do the following,

 tls_sk_proto_close
 set_bit(CLOSING)
 set_bit(SCHEDULE)
 cancel_delay_work_sync() <- cancel workqueue
 lock_sock(sk)
 ...
 release_sock(sk)
 strp_done()

Setting the CLOSING bit prevents the SCHEDULE bit from being
cleared by any workqueue items e.g. if one happens to be
scheduled and run between when we set SCHEDULE bit and cancel
work. Then because SCHEDULE bit is set now no new work will
be scheduled.

Tested with net selftests and bpf selftests.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:16 +02:00
Jakub Kicinski
318892ac06 net/tls: don't arm strparser immediately in tls_set_sw_offload()
In tls_set_device_offload_rx() we prepare the software context
for RX fallback and proceed to add the connection to the device.
Unfortunately, software context prep includes arming strparser
so in case of a later error we have to release the socket lock
to call strp_done().

In preparation for not releasing the socket lock half way through
callbacks move arming strparser into a separate function.
Following patches will make use of that.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:16 +02:00
Dirk van der Merwe
b5d9a834f4 net/tls: don't clear TX resync flag on error
Introduce a return code for the tls_dev_resync callback.

When the driver TX resync fails, kernel can retry the resync again
until it succeeds.  This prevents drivers from attempting to offload
TLS packets if the connection is known to be out of sync.

We don't worry about the RX resync since they will be retried naturally
as more encrypted records get received.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
David S. Miller
af144a9834 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two cases of overlapping changes, nothing fancy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 19:48:57 -07:00
Jakub Kicinski
acd3e96d53 net/tls: make sure offload also gets the keys wiped
Commit 86029d10af ("tls: zero the crypto information from tls_context
before freeing") added memzero_explicit() calls to clear the key material
before freeing struct tls_context, but it missed tls_device.c has its
own way of freeing this structure. Replace the missing free.

Fixes: 86029d10af ("tls: zero the crypto information from tls_context before freeing")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 19:22:36 -07:00