msm-5.15

Author	SHA1	Message	Date
Nathan Chancellor	a413ebb604	net: ethernet: ti: Fix return type of netcp_ndo_start_xmit() [ Upstream commit 63fe6ff674a96cfcfc0fa8df1051a27aa31c70b4 ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/net/ethernet/ti/netcp_core.c:1944:21: error: incompatible function pointer types initializing 'netdev_tx_t ()(struct sk_buff , struct net_device )' (aka 'enum netdev_tx ()(struct sk_buff , struct net_device )') with an expression of type 'int (struct sk_buff , struct net_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .ndo_start_xmit = netcp_ndo_start_xmit, ^~~~~~~~~~~~~~~~~~~~ 1 error generated. ->ndo_start_xmit() in 'struct net_device_ops' expects a return type of 'netdev_tx_t', not 'int'. Adjust the return type of netcp_ndo_start_xmit() to match the prototype's to resolve the warning and CFI failure. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221102160933.1601260-1-nathan@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:41 +01:00
Zhang Changzhong	3bc893ef36	net: ethernet: ti: am65-cpsw: fix error handling in am65_cpsw_nuss_probe() [ Upstream commit 46fb6512538d201d9a5b2bd7138b6751c37fdf0b ] The am65_cpsw_nuss_cleanup_ndev() function calls unregister_netdev() even if register_netdev() fails, which triggers WARN_ON(1) in unregister_netdevice_many(). To fix it, make sure that unregister_netdev() is called only on registered netdev. Compile tested only. Fixes: `84b4aa4932` ("net: ethernet: ti: am65-cpsw: add multi port support in mac-only mode") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-08 11:28:40 +01:00
Zhengchao Shao	960f9d30de	net: cpsw: disable napi in cpsw_ndo_open() [ Upstream commit 6d47b53fb3f363a74538a1dbd09954af3d8d4131 ] When failed to create xdp rxqs or fill rx channels in cpsw_ndo_open() for opening device, napi isn't disabled. When open cpsw device next time, it will report a invalid opcode issue. Compiled tested only. Fixes: `d354eb85d6` ("drivers: net: cpsw: dual_emac: simplify napi usage") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221109011537.96975-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-16 09:58:21 +01:00
Randy Dunlap	9c65eef9d6	net: ethernet: ti: davinci_mdio: fix build for mdio bitbang uses commit 35bbe652c421037822aba29423f5f1f7d0d69f3f upstream. davinci_mdio.c uses mdio bitbang APIs, so it should select MDIO_BITBANG to prevent build errors. arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdio_remove': drivers/net/ethernet/ti/davinci_mdio.c:649: undefined reference to `free_mdio_bitbang' arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdio_probe': drivers/net/ethernet/ti/davinci_mdio.c:545: undefined reference to `alloc_mdio_bitbang' arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdiobb_read': drivers/net/ethernet/ti/davinci_mdio.c:236: undefined reference to `mdiobb_read' arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdiobb_write': drivers/net/ethernet/ti/davinci_mdio.c:253: undefined reference to `mdiobb_write' Fixes: d04807b80691 ("net: ethernet: ti: davinci_mdio: Add workaround for errata i2329") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ravi Gunasekaran <r-gunasekaran@ti.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/r/20220824024216.4939-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-26 12:35:54 +02:00
Ravi Gunasekaran	d449d00a8d	net: ethernet: ti: davinci_mdio: Add workaround for errata i2329 [ Upstream commit d04807b80691c6041ca8e3dcf1870d1bf1082c22 ] On the CPSW and ICSS peripherals, there is a possibility that the MDIO interface returns corrupt data on MDIO reads or writes incorrect data on MDIO writes. There is also a possibility for the MDIO interface to become unavailable until the next peripheral reset. The workaround is to configure the MDIO in manual mode and disable the MDIO state machine and emulate the MDIO protocol by reading and writing appropriate fields in MDIO_MANUAL_IF_REG register of the MDIO controller to manipulate the MDIO clock and data pins. More details about the errata i2329 and the workaround is available in: https://www.ti.com/lit/er/sprz487a/sprz487a.pdf Add implementation to disable MDIO state machine, configure MDIO in manual mode and achieve MDIO read and writes via MDIO Bitbanging Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:33 +02:00
Siddharth Vadapalli	ad3014b0f6	net: ethernet: ti: am65-cpsw: Fix devlink port register sequence [ Upstream commit 0680e20af5fbf41df8a11b11bd9a7c25b2ca0746 ] Renaming interfaces using udevd depends on the interface being registered before its netdev is registered. Otherwise, udevd reads an empty phys_port_name value, resulting in the interface not being renamed. Fix this by registering the interface before registering its netdev by invoking am65_cpsw_nuss_register_devlink() before invoking register_netdev() for the interface. Move the function call to devlink_port_type_eth_set(), invoking it after register_netdev() is invoked, to ensure that netlink notification for the port state change is generated after the netdev is completely initialized. Fixes: `58356eb31d` ("net: ti: am65-cpsw-nuss: Add devlink support") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Link: https://lore.kernel.org/r/20220706070208.12207-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-07-21 21:24:19 +02:00
Miaoqian Lin	a4b7ef3b15	net: ethernet: ti: am65-cpsw-nuss: Fix some refcount leaks [ Upstream commit 5dd89d2fc438457811cbbec07999ce0d80051ff5 ] of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. am65_cpsw_init_cpts() and am65_cpsw_nuss_probe() don't release the refcount in error case. Add missing of_node_put() to avoid refcount leak. Fixes: `b1f66a5bee` ("net: ethernet: ti: am65-cpsw-nuss: enable packet timestamping support") Fixes: `93a7653031` ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-06-14 18:36:10 +02:00
Yang Yingliang	8fa3b32dfa	net: cpsw: add missing of_node_put() in cpsw_probe_dt() commit 95098d5ac2551769807031444e55a0da5d4f0952 upstream. 'tmp_node' need be put before returning from cpsw_probe_dt(), so add missing of_node_put() in error path. Fixes: `ed3525eda4` ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-12 12:30:16 +02:00
Sondhauß, Jan	585dc196a0	drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool [ Upstream commit 2844e2434385819f674d1fb4130c308c50ba681e ] cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync when successful. pm_runtime_get_sync returns -error code on failure and 0 on successful resume but also 1 when the device is already active. So the common case for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls to pm_runtime_put in the call-chain so that pm_runtime_put is called one too many times and as result leaving the cpsw dev behind suspended. The suspended cpsw dev leads to an access violation later on by different parts of the cpsw driver. Fix this by calling the return-friendly pm_runtime_resume_and_get function. Fixes: `d43c65b05b` ("ethtool: runtime-resume netdev parent in ethnl_ops_begin") Signed-off-by: Jan Sondhauss <jan.sondhauss@wago.com> Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com> Link: https://lore.kernel.org/r/20220323084725.65864-1-jan.sondhauss@wago.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-08 14:23:43 +02:00
Jiasheng Jiang	c746fa0f54	net: ethernet: ti: cpts: Handle error for clk_enable [ Upstream commit 6babfc6e6fab068018c36e8f6605184b8c0b349d ] As the potential failure of the clk_enable(), it should be better to check it and return error if fails. Fixes: `8a2c9a5ab4` ("net: ethernet: ti: cpts: rework initialization/deinitialization") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-16 14:23:38 +01:00
Toke Høiland-Jørgensen	925181ea76	net: cpsw: Properly initialise struct page_pool_params [ Upstream commit c63003e3d99761afb280add3b30de1cf30fa522b ] The cpsw driver didn't properly initialise the struct page_pool_params before calling page_pool_create(), which leads to crashes after the struct has been expanded with new parameters. The second Fixes tag below is where the buggy code was introduced, but because the code was moved around this patch will only apply on top of the commit in the first Fixes tag. Fixes: `c5013ac1dd` ("net: ethernet: ti: cpsw: move set of common functions in cpsw_priv") Fixes: `9ed4050c0d` ("net: ethernet: ti: cpsw: add XDP support") Reported-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Tested-by: Colin Foster <colin.foster@in-advantage.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-02-01 17:27:13 +01:00
Ard Biesheuvel	ed27539e5a	net: cpsw: avoid alignment faults by taking NET_IP_ALIGN into account commit 1771afd47430f5e95c9c3a2e3a8a63e67402d3fe upstream. Both versions of the CPSW driver declare a CPSW_HEADROOM_NA macro that takes NET_IP_ALIGN into account, but fail to use it appropriately when storing incoming packets in memory. This results in the IPv4 source and destination addresses to appear misaligned in memory, which causes aligment faults that need to be fixed up in software. So let's switch from CPSW_HEADROOM to CPSW_HEADROOM_NA where needed. This gets rid of any alignment faults on the RX path on a Beaglebone White. Fixes: `9ed4050c0d` ("net: ethernet: ti: cpsw: add XDP support") Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 11:05:41 +01:00
Christophe JAILLET	6fb190ff57	net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory [ Upstream commit 7a166854b4e24c57d56b3eba9fe1594985ee0a2c ] It is spurious to allocate a bitmap without initializing it. So, better safe than sorry, initialize it to 0 at least to have some known values. While at it, switch to the devm_bitmap_ API which is less verbose. Fixes: `4b41d34367` ("net: ethernet: ti: cpsw: allow untagged traffic on host port") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-11-18 19:17:12 +01:00
Maxim Kiselev	9c806a70fa	net: davinci_emac: Fix interrupt pacing disable [ Upstream commit d52bcb47bdf971a59a2467975d2405fcfcb2fa19 ] This patch allows to use 0 for `coal->rx_coalesce_usecs` param to disable rx irq coalescing. Previously we could enable rx irq coalescing via ethtool (For ex: `ethtool -C eth0 rx-usecs 2000`) but we couldn't disable it because this part rejects 0 value: if (!coal->rx_coalesce_usecs) return -EINVAL; Fixes: `84da2658a6` ("TI DaVinci EMAC : Implement interrupt pacing functionality.") Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Link: https://lore.kernel.org/r/20211101152343.4193233-1-bigunclemax@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-11-18 19:17:05 +01:00
Yufeng Mo	f3ccfda193	ethtool: extend coalesce setting uAPI with CQE mode In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-24 07:38:29 -07:00
Colin Ian King	5c8a2bb481	net: ethernet: ti: cpsw: make array stpa static const, makes object smaller Don't populate the array stpa on the stack but instead it static const. Makes the object code smaller by 81 bytes: Before: text data bss dec hex filename 54993 17248 0 72241 11a31 ./drivers/net/ethernet/ti/cpsw_new.o After: text data bss dec hex filename 54784 17376 0 72160 119e0 ./drivers/net/ethernet/ti/cpsw_new.o (gcc version 10.3.0) Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-19 13:23:20 +01:00
Jakub Kicinski	f4083a752a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h `9e26680733` ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") `9e518f2580` ("bnxt_en: 1PPS functions to configure TSIO pins") `099fdeda65` ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h `a2baf4e8bb` ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") `c7603cfa04` ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c `5957cc557d` ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") `2d0b41a376` ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS `7b637cd52f` ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") `7d901a1e87` ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-13 06:41:22 -07:00
Vladimir Oltean	c35b57ceff	net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge The blamed commit added a new field to struct switchdev_notifier_fdb_info, but did not make sure that all call paths set it to something valid. For example, a switchdev driver may emit a SWITCHDEV_FDB_ADD_TO_BRIDGE notifier, and since the 'is_local' flag is not set, it contains junk from the stack, so the bridge might interpret those notifications as being for local FDB entries when that was not intended. To avoid that now and in the future, zero-initialize all switchdev_notifier_fdb_info structures created by drivers such that all newly added fields to not need to touch drivers again. Fixes: `2c4eca3ef7` ("net: bridge: switchdev: include local flag in FDB notifications") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20210810115024.1629983-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-10 13:22:57 -07:00
Yunsheng Lin	57f05bc2ab	page_pool: keep pp info as long as page pool owns the page Currently, page->pp is cleared and set everytime the page is recycled, which is unnecessary. So only set the page->pp when the page is added to the page pool and only clear it when the page is released from the page pool. This is also a preparation to support allocating frag page in page pool. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-09 15:49:00 -07:00
Leon Romanovsky	919d13a7e4	devlink: Set device as early as possible All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-09 10:21:40 +01:00
Grygorii Strashko	acc68b8d2a	net: ethernet: ti: cpsw: fix min eth packet size for non-switch use-cases The CPSW switchdev driver inherited fix from commit `9421c90150` ("net: ethernet: ti: cpsw: fix min eth packet size") which changes min TX packet size to 64bytes (VLAN_ETH_ZLEN, excluding ETH_FCS). It was done to fix HW packed drop issue when packets are sent from Host to the port with PVID and un-tagging enabled. Unfortunately this breaks some other non-switch specific use-cases, like: - [1] CPSW port as DSA CPU port with DSA-tag applied at the end of the packet - [2] Some industrial protocols, which expects min TX packet size 60Bytes (excluding FCS). Fix it by configuring min TX packet size depending on driver mode - 60Bytes (ETH_ZLEN) for multi mac (dual-mac) mode - 64Bytes (VLAN_ETH_ZLEN) for switch mode and update it during driver mode change and annotate with READ_ONCE()/WRITE_ONCE() as it can be read by napi while writing. [1] https://lore.kernel.org/netdev/20210531124051.GA15218@cephalopod/ [2] https://e2e.ti.com/support/arm/sitara_arm/f/791/t/701669 Cc: stable@vger.kernel.org Fixes: `ed3525eda4` ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac") Reported-by: Ben Hutchings <ben.hutchings@essensium.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-09 10:16:46 +01:00
Grygorii Strashko	35ba6abb73	net: ethernet: ti: davinci_cpdma: revert "drop frame padding" This reverts commit `9ffc513f95` ("net: ethernet: ti: davinci_cpdma: drop frame padding") which has depndency from not yet merged patch [1] and so breaks cpsw_new driver. [1] https://patchwork.kernel.org/project/netdevbpf/patch/20210805145511.12016-1-grygorii.strashko@ti.com/ Fixes: `9ffc513f95` ("net: ethernet: ti: davinci_cpdma: drop frame padding") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Link: https://lore.kernel.org/r/20210806142809.15069-1-grygorii.strashko@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-06 16:18:15 -07:00
Grygorii Strashko	3bacbe0425	net: ethernet: ti: am65-cpsw: use napi_complete_done() in TX completion This patch enables support for hard irqs deferral feature from Eric Dumazet [1] for TI K3 CPSW driver by using napi_complete_done() in TX completion path. Depending on gro_flush_timeout and napi_defer_hard_irqs at gives up to 30% CPU utilization reduction: gro_flush_timeout=50000 napi_defer_hard_irqs=2 netperf -l 10 -H 192.168.1.1 -t UDP_STREAM -c -C -- -m 1470 MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET Socket Message Elapsed Messages CPU Service Size Size Time Okay Errors Throughput Util Demand bytes bytes secs # # 10^6bits/sec % SS us/KB before: 212992 1470 10.00 809632 0 952.0 42.98 14.792 212992 10.00 809630 952.0 50.66 8.719 after: 212992 1470 10.00 813686 0 956.8 32.14 11.009 212992 10.00 813686 956.8 50.05 8.570 [1] https://lore.kernel.org/netdev/20200422161329.56026-1-edumazet@google.com/ Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-06 11:08:57 +01:00
Vignesh Raghavendra	47bfc4d128	net: ti: am65-cpsw-nuss: fix RX IRQ state after .ndo_stop() On TI K3 am64x platform the issue with RX IRQ is observed - it's become disabled forever after .ndo_stop(). The K3 CPSW driver manipulates RX IRQ by using standard Linux enable_irq()/disable_irq_nosync() API as there is no IRQ enable/disable options in CPSW HW itself, as result during .ndo_stop() following sequence happens phy_stop() teardown TX/RX channels wait for TX tdown complete napi_disable(TX) clean up TX channels (a) napi_disable(RX) At point (a) it's not possible to predict if RX IRQ was triggered or not. if RX IRQ was triggered then it also not possible to definitely say if RX NAPI was run or only scheduled and immediately canceled by napi_disable(RX). Actually the last case causes RX IRQ to be permanently disabled. Another observed issue is that RX IRQ enable counter become unbalanced if (gro_flush_timeout =! 0) while (napi_defer_hard_irqs == 0): Unbalanced enable for IRQ 44 WARNING: CPU: 0 PID: 10 at ../kernel/irq/manage.c:776 __enable_irq+0x38/0x80 __enable_irq+0x38/0x80 enable_irq+0x54/0xb0 am65_cpsw_nuss_rx_poll+0x2f4/0x368 __napi_poll+0x34/0x1b8 net_rx_action+0xe4/0x220 _stext+0x11c/0x284 run_ksoftirqd+0x4c/0x60 To avoid above issues introduce flag indicating if RX was actually disabled before enabling it in am65_cpsw_nuss_rx_poll() and restore RX IRQ state in .ndo_open() Fixes: `4f7cce2724` ("net: ethernet: ti: am65-cpsw: add support for am64x cpsw3g") Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-06 11:08:57 +01:00
Grygorii Strashko	9ffc513f95	net: ethernet: ti: davinci_cpdma: drop frame padding Hence all users of davinci_cpdma switched to skb_put_padto() the frame padding can be removed from it. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-06 10:26:46 +01:00
Grygorii Strashko	61e7a22da7	net: ethernet: ti: davinci_emac: switch to use skb_put_padto() Use skb_put_padto() instead of skb_padto() so skb->len also got updated, as preparation for further removing frame padding from cpdma. It also makes xmit path more clear and linear. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-06 10:26:46 +01:00
Grygorii Strashko	1f88d5d566	net: ethernet: ti: cpsw: switch to use skb_put_padto() Use skb_put_padto() instead of skb_padto() so skb->len also got updated, as preparation for further removing frame padding from cpdma. It also makes xmit path more clear and linear. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-06 10:26:46 +01:00
Jakub Kicinski	0ca8d3ca45	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: `589918df93` ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") `0fac6aa098` ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit `589918df93` ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-05 15:08:47 -07:00
Grygorii Strashko	ae03d189ba	net: ethernet: ti: am65-cpsw: fix crash in am65_cpsw_port_offload_fwd_mark_update() The am65_cpsw_port_offload_fwd_mark_update() causes NULL exception crash when there is at least one disabled port and any other port added to the bridge first time. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000858 pc : am65_cpsw_port_offload_fwd_mark_update+0x54/0x68 lr : am65_cpsw_netdevice_event+0x8c/0xf0 Call trace: am65_cpsw_port_offload_fwd_mark_update+0x54/0x68 notifier_call_chain+0x54/0x98 raw_notifier_call_chain+0x14/0x20 call_netdevice_notifiers_info+0x34/0x78 __netdev_upper_dev_link+0x1c8/0x290 netdev_master_upper_dev_link+0x1c/0x28 br_add_if+0x3f0/0x6d0 [bridge] Fix it by adding proper check for port->ndev != NULL. Fixes: `2934db9bcb` ("net: ti: am65-cpsw-nuss: Add netdevice notifiers") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-05 13:33:23 +01:00
Vladimir Oltean	a54182b2a5	Revert "net: build all switchdev drivers as modules when the bridge is a module" This reverts commit `b0e8181762`. Explicit driver dependency on the bridge is no longer needed since switchdev_bridge_port_{,un}offload() is no longer implemented by the bridge driver but by switchdev. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:35:07 +01:00
Vladimir Oltean	957e2235e5	net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge With the introduction of explicit offloading API in switchdev in commit `2f5dc00f7a` ("net: bridge: switchdev: let drivers inform which bridge ports are offloaded"), we started having Ethernet switch drivers calling directly into a function exported by net/bridge/br_switchdev.c, which is a function exported by the bridge driver. This means that drivers that did not have an explicit dependency on the bridge before, like cpsw and am65-cpsw, now do - otherwise it is not possible to call a symbol exported by a driver that can be built as module unless you are a module too. There was an attempt to solve the dependency issue in the form of commit `b0e8181762` ("net: build all switchdev drivers as modules when the bridge is a module"). Grygorii Strashko, however, says about it: \| In my opinion, the problem is a bit bigger here than just fixing the \| build :( \| \| In case, of ^cpsw the switchdev mode is kinda optional and in many \| cases (especially for testing purposes, NFS) the multi-mac mode is \| still preferable mode. \| \| There were no such tight dependency between switchdev drivers and \| bridge core before and switchdev serviced as independent, notification \| based layer between them, so ^cpsw still can be "Y" and bridge can be \| "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE \| will need to be set as "Y", or we will have to update drivers to \| support build with BRIDGE=n and maintain separate builds for \| networking vs non-networking testing. But is this enough? Wouldn't \| it cause 'chain reaction' required to add more and more "Y" options \| (like CONFIG_VLAN_8021Q)? \| \| PS. Just to be sure we on the same page - ARM builds will be forced \| (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our \| automation testing will just fail with omap2plus_defconfig. In the light of this, it would be desirable for some configurations to avoid dependencies between switchdev drivers and the bridge, and have the switchdev mode as completely optional within the driver. Arnd Bergmann also tried to write a patch which better expressed the build time dependency for Ethernet switch drivers where the switchdev support is optional, like cpsw/am65-cpsw, and this made the drivers follow the bridge (compile as module if the bridge is a module) only if the optional switchdev support in the driver was enabled in the first place: https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/ but this still did not solve the fact that cpsw and am65-cpsw now must be built as modules when the bridge is a module - it just expressed correctly that optional dependency. But the new behavior is an apparent regression from Grygorii's perspective. So to support the use case where the Ethernet driver is built-in, NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we need a framework that can handle the possible absence of the bridge from the running system, i.e. runtime bloatware as opposed to build-time bloatware. Luckily we already have this framework, since switchdev has been using it extensively. Events from the bridge side are transmitted to the driver side using notifier chains - this was originally done so that unrelated drivers could snoop for events emitted by the bridge towards ports that are implemented by other drivers (think of a switch driver with LAG offload that listens for switchdev events on a bonding/team interface that it offloads). There are also events which are transmitted from the driver side to the bridge side, which again are modeled using notifiers. SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with notifying the bridge that a MAC address has been dynamically learned. So there is a precedent we can use for modeling the new framework. The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work that the bridge needs to do when a port becomes offloaded is blocking in its nature: replay VLANs, MDBs etc. The calling context is indeed blocking (we are under rtnl_mutex), but the existing switchdev notification chain that the bridge is subscribed to is only the atomic one. So we need to subscribe the bridge to the blocking switchdev notification chain too. This patch: - keeps the driver-side perception of the switchdev_bridge_port_{,un}offload unchanged - moves the implementation of switchdev_bridge_port_{,un}offload from the bridge module into the switchdev module. - makes everybody that is subscribed to the switchdev blocking notifier chain "hear" offload & unoffload events - makes the bridge driver subscribe and handle those events - moves the bridge driver's handling of those events into 2 new functions called br_switchdev_port_{,un}offload. These functions contain in fact the core of the logic that was previously in switchdev_bridge_port_{,un}offload, just that now we go through an extra indirection layer to reach them. Unlike all the other switchdev notification structures, the structure used to carry the bridge port information, struct switchdev_notifier_brport_info, does not contain a "bool handled". This is because in the current usage pattern, we always know that a switchdev bridge port offloading event will be handled by the bridge, because the switchdev_bridge_port_offload() call was initiated by a NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event couldn't have happened. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-04 12:35:07 +01:00
Leon Romanovsky	acf34954ef	net: ti: am65-cpsw-nuss: fix wrong devlink release order The commit that introduced devlink support released devlink resources in wrong order, that made an unwind flow to be asymmetrical. In addition, the am65-cpsw-nuss used internal to devlink core field - registered. In order to fix the unwind flow and remove such access to the registered field, rewrite the code to call devlink_port_unregister only on registered ports. Fixes: `58356eb31d` ("net: ti: am65-cpsw-nuss: Add devlink support") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-28 10:23:45 +01:00
Arnd Bergmann	a76053707d	dev_ioctl: split out ndo_eth_ioctl Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 20:11:45 +01:00
Vladimir Oltean	b0e8181762	net: build all switchdev drivers as modules when the bridge is a module Currently, all drivers depend on the bool CONFIG_NET_SWITCHDEV, but only the drivers that call some sort of function exported by the bridge, like br_vlan_enabled() or whatever, have an extra dependency on CONFIG_BRIDGE. Since the blamed commit, all switchdev drivers have a functional dependency upon switchdev_bridge_port_{,un}offload(), which is a pair of functions exported by the bridge module and not by the bridge-independent part of CONFIG_NET_SWITCHDEV. Problems appear when we have: CONFIG_BRIDGE=m CONFIG_NET_SWITCHDEV=y CONFIG_TI_CPSW_SWITCHDEV=y because cpsw, am65_cpsw and sparx5 will then be built-in but they will call a symbol exported by a loadable module. This is not possible and will result in the following build error: drivers/net/ethernet/ti/cpsw_new.o: in function `cpsw_netdevice_event': drivers/net/ethernet/ti/cpsw_new.c:1520: undefined reference to `switchdev_bridge_port_offload' drivers/net/ethernet/ti/cpsw_new.c:1537: undefined reference to `switchdev_bridge_port_unoffload' As mentioned, the other switchdev drivers don't suffer from this because switchdev_bridge_port_offload() is not the first symbol exported by the bridge that they are calling, so they already needed to deal with this in the same way. Fixes: `2f5dc00f7a` ("net: bridge: switchdev: let drivers inform which bridge ports are offloaded") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:12:57 +01:00
Tobias Waldekranz	472111920f	net: bridge: switchdev: allow the TX data plane forwarding to be offloaded Allow switchdevs to forward frames from the CPU in accordance with the bridge configuration in the same way as is done between bridge ports. This means that the bridge will only send a single skb towards one of the ports under the switchdev's control, and expects the driver to deliver the packet to all eligible ports in its domain. Primarily this improves the performance of multicast flows with multiple subscribers, as it allows the hardware to perform the frame replication. The basic flow between the driver and the bridge is as follows: - When joining a bridge port, the switchdev driver calls switchdev_bridge_port_offload() with tx_fwd_offload = true. - The bridge sends offloadable skbs to one of the ports under the switchdev's control using skb->offload_fwd_mark = true. - The switchdev driver checks the skb->offload_fwd_mark field and lets its FDB lookup select the destination port mask for this packet. v1->v2: - convert br_input_skb_cb::fwd_hwdoms to a plain unsigned long - introduce a static key "br_switchdev_fwd_offload_used" to minimize the impact of the newly introduced feature on all the setups which don't have hardware that can make use of it - introduce a check for nbp->flags & BR_FWD_OFFLOAD to optimize cache line access - reorder nbp_switchdev_frame_mark_accel() and br_handle_vlan() in __br_forward() - do not strip VLAN on egress if forwarding offload on VLAN-aware bridge is being used - propagate errors from .ndo_dfwd_add_station() if not EOPNOTSUPP v2->v3: - replace the solution based on .ndo_dfwd_add_station with a solution based on switchdev_bridge_port_offload - rename BR_FWD_OFFLOAD to BR_TX_FWD_OFFLOAD v3->v4: rebase v4->v5: - make sure the static key is decremented on bridge port unoffload - more function and variable renaming and comments for them: br_switchdev_fwd_offload_used to br_switchdev_tx_fwd_offload br_switchdev_accels_skb to br_switchdev_frame_uses_tx_fwd_offload nbp_switchdev_frame_mark_tx_fwd to nbp_switchdev_frame_mark_tx_fwd_to_hwdom nbp_switchdev_frame_mark_accel to nbp_switchdev_frame_mark_tx_fwd_offload fwd_accel to tx_fwd_offload Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 16:32:37 +01:00
Vladimir Oltean	4e51bf44a0	net: bridge: move the switchdev object replay helpers to "push" mode Starting with commit `4f2673b3a2` ("net: bridge: add helper to replay port and host-joined mdb entries"), DSA has introduced some bridge helpers that replay switchdev events (FDB/MDB/VLAN additions and deletions) that can be lost by the switchdev drivers in a variety of circumstances: - an IP multicast group was host-joined on the bridge itself before any switchdev port joined the bridge, leading to the host MDB entries missing in the hardware database. - during the bridge creation process, the MAC address of the bridge was added to the FDB as an entry pointing towards the bridge device itself, but with no switchdev ports being part of the bridge yet, this local FDB entry would remain unknown to the switchdev hardware database. - a VLAN/FDB/MDB was added to a bridge port that is a LAG interface, before any switchdev port joined that LAG, leading to the hardware database missing those entries. - a switchdev port left a LAG that is a bridge port, while the LAG remained part of the bridge, and all FDB/MDB/VLAN entries remained installed in the hardware database of the switchdev port. Also, since commit `0d2cfbd41c` ("net: bridge: ignore switchdev events for LAG ports which didn't request replay"), DSA introduced a method, based on a const void ctx, to ensure that two switchdev ports under the same LAG that is a bridge port do not see the same MDB/VLAN entry being replayed twice by the bridge, once for every bridge port that joins the LAG. With so many ordering corner cases being possible, it seems unreasonable to expect a switchdev driver writer to get it right from the first try. Therefore, now that DSA has experimented with the bridge replay helpers for a little bit, we can move the code to the bridge driver where it is more readily available to all switchdev drivers. To convert the switchdev object replay helpers from "pull mode" (where the driver asks for them) to a "push mode" (where the bridge offers them automatically), the biggest problem is that the bridge needs to be aware when a switchdev port joins and leaves, even when the switchdev is only indirectly a bridge port (for example when the bridge port is a LAG upper of the switchdev). Luckily, we already have a hook for that, in the form of the newly introduced switchdev_bridge_port_offload() and switchdev_bridge_port_unoffload() calls. These offer a natural place for hooking the object addition and deletion replays. Extend the above 2 functions with: - pointers to the switchdev atomic notifier (for FDB replays) and the blocking notifier (for MDB and VLAN replays). - the "const void ctx" argument required for drivers to be able to disambiguate between which port is targeted, when multiple ports are lowers of the same LAG that is a bridge port. Most of the drivers pass NULL to this argument, except the ones that support LAG offload and have the proper context check already in place in the switchdev blocking notifier handler. Also unexport the replay helpers, since nobody except the bridge calls them directly now. Note that: (a) we abuse the terminology slightly, because FDB entries are not "switchdev objects", but we count them as objects nonetheless. With no direct way to prove it, I think they are not modeled as switchdev objects because those can only be installed by the bridge to the hardware (as opposed to FDB entries which can be propagated in the other direction too). This is merely an abuse of terms, FDB entries are replayed too, despite not being objects. (b) the bridge does not attempt to sync port attributes to newly joined ports, just the countable stuff (the objects). The reason for this is simple: no universal and symmetric way to sync and unsync them is known. For example, VLAN filtering: what to do on unsync, disable or leave it enabled? Similarly, STP state, ageing timer, etc etc. What a switchdev port does when it becomes standalone again is not really up to the bridge's competence, and the driver should deal with it. On the other hand, replaying deletions of switchdev objects can be seen a matter of cleanup and therefore be treated by the bridge, hence this patch. We make the replay helpers opt-in for drivers, because they might not bring immediate benefits for them: - nbp_vlan_init() is called _after_ netdev_master_upper_dev_link(), so br_vlan_replay() should not do anything for the new drivers on which we call it. The existing drivers where there was even a slight possibility for there to exist a VLAN on a bridge port before they join it are already guarded against this: mlxsw and prestera deny joining LAG interfaces that are members of a bridge. - br_fdb_replay() should now notify of local FDB entries, but I patched all drivers except DSA to ignore these new entries in commit `2c4eca3ef7` ("net: bridge: switchdev: include local flag in FDB notifications"). Driver authors can lift this restriction as they wish, and when they do, they can also opt into the FDB replay functionality. - br_mdb_replay() should fix a real issue which is described in commit `4f2673b3a2` ("net: bridge: add helper to replay port and host-joined mdb entries"). However most drivers do not offload the SWITCHDEV_OBJ_ID_HOST_MDB to see this issue: only cpsw and am65_cpsw offload this switchdev object, and I don't completely understand the way in which they offload this switchdev object anyway. So I'll leave it up to these drivers' respective maintainers to opt into br_mdb_replay(). So most of the drivers pass NULL notifier blocks for the replay helpers, except: - dpaa2-switch which was already acked/regression-tested with the helpers enabled (and there isn't much of a downside in having them) - ocelot which already had replay logic in "pull" mode - DSA which already had replay logic in "pull" mode An important observation is that the drivers which don't currently request bridge event replays don't even have the switchdev_bridge_port_{offload,unoffload} calls placed in proper places right now. This was done to avoid unnecessary rework for drivers which might never even add support for this. For driver writers who wish to add replay support, this can be used as a tentative placement guide: https://patchwork.kernel.org/project/netdevbpf/patch/20210720134655.892334-11-vladimir.oltean@nxp.com/ Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-22 00:26:23 -07:00
Vladimir Oltean	2f5dc00f7a	net: bridge: switchdev: let drivers inform which bridge ports are offloaded On reception of an skb, the bridge checks if it was marked as 'already forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it is, it assigns the source hardware domain of that skb based on the hardware domain of the ingress port. Then during forwarding, it enforces that the egress port must have a different hardware domain than the ingress one (this is done in nbp_switchdev_allowed_egress). Non-switchdev drivers don't report any physical switch id (neither through devlink nor .ndo_get_port_parent_id), therefore the bridge assigns them a hardware domain of 0, and packets coming from them will always have skb->offload_fwd_mark = 0. So there aren't any restrictions. Problems appear due to the fact that DSA would like to perform software fallback for bonding and team interfaces that the physical switch cannot offload. +-- br0 ---+ / / \| \ / / \| \ / \| \| bond0 / \| \| / \ swp0 swp1 swp2 swp3 swp4 There, it is desirable that the presence of swp3 and swp4 under a non-offloaded LAG does not preclude us from doing hardware bridging beteen swp0, swp1 and swp2. The bandwidth of the CPU is often times high enough that software bridging between {swp0,swp1,swp2} and bond0 is not impractical. But this creates an impossible paradox given the current way in which port hardware domains are assigned. When the driver receives a packet from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to something. - If we set it to 0, then the bridge will forward it towards swp1, swp2 and bond0. But the switch has already forwarded it towards swp1 and swp2 (not to bond0, remember, that isn't offloaded, so as far as the switch is concerned, ports swp3 and swp4 are not looking up the FDB, and the entire bond0 is a destination that is strictly behind the CPU). But we don't want duplicated traffic towards swp1 and swp2, so it's not ok to set skb->offload_fwd_mark = 0. - If we set it to 1, then the bridge will not forward the skb towards the ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should have forwarded the skb there. So the real issue is that bond0 will be assigned the same hardware domain as {swp0,swp1,swp2}, because the function that assigns hardware domains to bridge ports, nbp_switchdev_add(), recurses through bond0's lower interfaces until it finds something that implements devlink (calls dev_get_port_parent_id with bool recurse = true). This is a problem because the fact that bond0 can be offloaded by swp3 and swp4 in our example is merely an assumption. A solution is to give the bridge explicit hints as to what hardware domain it should use for each port. Currently, the bridging offload is very 'silent': a driver registers a netdevice notifier, which is put on the netns's notifier chain, and which sniffs around for NETDEV_CHANGEUPPER events where the upper is a bridge, and the lower is an interface it knows about (one registered by this driver, normally). Then, from within that notifier, it does a bunch of stuff behind the bridge's back, without the bridge necessarily knowing that there's somebody offloading that port. It looks like this: ip link set swp0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v call_netdevice_notifiers \| v dsa_slave_netdevice_event \| v oh, hey! it's for me! \| v .port_bridge_join What we do to solve the conundrum is to be less silent, and change the switchdev drivers to present themselves to the bridge. Something like this: ip link set swp0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the \| \| hardware domain for v \| this port, and zero dsa_slave_netdevice_event \| if I got nothing. \| \| v \| oh, hey! it's for me! \| \| \| v \| .port_bridge_join \| \| \| +------------------------+ switchdev_bridge_port_offload(swp0, swp0) Then stacked interfaces (like bond0 on top of swp3/swp4) would be treated differently in DSA, depending on whether we can or cannot offload them. The offload case: ip link set bond0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the \| \| switchdev mark for v \| bond0. dsa_slave_netdevice_event \| Coincidentally (or not), \| \| bond0 and swp0, swp1, swp2 v \| all have the same switchdev hmm, it's not quite for me, \| mark now, since the ASIC but my driver has already \| is able to forward towards called .port_lag_join \| all these ports in hw. for it, because I have \| a port with dp->lag_dev == bond0. \| \| \| v \| .port_bridge_join \| for swp3 and swp4 \| \| \| +------------------------+ switchdev_bridge_port_offload(bond0, swp3) switchdev_bridge_port_offload(bond0, swp4) And the non-offload case: ip link set bond0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge waiting: call_netdevice_notifiers ^ huh, switchdev_bridge_port_offload \| \| wasn't called, okay, I'll use a v \| hwdom of zero for this one. dsa_slave_netdevice_event : Then packets received on swp0 will \| : not be software-forwarded towards v : swp1, but they will towards bond0. it's not for me, but bond0 is an upper of swp3 and swp4, but their dp->lag_dev is NULL because they couldn't offload it. Basically we can draw the conclusion that the lowers of a bridge port can come and go, so depending on the configuration of lowers for a bridge port, it can dynamically toggle between offloaded and unoffloaded. Therefore, we need an equivalent switchdev_bridge_port_unoffload too. This patch changes the way any switchdev driver interacts with the bridge. From now on, everybody needs to call switchdev_bridge_port_offload and switchdev_bridge_port_unoffload, otherwise the bridge will treat the port as non-offloaded and allow software flooding to other ports from the same ASIC. Note that these functions lay the ground for a more complex handshake between switchdev drivers and the bridge in the future. For drivers that will request a replay of the switchdev objects when they offload and unoffload a bridge port (DSA, dpaa2-switch, ocelot), we place the call to switchdev_bridge_port_unoffload() strategically inside the NETDEV_PRECHANGEUPPER notifier's code path, and not inside NETDEV_CHANGEUPPER. This is because the switchdev object replay helpers need the netdev adjacency lists to be valid, and that is only true in NETDEV_PRECHANGEUPPER. Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch: regression Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> # ocelot-switch Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-22 00:26:23 -07:00
Pavel Skripkin	0336f8ffec	net: ti: fix UAF in tlan_remove_one priv is netdev private data and it cannot be used after free_netdev() call. Using priv after free_netdev() can cause UAF bug. Fix it by moving free_netdev() at the end of the function. Fixes: `1e0a8b13d3` ("tlan: cancel work at remove path") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-09 11:01:01 -07:00
Jakub Kicinski	b6df00789e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-06-29 15:45:27 -07:00
David S. Miller	e1289cfb63	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-28 15:28:03 -07:00
Vladimir Oltean	69bfac968a	net: switchdev: add a context void pointer to struct switchdev_notifier_info In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-28 14:09:03 -07:00
Toke Høiland-Jørgensen	0cc84b9a60	ti: Remove rcu_read_lock() around XDP program invocation The cpsw driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP program invocations. However, the actual lifetime of the objects referred by the XDP program invocation is longer, all the way through to the call to xdp_do_flush(), making the scope of the rcu_read_lock() too small. This turns out to be harmless because it all happens in a single NAPI poll cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock() misleading. Rather than extend the scope of the rcu_read_lock(), just get rid of it entirely. With the addition of RCU annotations to the XDP_REDIRECT map types that take bh execution into account, lockdep even understands this to be safe, so there's really no reason to keep it around. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Cc: linux-omap@vger.kernel.org Link: https://lore.kernel.org/bpf/20210624160609.292325-20-toke@redhat.com	2021-06-24 19:46:39 +02:00
Vignesh Raghavendra	ce8eb4c728	net: ti: am65-cpsw-nuss: Fix crash when changing number of TX queues When changing number of TX queues using ethtool: # ethtool -L eth0 tx 1 [ 135.301047] Unable to handle kernel paging request at virtual address 00000000af5d0000 [...] [ 135.525128] Call trace: [ 135.525142] dma_release_from_dev_coherent+0x2c/0xb0 [ 135.525148] dma_free_attrs+0x54/0xe0 [ 135.525156] k3_cppi_desc_pool_destroy+0x50/0xa0 [ 135.525164] am65_cpsw_nuss_remove_tx_chns+0x88/0xdc [ 135.525171] am65_cpsw_set_channels+0x3c/0x70 [...] This is because k3_cppi_desc_pool_destroy() which is called after k3_udma_glue_release_tx_chn() in am65_cpsw_nuss_remove_tx_chns() references struct device that is unregistered at the end of k3_udma_glue_release_tx_chn() Therefore the right order is to call k3_cppi_desc_pool_destroy() and destroy desc pool before calling k3_udma_glue_release_tx_chn(). Fix this throughout the driver. Fixes: `93a7653031` ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-22 10:49:56 -07:00
Lorenzo Bianconi	a078d981f8	net: ti: add pp skb recycling support As already done for mvneta and mvpp2, enable skb recycling for ti ethernet drivers ti driver on net-next: ---------------------- [perf top] 47.15% [kernel] [k] _raw_spin_unlock_irqrestore 11.77% [kernel] [k] __cpdma_chan_free 3.16% [kernel] [k] ___bpf_prog_run 2.52% [kernel] [k] cpsw_rx_vlan_encap 2.34% [kernel] [k] __netif_receive_skb_core 2.27% [kernel] [k] free_unref_page 2.26% [kernel] [k] kmem_cache_free 2.24% [kernel] [k] kmem_cache_alloc 1.69% [kernel] [k] __softirqentry_text_start 1.61% [kernel] [k] cpsw_rx_handler 1.19% [kernel] [k] page_pool_release_page 1.19% [kernel] [k] clear_bits_ll 1.15% [kernel] [k] page_frag_free 1.06% [kernel] [k] __dma_page_dev_to_cpu 0.99% [kernel] [k] memset 0.94% [kernel] [k] __alloc_pages_bulk 0.92% [kernel] [k] kfree_skb 0.85% [kernel] [k] packet_rcv 0.78% [kernel] [k] page_address 0.75% [kernel] [k] v7_dma_inv_range 0.71% [kernel] [k] __lock_text_start [iperf3 tcp] [ 5] 0.00-10.00 sec 873 MBytes 732 Mbits/sec 0 sender [ 5] 0.00-10.01 sec 866 MBytes 726 Mbits/sec receiver ti + skb recycling: ------------------- [perf top] 40.58% [kernel] [k] _raw_spin_unlock_irqrestore 16.18% [kernel] [k] __softirqentry_text_start 10.33% [kernel] [k] __cpdma_chan_free 2.62% [kernel] [k] ___bpf_prog_run 2.05% [kernel] [k] cpsw_rx_vlan_encap 2.00% [kernel] [k] kmem_cache_alloc 1.86% [kernel] [k] __netif_receive_skb_core 1.80% [kernel] [k] kmem_cache_free 1.63% [kernel] [k] cpsw_rx_handler 1.12% [kernel] [k] cpsw_rx_mq_poll 1.11% [kernel] [k] page_pool_put_page 1.04% [kernel] [k] _raw_spin_unlock 0.97% [kernel] [k] clear_bits_ll 0.90% [kernel] [k] packet_rcv 0.88% [kernel] [k] __dma_page_dev_to_cpu 0.85% [kernel] [k] kfree_skb 0.80% [kernel] [k] memset 0.71% [kernel] [k] __lock_text_start 0.66% [kernel] [k] v7_dma_inv_range 0.64% [kernel] [k] gen_pool_free_owner [iperf3 tcp] [ 5] 0.00-10.00 sec 884 MBytes 742 Mbits/sec 0 sender [ 5] 0.00-10.01 sec 878 MBytes 735 Mbits/sec receiver Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-15 11:50:43 -07:00
Yang Yingliang	0699073951	net: davinci_emac: Use devm_platform_get_and_ioremap_resource() Use devm_platform_get_and_ioremap_resource() to simplify code and avoid a null-ptr-deref by checking 'res' in it. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-10 13:02:01 -07:00
Yang Yingliang	aced6d37df	net: ethernet: ti: cpsw: Use devm_platform_get_and_ioremap_resource() Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-10 13:02:01 -07:00
Yang Yingliang	ba539319cc	net: ethernet: ti: cpsw-phy-sel: Use devm_platform_ioremap_resource_byname() Use the devm_platform_ioremap_resource_byname() helper instead of calling platform_get_resource_byname() and devm_ioremap_resource() separately. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-09 15:27:55 -07:00
Yang Yingliang	e77e2cf4a1	net: ethernet: ti: am65-cpts: Use devm_platform_ioremap_resource_byname() Use the devm_platform_ioremap_resource_byname() helper instead of calling platform_get_resource_byname() and devm_ioremap_resource() separately. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-09 15:24:43 -07:00
Jakub Kicinski	5ada57a9a6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net cdc-wdm: s/kill_urbs/poison_urbs/ to fix build Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-05-27 09:55:10 -07:00
Yang Shen	85ead77dc3	net: ti: Fix wrong struct name in comments Fixes the following W=1 kernel build warning(s): drivers/net/ethernet/ti/cpsw_ale.c:88: warning: expecting prototype for struct ale_dev_id. Prototype was for struct cpsw_ale_dev_id instead Cc: Cyril Chemparathy <cyril@ti.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Yang Shen <shenyang39@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-05-17 14:12:39 -07:00

1 2 3 4 5 ...

1028 Commits