Граф коммитов

43415 Коммитов

Автор SHA1 Сообщение Дата
Maxim Mikityanskiy 8eaa1d1108 net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
Striding RQ uses MTT page mapping, where each page corresponds to an XSK
frame. MTT pages have alignment requirements, and XSK frames don't have
any alignment guarantees in the unaligned mode. Frames with improper
alignment must be discarded, otherwise the packet data will be written
at a wrong address.

Fixes: 282c0c798f ("net/mlx5e: Allow XSK frames smaller than a page")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20220729121356.3990867-1-maximmi@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-02 15:07:09 +02:00
Sebin Sebastian 969e26c63d net: marvell: prestera: remove reduntant code
Fixes the coverity warning 'EVALUATION_ORDER' violation. port is written
twice with the same value.

Signed-off-by: Sebin Sebastian <mailmesebin00@gmail.com>
Link: https://lore.kernel.org/r/20220801040731.34741-1-mailmesebin00@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 20:32:07 -07:00
Subbaraya Sundeep 53e99496ab octeontx2-pf: Reduce minimum mtu size to 60
PTP messages like SYNC, FOLLOW_UP, DELAY_REQ are of size 58 bytes.
Using a minimum packet length as 64 makes NIX to pad 6 bytes of
zeroes while transmission. This is causing latest ptp4l application to
emit errors since length in PTP header and received packet are not same.
Padding upto 3 bytes is fine but more than that makes ptp4l to assume
the pad bytes as a TLV. Hence reduce the size to 60 from 64.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/20220729092457.3850-1-naveenm@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 20:10:36 -07:00
Christophe JAILLET 2e8f205d91 net: txgbe: Fix an error handling path in txgbe_probe()
A pci_enable_pcie_error_reporting() should be balanced by a corresponding
pci_disable_pcie_error_reporting() call in the error handling path, as
already done in the remove function.

Fixes: 3ce7547e5b ("net: txgbe: Add build support for txgbe")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://lore.kernel.org/r/082003d00be1f05578c9c6434272ceb314609b8e.1659285240.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 12:27:12 -07:00
Jian Shen a86e86db5e net: ionic: fix error check for vlan flags in ionic_set_nic_features()
The prototype of input features of ionic_set_nic_features() is
netdev_features_t, but the vlan_flags is using the private
definition of ionic drivers. It should use the variable
ctx.cmd.lif_setattr.features, rather than features to check
the vlan flags. So fixes it.

Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 12:09:34 -07:00
Jian Shen 7dc839fe47 net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
vsi->current_netdev_flags is used store the current net device
flags, not the active netdevice features. So it should use
vsi->netdev->featurs, rather than vsi->current_netdev_flags
to check NETIF_F_HW_VLAN_CTAG_FILTER.

Fixes: 1babaf77f4 ("ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 12:09:28 -07:00
Baowen Zheng 45490ce2ff nfp: flower: add support for tunnel offload without key ID
Currently nfp driver will reject to offload tunnel key action without
tunnel key ID which means tunnel ID is 0. But it is a normal case for tc
flower since user can setup a tunnel with tunnel ID is 0.

So we need to support this case to accept tunnel key action without
tunnel key ID.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220729091641.354748-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 12:02:47 -07:00
Jakub Kicinski 9936e07eaf Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2022-07-28

Jacob Keller says:

Convert all of the Intel drivers with PTP support to the newer .adjfine
implementation which uses scaled parts per million.

This improves the precision of the frequency adjustments by taking advantage
of the full scaled parts per million input coming from user space.

In addition, all implementations are converted to using the
mul_u64_u64_div_u64 function which better handles the intermediate value.
This function supports architecture specific instructions where possible to
avoid loss of precision if the normal 64-bit multiplication would overflow.

Of note, the i40e implementation is now able to avoid loss of precision on
slower link speeds by taking advantage of this to multiply by the link speed
factor first. This results in a significantly more precise adjustment by
allowing the calculation to impact the lower bits.

This also gets us a step closer to being able to remove the .adjfreq
entirely by removing its use from many drivers.

I plan to follow this up with a series to update the drivers from other
vendors and drop the .adjfreq implementation entirely.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  igb: convert .adjfreq to .adjfine
  ixgbe: convert .adjfreq to .adjfine
  i40e: convert .adjfreq to .adjfine
  i40e: use mul_u64_u64_div_u64 for PTP frequency calculation
  e1000e: convert .adjfreq to .adjfine
  e1000e: remove unnecessary range check in e1000e_phc_adjfreq
  ice: implement adjfine with mul_u64_u64_div_u64
====================

Link: https://lore.kernel.org/r/20220728181836.3387862-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 11:15:40 -07:00
Dimitris Michailidis 8b684570ee net/funeth: Tx handling of XDP with fragments.
By now all the functions fun_xdp_tx() calls are shared with the skb path
and thus are fragment-capable. Update fun_xdp_tx(), that up to now has
been passing just one buffer, to check for fragments and call
accordingly.  This makes XDP_TX and ndo_xdp_xmit fragment-capable.

Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-01 12:38:57 +01:00
Dimitris Michailidis 1c45b0cd6c net/funeth: Unify skb/XDP packet mapping.
Instead of passing an skb to the mapping function pass an
skb_shared_info plus an additional address/length pair. This makes it
usable for both skbs and XDP. Call it from the XDP path and adjust the
skb path.

Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-01 12:38:57 +01:00
Dimitris Michailidis a3b461bbd1 net/funeth: Unify skb/XDP gather list writing.
Extract the Tx gather list writing code that skbs use into a utility
function and use it also for XDP.

Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-01 12:38:57 +01:00
Dimitris Michailidis 16ead40812 net/funeth: Unify skb/XDP Tx packet unmapping.
Current XDP unmapping is a subset of its skb analog, dealing with
only one buffer. In preparation for multi-frag XDP rename the skb
function and use it also for XDP. The XDP version is removed.

Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-01 12:38:57 +01:00
Jiri Pirko 644a66c60f net: devlink: convert reload command to take implicit devlink->lock
Convert reload command to behave the same way as the rest of the
commands and let if be called with devlink->lock held. Remove the
temporary devl_lock taking from drivers. As the DEVLINK_NL_FLAG_NO_LOCK
flag is no longer used, remove it alongside.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-01 12:14:00 +01:00
David S. Miller 9fe2e6f396 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-29

This series contains updates to iavf driver only.

Przemyslaw prevents setting of TC max rate below minimum supported values
and reports updated queue values when setting up TCs.
---
v2: Dropped patch 3 (hw-tc-offload check)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-01 12:00:53 +01:00
Jakub Kicinski 63757225a9 mlx5-updates-2022-07-28
Misc updates to mlx5 driver:
 
 1) Gal corrects to use skb_tcp_all_headers on encapsulated skbs.
 
 2) Roi Adds the support for offloading standalone police actions.
 
 3) lama, did some refactoring to minimize code coupling with
 mlx5e_priv "god object" in some of the follows, and converts some of the
 objects to pointers to preserve on memory when these objects aren't needed.
 This is part one of two parts series.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmLi98IACgkQSD+KveBX
 +j7Sxwf9ER9/fJiQQBfcBiqO0NZvhnbJ9XXPAsTOCJIU255+AcFMoHLmR3HY6yZY
 kH/3Va4AON7cH9hc71baGIGYJKi8aPwrLgiPPiWjnJ/0cHfLDihtxmV6Z+7VHEzN
 v53L9DjPqEVrH/TLzz05RwGCPrxNDJFzRIwQ84sT74ESXvuRjN1zCzgdOQGdoTGf
 770kp9J1qSQyAjgtV5JtjDg3BMlTJhWcZKFpidQ/1qaKR33sj8ncuIYVG0hlN38b
 Y0B2kJtBrKsrlUWtM89kQx54QHgAjF3aI83Pls0peLVuzrcJIMkKgKyw2AYcCuSw
 9exjRG3R4zaWbbOvgolej4/sm59rIA==
 =VDrF
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2022-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-07-28

Misc updates to mlx5 driver:

1) Gal corrects to use skb_tcp_all_headers on encapsulated skbs.

2) Roi Adds the support for offloading standalone police actions.

3) lama, did some refactoring to minimize code coupling with
mlx5e_priv "god object" in some of the follows, and converts some of the
objects to pointers to preserve on memory when these objects aren't needed.
This is part one of two parts series.

* tag 'mlx5-updates-2022-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Move mlx5e_init_l2_addr to en_main
  net/mlx5e: Split en_fs ndo's and move to en_main
  net/mlx5e: Separate mlx5e_set_rx_mode_work and move caller to en_main
  net/mlx5e: Add mdev to flow_steering struct
  net/mlx5e: Report flow steering errors with mdev err report API
  net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer
  net/mlx5e: Allocate VLAN and TC for featured profiles only
  net/mlx5e: Make mlx5e_tc_table private
  net/mlx5e: Convert mlx5e_tc_table member of mlx5e_flow_steering to pointer
  net/mlx5e: TC, Support tc action api for police
  net/mlx5e: TC, Separate get/update/replace meter functions
  net/mlx5e: Add red and green counters for metering
  net/mlx5e: TC, Allocate post meter ft per rule
  net/mlx5: DR, Add support for flow metering ASO
  net/mlx5e: Fix wrong use of skb_tcp_all_headers() with encapsulation
====================

Link: https://lore.kernel.org/r/20220728205728.143074-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:39:07 -07:00
Jakub Kicinski 84a8d931ab Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-07-28

This series contains updates to ice driver only.

Michal allows for VF true promiscuous mode to be set for multiple VFs
and adds clearing of promiscuous filters when VF trust is removed.

Maciej refactors ice_set_features() to track/check changed features
instead of constantly checking against netdev features and adds support for
NETIF_F_LOOPBACK.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: allow toggling loopback mode via ndo_set_features callback
  ice: compress branches in ice_set_features()
  ice: Fix promiscuous mode not turning off
  ice: Introduce enabling promiscuous mode on multiple VF's
====================

Link: https://lore.kernel.org/r/20220728195538.3391360-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:26:10 -07:00
Edward Cree 7267aa6d99 sfc: implement ethtool get/set RX ring size for EF100 reps
It's not truly a ring, but the maximum length of the list of queued RX
 SKBs is analogous to an RX ring size, so use that API to configure it.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:07 -07:00
Edward Cree e37f3b1561 sfc: use a dynamic m-port for representor RX and set it promisc
Representors do not want to be subject to the PF's Ethernet address
 filters, since traffic from VFs will typically have a destination
 either elsewhere on the link segment or on an overlay network.
So, create a dynamic m-port with promiscuous and all-multicast
 filters, and set it as the egress port of representor default rules.
 Since the m-port is an alias of the calling PF's own m-port, traffic
 will still be delivered to the PF's RXQs, but it will be subject to
 the VNRX filter rules installed on the dynamic m-port (specified by
 the v-port ID field of the filter spec).

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:07 -07:00
Edward Cree 77eb40749d sfc: move table locking into filter_table_{probe,remove} methods
We need to be able to drop the efx->filter_sem in ef100_filter_table_up()
 so that we can call functions that insert filters (and thus take that
 rwsem for read), which means the efx->type->filter_table_probe method
 needs to be responsible for taking the lock in the first place.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree 67ab160ed0 sfc: insert default MAE rules to connect VFs to representors
Default rules are low-priority switching rules which the hardware uses
 in the absence of higher-priority rules.  Each representor requires a
 corresponding rule matching traffic from its representee VF and
 delivering to the PF (where a check on INGRESS_MPORT in
 __ef100_rx_packet() will direct it to the representor).  No rule is
 required in the reverse direction, because representor TX uses a TX
 override descriptor to bypass the MAE and deliver directly to the VF.
Since inserting any rule into the MAE disables the firmware's own
 default rules, also insert a pair of rules to connect the PF to the
 physical network port and vice-versa.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree f50e8fcda6 sfc: receive packets from EF100 VFs into representors
If the source m-port of a packet in __ef100_rx_packet() is a VF,
 hand off the packet to the corresponding representor with
 efx_ef100_rep_rx_packet().

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree 08d0b16ecb sfc: check ef100 RX packets are from the wire
If not, for now drop them and warn.  A subsequent patch will look up
 the source m-port to try and find a representor to deliver them to.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree 6f6838aabf sfc: determine wire m-port at EF100 PF probe time
Traffic delivered to the (MAE admin) PF could be from either the wire
 or a VF.  The INGRESS_MPORT field of the RX prefix distinguishes these;
 base_mport is the value this field will have for traffic from the wire
 (which should be delivered to the PF's netdevice, not a representor).

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree 9fe00c800e sfc: ef100 representor RX top half
Representor RX uses a NAPI context driven by a 'fake interrupt': when
 the parent PF receives a packet destined for the representor, it adds
 it to an SKB list (efv->rx_list), and schedules NAPI if the 'fake
 interrupt' is primed.  The NAPI poll then pulls packets off this list
 and feeds them to the stack with netif_receive_skb_list().
This scheme allows us to decouple representor RX from the parent PF's
 RX fast-path.
This patch implements the 'top half', which builds an SKB, copies data
 into it from the RX buffer (which can then be released), adds it to
 the queue and fires the 'fake interrupt' if necessary.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:05 -07:00
Edward Cree 69bb5fa73d sfc: ef100 representor RX NAPI poll
This patch adds the 'bottom half' napi->poll routine for representor RX.
See the next patch (with the top half) for an explanation of the 'fake
 interrupt' scheme used to drive this NAPI context.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:05 -07:00
Edward Cree a95115c407 sfc: plumb ef100 representor stats
Implement .ndo_get_stats64() method to read values out of struct
 efx_rep_sw_stats.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:05 -07:00
Dan Carpenter 71930846b3 net: marvell: prestera: uninitialized variable bug
The "ret" variable needs to be initialized at the start.

Fixes: 52323ef754 ("net: marvell: prestera: add phylink support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YuKeBBuGtsmd7QdT@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 20:27:39 -07:00
Przemyslaw Patynowski 93cb804eda iavf: Fix 'tc qdisc show' listing too many queues
Fix tc qdisc show dev <ethX> root displaying too many fq_codel qdiscs.
tc_modify_qdisc, which is caller of ndo_setup_tc, expects driver to call
netif_set_real_num_tx_queues, which prepares qdiscs.
Without this patch, fq_codel qdiscs would not be adjusted to number of
queues on VF.
e.g.:
tc qdisc show dev <ethX>
qdisc mq 0: root
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit
tc qdisc show dev <ethX>
qdisc mqprio 8003: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
             queues:(0:0) (1:1)
             mode:channel
             shaper:bw_rlimit   max_rate:5Gbit 150Mbit
qdisc fq_codel 0: parent 8003:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8003:3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8003:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8003:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

While after fix:
tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit
tc qdisc show dev <ethX> #should show 2, shows 4
qdisc mqprio 8004: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
             queues:(0:0) (1:1)
             mode:channel
             shaper:bw_rlimit   max_rate:5Gbit 150Mbit
qdisc fq_codel 0: parent 8004:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent 8004:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

Fixes: d5b33d0244 ("i40evf: add ndo_setup_tc callback to i40evf")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Co-developed-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Co-developed-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-29 08:45:32 -07:00
Przemyslaw Patynowski ec60d54cb9 iavf: Fix max_rate limiting
Fix max_rate option in TC, check for proper quanta boundaries.
Check for minimum value provided and if it fits expected 50Mbps
quanta.

Without this patch, iavf could send settings for max_rate limiting
that would be accepted from by PF even the max_rate option is less
than expected 50Mbps quanta. It results in no rate limiting
on traffic as rate limiting will be floored to 0.

Example:
tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \
2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \
max_rate 50Mbps 500Mbps 500Mbps

Should limit TC0 to circa 50 Mbps

tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \
2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \
max_rate 0Mbps 100Kbit 500Mbps

Should return error

Fixes: d5b33d0244 ("i40evf: add ndo_setup_tc callback to i40evf")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-29 08:45:31 -07:00
Lorenzo Bianconi 853246dbf5 net: ethernet: mtk_eth_soc: add xdp tx return bulking support
Convert mtk_eth_soc driver to xdp_return_frame_bulk APIs.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 12:18:10 +01:00
Lorenzo Bianconi 155738a4f3 net: ethernet: mtk_eth_soc: introduce xdp multi-frag support
Add the capability to map non-linear xdp frames in XDP_TX and
ndo_xdp_xmit callback.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 12:18:10 +01:00
Lorenzo Bianconi b16fe6d82b net: ethernet: mtk_eth_soc: introduce mtk_xdp_frame_map utility routine
This is a preliminary patch to add xdp multi-frag support to mtk_eth_soc
driver

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 12:18:10 +01:00
Danielle Ratson eba28aaf2f mlxsw: spectrum: Support ethtool 'get_ts_info' callback in Spectrum-2
The 'get_ts_info' callback is used for obtaining information about
time stamping and PTP hardware clock capabilities of a network device.

The existing function of Spectrum-1 is used to advertise the PHC
capabilities and the supported RX and TX filters. Implement a similar
function for Spectrum-2, expose that the supported 'rx_filters' are all
PTP event packets, as for these packets the driver fills the time stamp
from the CQE in the SKB.

In the future, mlxsw driver will be extended to support one-step PTP in
Spectrum-2 and newer ASICs. Then additional 'tx_types' will be supported.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:24 +01:00
Danielle Ratson 08ef8bc825 mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls
The SIOCSHWTSTAMP ioctl configures HW timestamping on a given port. In
Spectrum-2 and above, each packet gets time stamp by default, but in
order to provide an accurate time stamp, software should configure to
update the correction field. In addition, the PTP traps are not enabled
by default, software should enable it per port or for all ports.

The switch behaves like a transparent clock between CPU port and each
front panel port. If ingress correction is set on a port for a given packet
type, then when such a packet is received via the port, the current time
stamp is subtracted from the correction field. If egress correction is set
on a port for a given packet type, then when such a packet is transmitted
via the port, the current time stamp is added to the correction field.

The result is that as the packet ingresses through a port with ingress
correction enabled, and egresses through a port with egress correction
enabled, the PTP correction field is updated to reflect the time that the
packet spent in the ASIC.

This can be used to update the correction field of trapped packets by
enabling ingress correction on a port where time stamping was enabled,
and egress correction on the CPU port. Similarly, for packets transmitted
from the host, ingress correction should be enabled on the CPU port, and
egress correction on a front-panel port.

However, since the correction fields will be updated for all PTP packets
crossing the CPU port, in order not to mangle the correction field, the
front panel port involved in the packet transfer must have the
corresponding correction enabled as well.

Therefore, when HW timestamping is enabled on at least one port, we have
to configure hardware to update the correction field and trap PTP event
packets on all ports.

Add reference count as part of 'struct mlxsw_sp_ptp_state', to maintain
how many ports use HW timestamping. Handle the correction field
configuration only when the first port enables time stamping and when the
last port disables time stamping. Store the configuration as part of
'struct mlxsw_sp_ptp_state', as it is global for all ports.

The SIOCGHWTSTAMP ioctl is a getter for the current configuration,
implement it and use the global configuration.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:24 +01:00
Danielle Ratson 382ad0d957 mlxsw: spectrum: Support time stamping on Spectrum-2
As opposed to Spectrum-1, in which time stamps arrive through a pair of
dedicated events into a queue and later are being matched to the
corresponding packets, in Spectrum-2 we are reading the time stamps
directly from the CQE. Software can get the time stamp in UTC format
using CQEv2.

Add a time stamp field to 'struct mlxsw_skb_cb'. In
mlxsw_pci_cqe_{rdq,sdq}_handle() extract the time stamp from the CQE into
the new time stamp field. Note that the time stamp in the CQE is
represented by 38 bits, which is a short representation of UTC time.
Software should create the full time stamp using the global UTC clock.
Read UTC clock from hardware only for PTP packets which were trapped to CPU
with PTP0 trap ID (event packets).

Use the time stamp from the SKB when packet is received or transmitted.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:24 +01:00
Danielle Ratson 24157bc69f mlxsw: Send PTP packets as data packets to overcome a limitation
In Spectrum-2 and Spectrum-3, the correction field of PTP packets which are
sent as control packets is not updated at egress port. To overcome this
limitation, PTP packets which require time stamp, should be sent as data
packets with the following details:
1. FID valid = 1
2. FID value above the maximum FID
3. rx_router_port = 1

>From Spectrum-4 and on, this limitation will be solved.

Extend the function which handles TX header, in case that the packet is
a PTP packet, add TX header with type=data and all the above mentioned
requirements. Add operation as part of 'struct mlxsw_sp_ptp_ops', to be
able to separate the handling of PTP packets between different ASICs. Use
the data packet solution only for Spectrum-2 and Spectrum-3. Therefore, add
a dedicated operation structure for Spectrum-4, as it will be same to
Spectrum-2 in PTP implementation, just will not have the limitation of
control packets.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:23 +01:00
Danielle Ratson a5bf8e5e8b mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations
Implement physical hardware clock operations. The main difference between
the existing operations of Spectrum-1 and the new operations of Spectrum-2
is the usage of UTC hardware clock instead of FRC.

Add support for init() and fini() functions for PTP clock in Spectrum-2.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:23 +01:00
Danielle Ratson bbd300570a mlxsw: Query UTC sec and nsec PCI offsets and values
Query UTC sec and nsec PCI offsets during the pci_init(), to be able to
read UTC time later.

Implement functions to read UTC seconds and nanoseconds from the offset
which was read as part of initialization.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:23 +01:00
Danielle Ratson d25ff63a18 mlxsw: spectrum_ptp: Add PTP initialization / finalization for Spectrum-2
Lay the groundwork for Spectrum-2 support. On Spectrum-2, the packets get
the time stamps from the CQE, which means that the time stamp is attached
to its packet.

Configure MTPTPT to set which message types should arrive under which
PTP trap. PTP0 will be used for event message types, which means that
the packets require time stamp. PTP1 will be used for other packets.

Note that in Spectrum-2, all packets contain time stamp by default. The two
types of traps (PTP0, PTP1) will be used to separate between PTP_EVENT
traps and PTP_GENERAL traps, so then the driver will fill the time stamp as
part of the SKB only for event message types.

Later the driver will enable the traps using 'MTPCPC.ptp_trap_en' bit.
Then, PTP packets start arriving through the PTP traps.

Currently, the structure 'mlxsw_sp2_ptp_state' contains only the common
structure, the next patches will extend it.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:23 +01:00
Danielle Ratson 42823208b9 mlxsw: Support CQEv2 for SDQ in Spectrum-2 and newer ASICs
Currently, Tx completions are reported using Completion Queue Element
version 1 (CQEv1). These elements do not contain the Tx time stamp,
which is fine as Spectrum-1 reads Tx time stamps via a dedicated FIFO
and Spectrum-2 does not currently support PTP.

In preparation for Spectrum-2 PTP support, use CQEv2 for Spectrum-2 and
newer ASICs, as this CQE format encodes the Tx time stamp.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:23 +01:00
Amit Cohen 37b62b282b mlxsw: spectrum_ptp: Add helper functions to configure PTP traps
MTPTPT register is used to set which message types should arrive under
which PTP trap. Currently, PTP0 is used for event message types, which
means that the packets require time stamp. PTP1 is used for other packets.

This configuration will be same for Spectrum-2 and newer ASICs. In
preparation for Spectrum-2 PTP support, add helper functions to
configure PTP traps and use them for Spectrum-1. These functions will be
used later also for Spectrum-2.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29 11:02:23 +01:00
Yang Li 707e304dd2 mlxsw: core_linecards: Remove duplicated include in core_linecard_dev.c
Fix following includecheck warning:
./drivers/net/ethernet/mellanox/mlxsw/core_linecard_dev.c: linux/err.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220727233801.23781-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 22:19:41 -07:00
Moshe Shemesh c90005b5f7 devlink: Hold the instance lock in health callbacks
Let the core take the devlink instance lock around health callbacks and
remove the now redundant locking in the drivers.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:47 -07:00
Moshe Shemesh d3dbdc9f8d net/mlx5: Lock mlx5 devlink health recovery callback
Change devlink instance locks in mlx5 driver to have devlink health
recovery callback locked, while keeping all driver paths which lead to
devl_ API functions called by the driver locked.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:47 -07:00
Moshe Shemesh 60d7ceea4b net/mlx4: Lock mlx4 devlink reload callback
Change devlink instance locks in mlx4 driver to have devlink reload
callback locked, while keeping all driver paths which leads to devl_ API
functions called by the mlx4 driver locked.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:47 -07:00
Moshe Shemesh a8c05514b2 net/mlx4: Use devl_ API for devlink port register / unregister
Use devl_ API to call devl_port_register() and devl_port_unregister()
instead of devlink_port_register() and devlink_port_unregister(). Add
devlink instance lock in mlx4 driver paths to these functions.

This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:47 -07:00
Moshe Shemesh 9cb7e94a78 net/mlx4: Use devl_ API for devlink region create / destroy
Use devl_ API to call devl_region_create() and devl_region_destroy()
instead of devlink_region_create() and devlink_region_destroy().
Add devlink instance lock in mlx4 driver paths to these functions.

This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:46 -07:00
Moshe Shemesh 84a433a40d net/mlx5: Lock mlx5 devlink reload callbacks
Change devlink instance locks in mlx5 driver to have devlink reload
callbacks locked, while keeping all driver paths which lead to devl_ API
functions called by the driver locked.

Add mlx5_load_one_devl_locked() and mlx5_unload_one_devl_locked() which
are used by the paths which are already locked such as devlink reload
callbacks.

This patch makes the driver use devl_ API also for traps register as
these functions are called from the driver paths parallel to reload that
requires locking now.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:46 -07:00
Moshe Shemesh c12f4c6ac3 net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload
Refactor fw reset code to have the unload driver part done on
mlx5_fw_reset_complete_reload(), so if it was called by the PF which
initiated the reload fw activate flow, the unload part will be handled
by the mlx5_devlink_reload_fw_activate() callback itself and not by the
reset event work.

This will be used by the downstream patch to invoke devlink reload
callbacks with devlink lock held.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:58:46 -07:00
vikas 5b6ff128fd bnxt_en: implement callbacks for devlink selftests
Add callbacks
=============
.selftest_check: returns true for flash selftest.
.selftest_run: runs a flash selftest.

Also, refactor NVM APIs so that they can be
used with devlink and ethtool both.

Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:56:53 -07:00
Tariq Toukan 624bf09921 net/mlx5e: kTLS, Dynamically re-size TX recycling pool
Let the TLS TX recycle pool be more flexible in size, by continuously
and dynamically allocating and releasing HW resources in response to
changes in the connections rate and load.

Allocate and release pool entries in bulks (16). Use a workqueue to
release/allocate in the background. Allocate a new bulk when the pool
size goes lower than the low threshold (1K). Symmetric operation is done
when the pool size gets greater than the upper threshold (4K).

Every idle pool entry holds: 1 TIS, 1 DEK (HW resources), in addition to
~100 bytes in host memory.

Start with an empty pool to minimize memory and HW resources waste for
non-TLS users that have the device-offload TLS enabled.

Upon a new request, in case the pool is empty, do not wait for a whole bulk
allocation to complete.  Instead, trigger an instant allocation of a single
resource to reduce latency.

Performance tests:
Before: 11,684 CPS
After:  16,556 CPS

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:50:55 -07:00
Tariq Toukan c4dfe704f5 net/mlx5e: kTLS, Recycle objects of device-offloaded TLS TX connections
The transport interface send (TIS) object is responsible for performing
all transport related operations of the transmit side.  The ConnectX HW
uses a TIS object to save and access the TLS crypto information and state
of an offloaded TX kTLS connection.

Before this patch, we used to create a new TIS per connection and destroy
it once it’s closed. Every create and destroy of a TIS is a FW command.

Same applies for the private TLS context, where we used to dynamically
allocate and free it per connection.

Resources recycling reduce the impact of the allocation/free operations
and helps speeding up the connection rate.

In this feature we maintain a pool of TX objects and use it to recycle
the resources instead of re-creating them per connection.

A cached TIS popped from the pool is updated to serve the new connection
via the fast-path HW interface, updating the tls static and progress
params. This is a very fast operation, significantly faster than FW
commands.

On recycling, a WQE fence is required after the context params change.
This guarantees that the data is sent after the context has been
successfully updated in hardware, and that the context modification
doesn't interfere with existing traffic.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:50:55 -07:00
Tariq Toukan 23b1cf1e3f net/mlx5e: kTLS, Take stats out of OOO handler
Let the caller of mlx5e_ktls_tx_handle_ooo() take care of updating the
stats, according to the returned value.  As the switch/case blocks are
already there, this change saves unnecessary branches in the handler.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:50:54 -07:00
Tariq Toukan da6682faa8 net/mlx5e: kTLS, Introduce TLS-specific create TIS
TLS TIS objects have a defined role in mapping and reaching the HW TLS
contexts.  Some standard TIS attributes (like LAG port affinity) are
not relevant for them.

Use a dedicated TLS TIS create function instead of the generic
mlx5e_create_tis.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 21:50:54 -07:00
Jakub Kicinski 272ac32f56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 18:21:16 -07:00
Lama Kayal 069448b2fd net/mlx5e: Move mlx5e_init_l2_addr to en_main
Move the function declaration of mlx5e_init_l2_addr to en/fs.h, rename
to mlx5e_fs_init_l2_addr to align with the fs API functions naming
convention and let it take mlx5e_flow_steering as arguments while keeping
implementation at en_fs.c file. This helps maintain a clean driver code
and avoids unnecessary dependencies.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:30 -07:00
Lama Kayal a02c07ea5d net/mlx5e: Split en_fs ndo's and move to en_main
Add inner callee for ndo mlx5e_vlan_rx_add_vid and
mlx5e_vlan_rx_kill_vid, to separate the priv usage from other
flow steering flows.

Move wrapper ndo's into en_main, and split the rest of the functionality
into a separate part inside en_fs.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:29 -07:00
Lama Kayal 5b031add2f net/mlx5e: Separate mlx5e_set_rx_mode_work and move caller to en_main
Separate mlx5e_set_rx_mode into two, and move caller to en_main while
keeping implementation in en_fs in the newly declared function
mlx5e_fs_set_rx_mode. This; to minimize the coupling of flow_steering
to priv.

Add a parallel boolean member vlan_strip_disable to
mlx5e_flow_steering that's updated similarly as its identical in priv,
thus making it possible to adjust the rx_mode work handler to current
changes.

Also, add state_destroy boolean to mlx5e_flow_steering struct which
replaces the old check : !test_bit(MLX5E_STATE_DESTROYING, &priv->state).
This state member is updated accordingly prior to
INIT_WORK(mlx5e_set_rx_mode_work), This is done for similar purposes as
mentioned earlier and to minimize argument passings.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:29 -07:00
Lama Kayal 7bb7071568 net/mlx5e: Add mdev to flow_steering struct
Make flow_steering struct contain mlx5_core_dev such that
it becomes self contained and easier to decouple later on this series.
Let its values be initialized in mlx5e_fs_init().

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:29 -07:00
Lama Kayal 6a7bc5d0e1 net/mlx5e: Report flow steering errors with mdev err report API
Let en_fs report errors via mdev error report API, aka mlx5_core_*
macros, thus replace the netdev API reports.
This to minimize netdev coupling to the flow steering struct.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:28 -07:00
Lama Kayal af8bbf7300 net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer
Make mlx5e_flow_steering member of mlx5e_priv a pointer.
Add dynamic allocation respectively.

Allocate fs for all profiles when initializing profile,
symmetrically deallocate at profile cleanup.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:28 -07:00
Lama Kayal 454533aa87 net/mlx5e: Allocate VLAN and TC for featured profiles only
Introduce allocation and de-allocation functions for both flow steering
VLAN and TC as part of fs API.
Add allocations of VLAN and TC as nic profile feature, such that
fs_init() will allocate both VLAN and TC only if they're featured in
the profile. VLAN and TC are relevant for nic_profile only.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:28 -07:00
Lama Kayal 23bde065c3 net/mlx5e: Make mlx5e_tc_table private
Move mlx5e_tc_table struct to en_tc.c thus make it private.
Introduce allocation and deallocation functions as part of the tc API
to allow this switch smoothly.

Convert mlx5e_nic_chain() macro to a function of en_tc.c.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:27 -07:00
Lama Kayal 65f586c273 net/mlx5e: Convert mlx5e_tc_table member of mlx5e_flow_steering to pointer
Make fs.tc be a pointer and allocate it dynamically.
Add mlx5e_priv pointer to mlx5e_tc_table, and thus get a work-around to
accessing priv via tc when handling tc events inside mlx5e_tc_netdev_event.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:27 -07:00
Roi Dayan 7d1a5ce46e net/mlx5e: TC, Support tc action api for police
Add support for tc action api for police.
Offloading standalone police action without
a tc rule and reporting stats.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:27 -07:00
Roi Dayan f8e9d413a2 net/mlx5e: TC, Separate get/update/replace meter functions
mlx5e_tc_meter_get() to get an existing meter.
mlx5e_tc_meter_update() to update an existing meter without refcount.
mlx5e_tc_meter_replace() to get/create a meter and update if needed.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:26 -07:00
Roi Dayan b50ce4350c net/mlx5e: Add red and green counters for metering
Add red and green counters per meter instance.
TC police action is implemented as a meter instance.
The meter counters represent the police action
notexceed/exceed counters.
TC rules using the same meter instance will use
the same counters.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:26 -07:00
Roi Dayan e5b1db2741 net/mlx5e: TC, Allocate post meter ft per rule
To support a TC police action notexceed counter and supporting
actions other than drop/pipe there is a need to create separate ft
and rules per rule and not to use a common one created on eswitch init.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:26 -07:00
Yevgeny Kliteynik 8920d92b8b net/mlx5: DR, Add support for flow metering ASO
Add support for ASO action of type flow metering
on device that supports STEv1.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Hamdan Igbaria <hamdani@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:25 -07:00
Gal Pressman ec082d31c1 net/mlx5e: Fix wrong use of skb_tcp_all_headers() with encapsulation
Use skb_inner_tcp_all_headers() instead of skb_tcp_all_headers() when
transmitting an encapsulated packet in mlx5e_tx_get_gso_ihs().

Fixes: 504148fedb ("net: add skb_[inner_]tcp_all_headers helpers")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:55:25 -07:00
Shay Drory 42b4f7f66a net/mlx5: Fix driver use of uninitialized timeout
Currently, driver is setting default values to all timeouts during
function setup. The offending commit is using a timeout before
function setup, meaning: the timeout is 0 (or garbage), since no
value have been set.
This may result in failure to probe the driver:
mlx5_function_setup:1034:(pid 69850): Firmware over 4294967296 MS in pre-initializing state, aborting
probe_one:1591:(pid 69850): mlx5_init_one failed with error code -16

Hence, set default values to timeouts during tout_init()

Fixes: 37ca95e62e ("net/mlx5: Increase FW pre-init timeout for health recovery")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:41 -07:00
Yevgeny Kliteynik 62d2664351 net/mlx5: DR, Fix SMFS steering info dump format
Fix several issues in SMFS steering info dump:
 - Fix outdated macro value for matcher mask in the SMFS debug dump format.
   The existing value denotes the old format of the matcher mask, as it was
   used during the early stages of development, and it results in wrong
   parsing by the steering dump parser - wrong fields are shown in the
   parsed output.
 - Add the missing destination table to the dumped action.
   The missing dest table handle breaks the ability to associate between
   the "go to table" action and the actual table in the steering info.

Fixes: 9222f0b27d ("net/mlx5: DR, Add support for dumping steering info")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:38 -07:00
Maher Sanalla a6e9085d79 net/mlx5: Adjust log_max_qp to be 18 at most
The cited commit limited log_max_qp to be 17 due to FW capabilities.
Recently, it turned out that there are old FW versions that supported
more than 17, so the cited commit caused a degradation.

Thus, set the maximum log_max_qp back to 18 as it was before the
cited commit.

Fixes: 7f839965b2 ("net/mlx5: Update log_max_qp value to be 17 at most")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:35 -07:00
Vlad Buslov c0063a4370 net/mlx5e: Modify slow path rules to go to slow fdb
While extending available range of supported chains/prios referenced commit
also modified slow path rules to go to FT chain instead of actual slow FDB.
However neither of existing users of the MLX5_ATTR_FLAG_SLOW_PATH
flag (tunnel encap entries with invalid encap and flows with trap action)
need to match on FT chain. After bridge offload was implemented packets of
such flows can also be matched by bridge priority tables which is
undesirable. Restore slow path flows implementation to redirect packets to
slow_fdb.

Fixes: 278d51f243 ("net/mlx5: E-Switch, Increase number of chains and priorities")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:32 -07:00
Maxim Mikityanskiy 677e78c8d4 net/mlx5e: Fix calculations related to max MPWQE size
Before commit 76c31e5f75 ("net/mlx5e: Use FW limitation for max MPW
WQEBBs"), the maximum size of MPWQE in WQEBBs was hardcoded as a driver
constant. That commit started using the firmware capability that can
further limit the size, however, it unintentionally changed a few
things:

1. The calculation of MLX5E_MAX_KLM_PER_WQE used the size in DS, which
was replaced by the size in WQEBBs, making the resulting value 4 times
smaller.

2. MLX5E_TX_MPW_MAX_WQEBBS used to be aligned to the cache line size
(either 64 or 128 bytes, i.e. 1 or 2 WQEBBs), but it's no longer the
case if the firmware capability is smaller than the driver maximum.

Fix both issues by using the correct units for MLX5E_MAX_KLM_PER_WQE and
by aligning mlx5e_get_sw_max_sq_mpw_wqebbs after taking the minimum.

Besides fixing the arithmetics in calculation of MLX5E_MAX_KLM_PER_WQE,
also use appropriate constants: `size of BSF * num of DS per WQEBB *
number of WQEBBs` (the calculation before the blamed commit) doesn't
make much sense to calculate the WQE size in bytes, so just use `size of
WQEBB * number of WQEBBs`.

While at it, replace the types that hold the number of WQEBBs by u8.
These values don't exceed 16, and it allows to fill holes in two
structs.

Fixes: 76c31e5f75 ("net/mlx5e: Use FW limitation for max MPW WQEBBs")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:29 -07:00
Maxim Mikityanskiy 52586d2f56 net/mlx5e: xsk: Account for XSK RQ UMRs when calculating ICOSQ size
ICOSQ is used to post UMR WQEs for both regular RQ and XSK RQ. However,
space in ICOSQ is reserved only for the regular RQ, which may cause
ICOSQ overflows when using XSK (the most risk is on activating
channels).

This commit fixes the issue by reserving space for XSK UMR WQEs as well.
As XSK may be enabled without restarting the channel and recreating the
ICOSQ, this space is reserved unconditionally.

Fixes: db05815b36 ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:26 -07:00
Maxim Mikityanskiy 562696c3c6 net/mlx5e: Fix the value of MLX5E_MAX_RQ_NUM_MTTS
MLX5E_MAX_RQ_NUM_MTTS should be the maximum value, so that
MLX5_MTT_OCTW(MLX5E_MAX_RQ_NUM_MTTS) fits into u16. The current value of
1 << 17 results in MLX5_MTT_OCTW(1 << 17) = 1 << 16, which doesn't fit
into u16. This commit replaces it with the maximum value that still
fits u16.

Fixes: 73281b78a3 ("net/mlx5e: Derive Striding RQ size from MTU")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:24 -07:00
Maor Dickman 903f2194f7 net/mlx5e: TC, Fix post_act to not match on in_port metadata
The cited commit changed CT to use multi table actions post act infrastructure instead
of using it own post act infrastructure, this broke decap during VF tunnel offload
(Stack devices) with CT due to wrong match on in_port metadata in the post act table.
This changed only broke VF tunnel offload because it modify the packet in_port metadata
to be VF metadata and it isn't propagate the post act creation.

Fixed by modify post act rules to match only on fte_id and not match on in_port metadata
which isn't needed.

Fixes: a81283263b ("net/mlx5e: Use multi table support for CT and sample actions")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:21 -07:00
Gal Pressman 115d9f95ea net/mlx5e: Remove WARN_ON when trying to offload an unsupported TLS cipher/version
The driver reports whether TX/RX TLS device offloads are supported, but
not which ciphers/versions, these should be handled by returning
-EOPNOTSUPP when .tls_dev_add() is called.

Remove the WARN_ON kernel trace when the driver gets a request to
offload a cipher/version that is not supported as it is expected.

Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-28 13:44:18 -07:00
Maciej Fijalkowski 44ece4e1a3 ice: allow toggling loopback mode via ndo_set_features callback
Add support for NETIF_F_LOOPBACK. This feature can be set via:
$ ethtool -K eth0 loopback <on|off>

Feature can be useful for local data path tests.

Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 11:44:40 -07:00
Maciej Fijalkowski c67672fa26 ice: compress branches in ice_set_features()
Instead of rather verbose comparison of current netdev->features bits vs
the incoming ones from user, let us compress them by a helper features
set that will be the result of netdev->features XOR features. This way,
current, extensive branches:

	if (features & NETIF_F_BIT && !(netdev->features & NETIF_F_BIT))
		set_feature(true);
	else if (!(features & NETIF_F_BIT) && netdev->features & NETIF_F_BIT)
		set_feature(false);

can become:

	netdev_features_t changed = netdev->features ^ features;

	if (changed & NETIF_F_BIT)
		set_feature(!!(features & NETIF_F_BIT));

This is nothing new as currently several other drivers use this
approach, which I find much more convenient.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 11:44:40 -07:00
Michal Wilczynski a419526de6 ice: Fix promiscuous mode not turning off
When trust is turned off for the VF, the expectation is that promiscuous
and allmulticast filters are removed. Currently default VSI filter is not
getting cleared in this flow.

Example:

ip link set enp236s0f0 vf 0 trust on
ip link set enp236s0f0v0 promisc on
ip link set enp236s0f0 vf 0 trust off
/* promiscuous mode is still enabled on VF0 */

Remove switch filters for both cases.
This commit fixes above behavior by removing default VSI filters and
allmulticast filters when vf-true-promisc-support is OFF.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 11:44:40 -07:00
Michal Wilczynski d7393425e7 ice: Introduce enabling promiscuous mode on multiple VF's
In current implementation default VSI switch filter is only able to
forward traffic to a single VSI. This limits promiscuous mode with
private flag 'vf-true-promisc-support' to a single VF. Enabling it on
the second VF won't work. Also allmulticast support doesn't seem to be
properly implemented when vf-true-promisc-support is true.

Use standard ice_add_rule_internal() function that already implements
forwarding to multiple VSI's instead of constructing AQ call manually.

Add switch filter for allmulticast mode when vf-true-promisc-support is
enabled. The same filter is added regardless of the flag - it doesn't
matter for this case.

Remove unnecessary fields in switch structure. From now on book keeping
will be done by ice_add_rule_internal().

Refactor unnecessarily passed function arguments.

To test:
1) Create 2 VM's, and two VF's. Attach VF's to VM's.
2) Enable promiscuous mode on both of them and check if
   traffic is seen on both of them.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 11:44:22 -07:00
Jacob Keller d8fae2504e igb: convert .adjfreq to .adjfine
The 82576 PTP implementation still uses .adjfreq instead of using the newer
.adjfine.

This implementation uses a pre-simplified calculation since the base
increment value for the 82576 is just 16 * 2^19. Converting this into
scaled_ppm is tricky, and makes the intent a bit less clear.

Simply convert to the normal flow of multiplying the base increment value
by the scaled_ppm and then dividing by 1000000ULL << 16. This can be
implemented using mul_u64_u64_div_u64 which can avoid the possible overflow
that might occur for large adjustments.

Use of .adjfine can improve the precision of small adjustments and gets us
one driver closer to removing the old implementation from the kernel
entirely.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 11:03:01 -07:00
Jacob Keller 5a5542324a ixgbe: convert .adjfreq to .adjfine
Convert the ixgbe PTP frequency adjustment implementations from .adjfreq to
.adjfine. This allows using the scaled parts per million adjustment from
the PTP core and results in a more precise adjustment for small
corrections.

To avoid overflow, use mul_u64_u64_div_u64 to perform the calculation. On
X86 platforms, this will use instructions that perform the operations with
128bit intermediate values. For other architectures, the implementation
will limit the loss of precision as much as possible.

This change slightly improves the precision of frequency adjustments for
all ixgbe based devices, and gets us one driver closer to being able to
remove the older .adjfreq implementation from the kernel.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 11:02:02 -07:00
Jacob Keller ccd3bf9859 i40e: convert .adjfreq to .adjfine
The i40e driver currently implements the .adjfreq handler for frequency
adjustments. This takes the adjustment parameter in parts per billion. The
PTP core supports .adjfine which provides an adjustment in scaled parts per
million. This has a higher resolution and can result in more precise
adjustments for small corrections.

Convert the existing .adjfreq implementation to the newer .adjfine
implementation. This is trivial since it just requires changing the divisor
from 1000000000ULL to (1000000ULL << 16) in the mul_u64_u64_div_u64 call.

This improves the precision of the adjustments and gets us one driver
closer to removing the old .adjfreq support from the kernel.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 10:59:38 -07:00
Jacob Keller 3626a690b7 i40e: use mul_u64_u64_div_u64 for PTP frequency calculation
The i40e device has a different clock rate depending on the current link
speed. This requires using a different increment rate for the PTP clock
registers. For slower link speeds, the base increment value is larger.
Directly multiplying the larger increment value by the parts per billion
adjustment might overflow.

To avoid this, the i40e implementation defaults to using the lower
increment value and then multiplying the adjustment afterwards. This causes
a loss of precision for lower link speeds.

We can fix this by using mul_u64_u64_div_u64 instead of performing the
multiplications using standard C operations. On X86, this will use special
instructions that perform the multiplication and division with 128bit
intermediate values. For other architectures, the fallback implementation
will limit the loss of precision for large values. Small adjustments don't
overflow anyways and won't lose precision at all.

This allows first multiplying the base increment value and then performing
the adjustment calculation, since we no longer fear overflowing. It also
makes it easier to convert to the even more precise .adjfine implementation
in a following change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 10:58:30 -07:00
Jacob Keller abab010f16 e1000e: convert .adjfreq to .adjfine
The PTP implementation for the e1000e driver uses the older .adjfreq
method. This method takes an adjustment in parts per billion. The newer
.adjfine implementation uses scaled_ppm. The use of scaled_ppm allows for
finer grained adjustments and is preferred over using the older
implementation.

Make use of mul_u64_u64_div_u64 in order to handle possible overflow of the
multiplication used to calculate the desired adjustment to the hardware
increment value.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 10:57:06 -07:00
Jacob Keller ab8e8db27e e1000e: remove unnecessary range check in e1000e_phc_adjfreq
The e1000e_phc_adjfreq function validates that the input delta is within
the maximum range. This is already handled by the core PTP code and this is
a duplicate and thus unnecessary check. It also complicates refactoring to
use the newer .adjfine implementation, where the input is no longer
specified in parts per billion. Remove the range validation check.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 10:55:56 -07:00
Jacob Keller 4488df1401 ice: implement adjfine with mul_u64_u64_div_u64
The PTP frequency adjustment code needs to determine an appropriate
adjustment given an input scaled_ppm adjustment.

We calculate the adjustment to the register by multiplying the base
(nominal) increment value by the scaled_ppm and then dividing by the
scaled one million value.

For very large adjustments, this might overflow. To avoid this, both the
scaled_ppm and divisor values are downshifted.

We can avoid that on X86 architectures by using mul_u64_u64_div_u64. This
helper function will perform the multiplication and division with 128bit
intermediate values. We know that scaled_ppm is never larger than the
divisor so this operation will never result in an overflow.

This improves the accuracy of the calculations for large adjustment values
on X86. It is likely an improvement on other architectures as well because
the default implementation of mul_u64_u64_div_u64 is smarter than the
original approach taken in the ice code.

Additionally, this implementation is easier to read, using fewer local
variables and lines of code to implement.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28 10:54:47 -07:00
Dan Carpenter 4d3d3a1b24 stmmac: dwmac-mediatek: fix resource leak in probe
If mediatek_dwmac_clks_config() fails, then call stmmac_remove_config_dt()
before returning.  Otherwise it is a resource leak.

Fixes: fa4b3ca60e ("stmmac: dwmac-mediatek: fix clock issue")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YuJ4aZyMUlG6yGGa@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 10:43:04 -07:00
Krzysztof Kozlowski 623cd87006 net: cdns,macb: use correct xlnx prefix for Xilinx
Use correct vendor for Xilinx versions of Cadence MACB/GEM Ethernet
controller.  The Versal compatible was not released, so it can be
changed.  Zynq-7xxx and Ultrascale+ has to be kept in new and deprecated
form.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Harini Katakam <harini.katakam@amd.com>
Link: https://lore.kernel.org/r/20220726070802.26579-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-28 13:08:53 +02:00
Dimitris Michailidis 51a83391d7 net/funeth: Fix fun_xdp_tx() and XDP packet reclaim
The current implementation of fun_xdp_tx(), used for XPD_TX, is
incorrect in that it takes an address/length pair and later releases it
with page_frag_free(). It is OK for XDP_TX but the same code is used by
ndo_xdp_xmit. In that case it loses the XDP memory type and releases the
packet incorrectly for some of the types. Assorted breakage follows.

Change fun_xdp_tx() to take xdp_frame and rely on xdp_return_frame() in
reclaim.

Fixes: db37bc177d ("net/funeth: add the data path")
Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Link: https://lore.kernel.org/r/20220726215923.7887-1-dmichail@fungible.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-28 12:54:10 +02:00
Paolo Abeni 7d85e9cb40 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: PPPoE offload support

Marcin Szycik says:

Add support for dissecting PPPoE and PPP-specific fields in flow dissector:
PPPoE session id and PPP protocol type. Add support for those fields in
tc-flower and support offloading PPPoE. Finally, add support for hardware
offload of PPPoE packets in switchdev mode in ice driver.

Example filter:
tc filter add dev $PF1 ingress protocol ppp_ses prio 1 flower pppoe_sid \
    1234 ppp_proto ip skip_sw action mirred egress redirect dev $VF1_PR

Changes in iproute2 are required to use the new fields (will be submitted
soon).

ICE COMMS DDP package is required to create a filter in ice.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: Add support for PPPoE hardware offload
  flow_offload: Introduce flow_match_pppoe
  net/sched: flower: Add PPPoE filter
  flow_dissector: Add PPPoE dissectors
====================

Link: https://lore.kernel.org/r/20220726203133.2171332-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-28 11:54:56 +02:00
Paolo Abeni 4158e38967 Revert "Merge branch 'octeontx2-minor-tc-fixes'"
This reverts commit 35d099da41, reversing
changes made to 58d8bcd47e.

I wrongly applied that to the net-next tree instead of the intended
target tree (net). Reverting it on net-next.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-28 10:55:42 +02:00
Jakub Kicinski bf84719df7 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-26

This series contains updates to ice driver only.

Przemyslaw corrects accounting for VF VLANs to allow for correct number
of VLANs for untrusted VF. He also correct issue with checksum offload
on VXLAN tunnels.

Ani allows for two VSIs to share the same MAC address.

Maciej corrects checked bits for descriptor completion of loopback

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: do not setup vlan for loopback VSI
  ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
  ice: Fix VSIs unable to share unicast MAC
  ice: Fix tunnel checksum offload with fragmented traffic
  ice: Fix max VLANs available for VF
====================

Link: https://lore.kernel.org/r/20220726204646.2171589-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 19:56:28 -07:00
Alejandro Lucero 67c3b611d9 sfc: disable softirqs for ptp TX
Sending a PTP packet can imply to use the normal TX driver datapath but
invoked from the driver's ptp worker. The kernel generic TX code
disables softirqs and preemption before calling specific driver TX code,
but the ptp worker does not. Although current ptp driver functionality
does not require it, there are several reasons for doing so:

   1) The invoked code is always executed with softirqs disabled for non
      PTP packets.
   2) Better if a ptp packet transmission is not interrupted by softirq
      handling which could lead to high latencies.
   3) netdev_xmit_more used by the TX code requires preemption to be
      disabled.

Indeed a solution for dealing with kernel preemption state based on static
kernel configuration is not possible since the introduction of dynamic
preemption level configuration at boot time using the static calls
functionality.

Fixes: f79c957a0b ("drivers: net: sfc: use netdev_xmit_more helper")
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Link: https://lore.kernel.org/r/20220726064504.49613-1-alejandro.lucero-palau@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 18:20:43 -07:00
Jiri Pirko 9ca6a7a5f4 mlxsw: core_linecards: Implement line card device flashing
Implement flash_update() devlink op for the line card devlink instance
to allow user to update line card gearbox FW using MDDT register
and mlxfw.

Example:
$ devlink dev flash auxiliary/mlxsw_core.lc.0 file mellanox/fw-AGB-rel-19_2010_1312-022-EVB.mfa2

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:44 -07:00
Jiri Pirko 3fc0c51905 mlxsw: core_linecards: Expose device PSID over device info
Use tunneled MGIR to obtain PSID of line card device and extend
device_info_get() op to fill up the info with that.

Example:

$ devlink dev info auxiliary/mlxsw_core.lc.0
auxiliary/mlxsw_core.lc.0:
  versions:
      fixed:
        hw.revision 0
        fw.psid MT_0000000749
      running:
        ini.version 4
        fw 19.2010.1312

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:36 -07:00
Jiri Pirko 8f9b0513a9 mlxsw: reg: Add Management DownStream Device Tunneling Register
The MDDT register allows to deliver query and request messages (PRM
registers, commands) to a DownStream device.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:10 -07:00
Jiri Pirko 4da0eb2a75 mlxsw: core_linecards: Probe active line cards for devices and expose FW version
In case the line card is active, go over all possible existing
devices (gearboxes) on it and expose FW version of the flashable one.

Example:

$ devlink dev info auxiliary/mlxsw_core.lc.0
auxiliary/mlxsw_core.lc.0:
  versions:
      fixed:
        hw.revision 0
      running:
        ini.version 4
        fw 19.2010.1312

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:10 -07:00
Jiri Pirko 4ea07cf638 mlxsw: reg: Extend MDDQ by device_info
Extend existing MDDQ register by possibility to query information about
devices residing on a line card.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:10 -07:00
Jiri Pirko 5ba325fec5 mlxsw: core_linecards: Expose HW revision and INI version
Implement info_get() to expose HW revision of a linecard and loaded INI
version.

Example:

$ devlink dev info auxiliary/mlxsw_core.lc.0
auxiliary/mlxsw_core.lc.0:
  versions:
      fixed:
        hw.revision 0
      running:
        ini.version 4

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:56:06 -07:00
Jiri Pirko bd02fd76d1 mlxsw: core_linecards: Introduce per line card auxiliary device
In order to be eventually able to expose line card gearbox version and
possibility to flash FW, model the line card as a separate device on
auxiliary bus.

Add the auxiliary device for provisioned line card in order to be able
to expose provisioned line card info over devlink dev info. When the
line card becomes active, there may be other additional info added to
the output.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:50:51 -07:00
Maciej Fijalkowski cc019545a2 ice: do not setup vlan for loopback VSI
Currently loopback test is failiing due to the error returned from
ice_vsi_vlan_setup(). Skip calling it when preparing loopback VSI.

Fixes: 0e674aeb0b ("ice: Add handler for ethtool selftest")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Maciej Fijalkowski 283d736ff7 ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
Tx side sets EOP and RS bits on descriptors to indicate that a
particular descriptor is the last one and needs to generate an irq when
it was sent. These bits should not be checked on completion path
regardless whether it's the Tx or the Rx. DD bit serves this purpose and
it indicates that a particular descriptor is either for Rx or was
successfully Txed. EOF is also set as loopback test does not xmit
fragmented frames.

Look at (DD | EOF) bits setting in ice_lbtest_receive_frames() instead
of EOP and RS pair.

Fixes: 0e674aeb0b ("ice: Add handler for ethtool selftest")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Anirudh Venkataramanan 5c8e3c7ff3 ice: Fix VSIs unable to share unicast MAC
The driver currently does not allow two VSIs in the same PF domain
to have the same unicast MAC address. This is incorrect in the sense
that a policy decision is being made in the driver when it must be
left to the user. This approach was causing issues when rebooting
the system with VFs spawned not being able to change their MAC addresses.
Such errors were present in dmesg:

[ 7921.068237] ice 0000:b6:00.2 ens2f2: Unicast MAC 6a:0d:e4:70:ca:d1 already
exists on this PF. Preventing setting VF 7 unicast MAC address to 6a:0d:e4:70:ca:d1

Fix that by removing this restriction. Doing this also allows
us to remove some additional code that's checking if a unicast MAC
filter already exists.

Fixes: 47ebc7b024 ("ice: Check if unicast MAC exists before setting VF MAC")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Przemyslaw Patynowski 01658aeead ice: Fix tunnel checksum offload with fragmented traffic
Fix checksum offload on VXLAN tunnels.
In case, when mpls protocol is not used, set l4 header to transport
header of skb. This fixes case, when user tries to offload checksums
of VXLAN tunneled traffic.

Steps for reproduction (requires link partner with tunnels):
ip l s enp130s0f0 up
ip a f enp130s0f0
ip a a 10.10.110.2/24 dev enp130s0f0
ip l s enp130s0f0 mtu 1600
ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev enp130s0f0 dstport 4789
ip l s vxlan12_sut up
ip a a 20.10.110.2/24 dev vxlan12_sut
iperf3 -c 20.10.110.1 #should connect

Offload params: td_offset, cd_tunnel_params were
corrupted, due to l4 header pointing wrong address. NIC would then drop
those packets internally, due to incorrect TX descriptor data,
which increased GLV_TEPC register.

Fixes: 69e66c04c6 ("ice: Add mpls+tso support")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Przemyslaw Patynowski 1e308c6fb7 ice: Fix max VLANs available for VF
Legacy VLAN implementation allows for untrusted VF to have 8 VLAN
filters, not counting VLAN 0 filters. Current VLAN_V2 implementation
lowers available filters for VF, by counting in VLAN 0 filter for both
TPIDs.
Fix this by counting only non zero VLAN filters.
Without this patch, untrusted VF would not be able to access 8 VLAN
filters.

Fixes: cc71de8fa1 ("ice: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:18 -07:00
Marcin Szycik cd8efeeed1 ice: Add support for PPPoE hardware offload
Add support for creating PPPoE filters in switchdev mode. Add support
for parsing PPPoE and PPP-specific tc options: pppoe_sid and ppp_proto.

Example filter:
tc filter add dev $PF1 ingress protocol ppp_ses prio 1 flower pppoe_sid \
    1234 ppp_proto ip skip_sw action mirred egress redirect dev $VF1_PR

Changes in iproute2 are required to use the new fields.

ICE COMMS DDP package is required to create a filter as it contains PPPoE
profiles. Added a warning message when loaded DDP package does not contain
required profiles.

Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 10:56:32 -07:00
Subbaraya Sundeep 59e1be6f83 octeontx2-pf: Fix UDP/TCP src and dst port tc filters
Check the mask for non-zero value before installing tc filters
for L4 source and destination ports. Otherwise installing a
filter for source port installs destination port too and
vice-versa.

Fixes: 1d4d9e42c2 ("octeontx2-pf: Add tc flower hardware offload on ingress traffic")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 13:05:44 +02:00
Sunil Goutham b354eaeec8 octeontx2-pf: cn10k: Fix egress ratelimit configuration
NIX_AF_TLXX_PIR/CIR register format has changed from OcteonTx2
to CN10K. CN10K supports larger burst size. Fix burst exponent
and burst mantissa configuration for CN10K.

Also fixed 'maxrate' from u32 to u64 since 'police.rate_bytes_ps'
passed by stack is also u64.

Fixes: e638a83f16 ("octeontx2-pf: TC_MATCHALL egress ratelimiting offload")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 13:05:44 +02:00
Subbaraya Sundeep d351c90ce2 octeontx2-pf: Fix UDP/TCP src and dst port tc filters
Check the mask for non-zero value before installing tc filters
for L4 source and destination ports. Otherwise installing a
filter for source port installs destination port too and
vice-versa.

Fixes: 1d4d9e42c2 ("octeontx2-pf: Add tc flower hardware offload on ingress traffic")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 12:51:52 +02:00
Sunil Goutham 5ec9c514d4 octeontx2-pf: cn10k: Fix egress ratelimit configuration
NIX_AF_TLXX_PIR/CIR register format has changed from OcteonTx2
to CN10K. CN10K supports larger burst size. Fix burst exponent
and burst mantissa configuration for CN10K.

Also fixed 'maxrate' from u32 to u64 since 'police.rate_bytes_ps'
passed by stack is also u64.

Fixes: e638a83f16 ("octeontx2-pf: TC_MATCHALL egress ratelimiting offload")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 12:51:52 +02:00
wangjianli 58d8bcd47e sfc/siena: fix repeated words in comments
Delete the redundant word 'in'.

Signed-off-by: wangjianli <wangjianli@cdjrlc.com>
Link: https://lore.kernel.org/r/20220724075207.21080-1-wangjianli@cdjrlc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 12:35:47 +02:00
wangjianli 63f1b471a0 sfc/falcon: fix repeated words in comments
Delete the redundant word 'in'.

Signed-off-by: wangjianli <wangjianli@cdjrlc.com>
Link: https://lore.kernel.org/r/20220724074746.19550-1-wangjianli@cdjrlc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 12:35:37 +02:00
Christian Marangi 3470079687 net: ethernet: stmicro: stmmac: permit MTU change with interface up
Remove the limitation where the interface needs to be down to change
MTU by releasing and opening the stmmac driver to set the new MTU.
Also call the set_filter function to correctly init the port.
This permits to remove the EBUSY error while the ethernet port is
running permitting a correct MTU change if for example a DSA request
a MTU change for a switch CPU port.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:38:57 -07:00
Christian Marangi ba39b344e9 net: ethernet: stmicro: stmmac: generate stmmac dma conf before open
Rework the driver to generate the stmmac dma_conf before stmmac_open.
This permits a function to first check if it's possible to allocate a
new dma_config and then pass it directly to __stmmac_open and "open" the
interface with the new configuration.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:38:57 -07:00
Christian Marangi 8531c80800 net: ethernet: stmicro: stmmac: move dma conf to dedicated struct
Move dma buf conf to dedicated struct. This in preparation for code
rework that will permit to allocate separate dma_conf without affecting
the priv struct.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:38:57 -07:00
Christian Marangi 7028471edb net: ethernet: stmicro: stmmac: first disable all queues and disconnect in release
Disable all queues and disconnect before tx_disable in stmmac_release to
prevent a corner case where packet may be still queued at the same time
tx_disable is called resulting in kernel panic if some packet still has
to be processed.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:38:56 -07:00
Christian Marangi f9ec5723c3 net: ethernet: stmicro: stmmac: move queue reset to dedicated functions
Move queue reset to dedicated functions. This aside from a simple
cleanup is also required to allocate a dma conf without resetting the tx
queue while the device is temporarily detached as now the reset is not
part of the dma init function and can be done later in the code flow.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:38:56 -07:00
Michal Maloszewski 5fcbb71102 i40e: Fix interface init with MSI interrupts (no MSI-X)
Fix the inability to bring an interface up on a setup with
only MSI interrupts enabled (no MSI-X).
Solution is to add a default number of QPs = 1. This is enough,
since without MSI-X support driver enables only a basic feature set.

Fixes: bc6d33c8d9 ("i40e: Fix the number of queues available to be mapped for use")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220722175401.112572-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 18:48:41 -07:00
David S. Miller 086f8246ed Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2022-07-22

This series contains updates to i40e and iavf drivers.

Przemyslaw adds a helper function for determining whether TC MQPRIO is
enabled for i40e.

Avinash utilizes the driver's bookkeeping of filters to check for
duplicate filter before sending the request to the PF for iavf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 14:09:52 +01:00
Lorenzo Bianconi 2830e31477 net: ethernet: mtk-ppe: fix traffic offload with bridged wlan
A typical flow offload scenario for OpenWrt users is routed traffic
received by the wan interface that is redirected to a wlan device
belonging to the lan bridge. Current implementation fails to
fill wdma offload info in mtk_flow_get_wdma_info() since odev device is
the local bridge. Fix the issue running dev_fill_forward_path routine in
mtk_flow_get_wdma_info in order to identify the wlan device.

Tested-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 14:04:58 +01:00
Amit Cohen a168e13f84 mlxsw: spectrum_ptp: Rename mlxsw_sp1_ptp_phc_adjfreq()
The function mlxsw_sp_ptp_phc_adjfreq() configures MTUTC register to adjust
hardware frequency by a given value.

This configuration will be same for Spectrum-2. In preparation for
Spectrum-2 PTP support, rename the function to not be Spectrum-1 specific.
Later, it will be used for Spectrum-2 also.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:56 +01:00
Amit Cohen 4017d92964 mlxsw: spectrum_ptp: Rename mlxsw_sp_ptp_get_message_types()
Spectrum-1 and Spectrum-2 differ in their time stamping capabilities.
The former can be configured to time stamp only a subset of received PTP
events (e.g., only Sync), whereas the latter will time stamp all PTP
events or none.

In preparation for Spectrum-2 PTP support, rename the function that
parses the hardware time stamping configuration upon %SIOCSHWTSTAMP to
be Spectrum-1 specific.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Amit Cohen 9bfe3c16fc mlxsw: spectrum_ptp: Use 'struct mlxsw_sp_ptp_clock' per ASIC
Currently, there is one shared structure that holds the required
structures for PTP clock. Most of the existing fields are relevant only
for Spectrum-1 (cycles, timecounter, and more). Rename the structure to
be specific for Spectrum-1 and align the existing code. Add a common
structure which includes the structures which will be used also for
Spectrum-2. This structure will be returned from clock_init() operation,
as the definition is shared between all ASICs' operations.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Amit Cohen e8fea346b5 mlxsw: spectrum_ptp: Use 'struct mlxsw_sp_ptp_state' per ASIC
Currently, there is one shared structure that holds the required
structures and details for PTP. Most of the existing fields are relevant
only for Spectrum-1 (hash table, lock for hash table, delayed work, and
more). Rename the structure to be specific for Spectrum-1 and align the
existing code. Add a common structure which includes
'struct mlxsw_sp *mlxsw_sp' and will be returned from ptp_init()
operation, as the definition is shared between all ASICs' operations.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Amit Cohen 9468322963 mlxsw: pci: Simplify FRC clock reading
Currently, the reading of FRC values (high and low) is done using macro
which calls to a function. In addition, to calculate the offset of FRC,
a simple macro is used. This code can be simplified by adding an helper
function and calculating the offset explicitly instead of using an
additional macro for that.

Add the helper function and convert the existing code. This helper will be
used later to read UTC clock.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Amit Cohen 22d950b79e mlxsw: spectrum_ptp: Initialize the clock to zero as part of initialization
As lately recommended in the mailing list[1], set the clock to zero time as
part of initialization.

The idea is that when the clock reads 'Jan 1, 1970', then it is clearly
wrong and user will not mistakenly think that the clock is set correctly.
If as part of initialization, the driver sets the clock, user might see
correct date and time (maybe with a small shift) and assume that there
is no need to sync the clock.

Fix the existing code of Spectrum-1 to set the 'timecounter' to zero.

[1]:
https://lore.kernel.org/netdev/20220201191041.GB7009@hoboy.vegasvil.org/

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 33a9583f9a mlxsw: Rename 'read_frc_capable' bit to 'read_clock_capable'
Rename the 'read_frc_capable' bit to 'read_clock_capable' since now it can
be both the FRC and UTC clocks.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Amit Cohen 448e9cb363 mlxsw: resources: Add resource identifier for maximum number of FIDs
Add a resource identifier for maximum number of FIDs so that it could be
later used to query the information from firmware.

In Spectrum-2 and Spectrum-3, the correction field of PTP packets which are
sent as control packets is not updated at egress port. To overcome this
limitation, some packets will be sent as data packets. The header should
include FID, which is supposed to be 'Max FID + port - 1'. As preparation,
add the required resource, to be able to query the value from firmware
later.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 81016180e3 mlxsw: spectrum: Fix the shift of FID field in TX header
Currently, the field FID in TX header is defined, but is not used as it is
relevant only for data packets. mlxsw driver currently sends all
host-generated traffic as control packets and not as data packets.

In Spectrum-2 and Spectrum-3, the correction field of PTP packets which
are sent as control packets is not updated at egress port. To overcome this
limitation while adding support for PTP, some packets will be sent as data
packets.

Fix the wrong shift in the definition, to allow using the field later.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 291fcb937e mlxsw: Set time stamp type as part of config profile
The type of time stamp field in the CQE is configured via the
CONFIG_PROFILE command during driver initialization. Add the definition
of the relevant fields to the command's payload and set the type to UTC
for Spectrum-2 and above. This configuration can be done as part of the
preparations to PTP support, as the type of the time stamp will not break
any existing behavior.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 577d80238f mlxsw: cmd: Add UTC related fields to query firmware command
Add UTC sec and nsec PCI BAR and offset to query firmware command for a
future use.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson aa98487cc9 mlxsw: pci_hw: Add 'time_stamp' and 'time_stamp_type' fields to CQEv2
The Completion Queue Element version 2 (CQEv2) includes various metadata
fields of packets.

Add 'time_stamp' and 'time_stamp_type' fields along with functions to
extract the seconds and nanoseconds for a future use.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 731416e9ae mlxsw: reg: Add Monitoring Time Precision Correction Port Configuration Register
In Spectrum-2, all the packets are time stamped, the MTPCPC register is
used to configure the types of packets that will adjust the correction
field and which port will trap PTP packets.

If ingress correction is set on a port for a given packet type, then
when such a packet is received via the port, the current time stamp is
subtracted from the correction field.

If egress correction is set on a port for a given packet type, then when
such a packet is transmitted via the port, the current time stamp is
added to the correction field.

Assuming the systems is configured correctly, the above means that the
correction field will contain the transient delay between the ports.

Add this register for a future use in order to support PTP in Spectrum-2.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 97b05cfb68 mlxsw: reg: Add MTUTC register's fields for supporting PTP in Spectrum-2
The MTUTC register configures the HW UTC counter.

Add the relevant fields and operations to support PTP in Spectrum-2 and
update mlxsw_reg_mtutc_pack() with the new fields for a future use.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:55 +01:00
Danielle Ratson 1c358fedec mlxsw: Rename mlxsw_reg_mtptptp_pack() to mlxsw_reg_mtptpt_pack()
The right name of the register is MTPTPT, which refers to Monitoring
Precision Time Protocol Trap Register.

Therefore, rename the function mlxsw_reg_mtptptp_pack() to
mlxsw_reg_mtptpt_pack().

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 13:58:54 +01:00
Harini Katakam 8a1c9753f1 net: macb: Update tsu clk usage in runtime suspend/resume for Versal
On Versal TSU clock cannot be disabled irrespective of whether PTP is
used. Hence introduce a new Versal config structure with a "need tsu"
caps flag and check the same in runtime_suspend/resume before cutting
off clocks.

More information on this for future reference:
This is an IP limitation on versions 1p11 and 1p12 when Qbv is enabled
(See designcfg1, bit 3). However it is better to rely on an SoC specific
check rather than the IP version because tsu clk property itself may not
represent actual HW tsu clock on some chip designs.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:29:54 +01:00
Harini Katakam 1d3ded6425 net: macb: Sort CAPS flags by bit positions
Sort capability flags by the bit position set.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:29:54 +01:00
Slark Xiao af35f95aca nfp: bpf: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:52:28 +01:00
Lorenzo Bianconi 84b9cd3890 net: ethernet: mtk_eth_soc: add support for page_pool_get_stats
Introduce support for the page_pool stats API into mtk_eth_soc driver.
Report page_pool stats through ethtool.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:38:57 +01:00
Lorenzo Bianconi 5886d26fd2 net: ethernet: mtk_eth_soc: add xmit XDP support
Introduce XDP support for XDP_TX verdict and ndo_xdp_xmit function
pointer.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:38:57 +01:00
Lorenzo Bianconi 916a6ee836 net: ethernet: mtk_eth_soc: introduce xdp ethtool counters
Report xdp stats through ethtool

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:38:57 +01:00
Lorenzo Bianconi 7c26c20da5 net: ethernet: mtk_eth_soc: add basic XDP support
Introduce basic XDP support to mtk_eth_soc driver.
Supported XDP verdicts:
- XDP_PASS
- XDP_DROP
- XDP_REDIRECT

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:38:57 +01:00
Lorenzo Bianconi 23233e577e net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers
Rely on page_pool allocator for single page buffers in order to keep
them dma mapped and add skb recycling support.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:38:57 +01:00
Jakub Kicinski 502c6f8ced Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-07-21

This series contains updates to ice driver only.

Karol adds implementation for GNSS write; data is written to the GNSS
module through TTY device using u-blox UBX protocol.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: add write functionality for GNSS TTY
  ice: add i2c write command
====================

Link: https://lore.kernel.org/r/20220721202842.3276257-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-22 21:55:57 -07:00
Jiri Pirko 1b5995e370 mlxsw: core: Fix use-after-free calling devl_unlock() in mlxsw_core_bus_device_unregister()
Do devl_unlock() before freeing the devlink in
mlxsw_core_bus_device_unregister() function.

Reported-by: Ido Schimmel <idosch@nvidia.com>
Fixes: 72a4c8c94e ("mlxsw: convert driver to use unlocked devlink API during init/fini")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220721142424.3975704-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-22 14:41:24 -07:00
Avinash Dayanand 40e589ba13 iavf: Check for duplicate TC flower filter before parsing
Record of all the TC flower filters are kept for local book keeping, so
take advantage of that and check for duplicate filter even before sending
a request to the PF driver.

Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-22 10:25:51 -07:00
Przemyslaw Patynowski 2313e69c84 i40e: Refactor tc mqprio checks
Refactor bitwise checks for whether TC MQPRIO is enabled
into one single method for improved readability.

Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-22 10:01:00 -07:00
Edward Cree 84e7fc2591 sfc: attach/detach EF100 representors along with their owning PF
Since representors piggy-back on the PF's queues for TX, they can
 only accept new TXes while the PF is up.  Thus, any operation which
 detaches the PF must first detach all its VFreps.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree f72c38fad2 sfc: hook up ef100 representor TX
Implement .ndo_start_xmit() by calling into the parent PF's TX path.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree 02443ab8c9 sfc: support passing a representor to the EF100 TX path
A non-null efv in __ef100_enqueue_skb() indicates that the packet is
 from that representor, should be transmitted with a suitable option
 descriptor (to instruct the switch to deliver it to the representee),
 and should not be accounted to the parent PF's stats or BQL.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree da56552d04 sfc: determine representee m-port for EF100 representors
An MAE port, or m-port, is a port (source/destination for traffic) on
 the Match-Action Engine (the internal switch on EF100).
Representors will use their representee's m-port for two purposes: as
 a destination override on TX from the representor, and as a source
 match in 'default rules' to steer representee traffic (when not
 matched by e.g. a TC flower rule) to representor RX via the parent
 PF's receive queue.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree e1479556f8 sfc: phys port/switch identification for ef100 reps
Requires storing VF index in struct efx_rep.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree 5687eb3466 sfc: add basic ethtool ops to ef100 reps
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree 08135eecd0 sfc: add skeleton ef100 VF representors
No net_device_ops yet, just a placeholder netdev created per VF.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree 95287e1b4e sfc: detect ef100 MAE admin privilege/capability at probe time
One PCIe function per network port (more precisely, per m-port group) is
 responsible for configuring the Match-Action Engine which performs
 switching and packet modification in the slice to support flower/OVS
 offload.  The GRP_MAE bit in the privilege mask indicates whether a
 given function has this capability.
At probe time, call MCDIs to read the calling function's privilege mask,
 and store the GRP_MAE bit in a new ef100_nic_data member.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree 8ca353da9c sfc: update EF100 register descriptions
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Juhee Kang c497885e30 net: marvell: prestera: use netif_is_any_bridge_port instead of open code
The open code which is netif_is_bridge_port() || netif_is_ovs_port() is
defined as a new helper function on netdev.h like netif_is_any_bridge_port
that can check both IFF flags in 1 go. So use netif_is_any_bridge_port()
function instead of open code. This patch doesn't change logic.

Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:35:45 +01:00
Juhee Kang 59ad24714b mlxsw: use netif_is_any_bridge_port() instead of open code
The open code which is netif_is_bridge_port() || netif_is_ovs_port() is
defined as a new helper function on netdev.h like netif_is_any_bridge_port
that can check both IFF flags in 1 go. So use netif_is_any_bridge_port()
function instead of open code. This patch doesn't change logic.

Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:35:45 +01:00
Karol Kolacinski d6b98c8d24 ice: add write functionality for GNSS TTY
Add the possibility to write raw bytes to the GNSS module through the
first TTY device. This allows user to configure the module.

Create a second read-only TTY device.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-21 13:25:17 -07:00
Jakub Kicinski 6e0e846ee2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-21 13:03:39 -07:00
Karol Kolacinski fcf9b695a5 ice: add i2c write command
Add the possibility to write to connected i2c devices using the AQ
command. FW may reject the write if the device is not on allowlist.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-21 11:22:45 -07:00
Jian Shen 09765fcd3c net: amd8111e: remove repeated dev->features assignement
It's repeated with line 1793-1795, and there isn't any other
handling for it. So remove it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Link: https://lore.kernel.org/r/20220719142424.4528-1-shenjian15@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20 21:01:16 -07:00
Jakub Kicinski 47f058ce98 mlx5-updates-2022-07-17
1) Add resiliency for lost completions for PTP TX port timestamp
 
 2) Report Header-data split state via ethtool
 
 3) Decouple HTB code from main regular TX code
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmLXFPYACgkQSD+KveBX
 +j6LXgf9EafSAwlOVVe1znXsFETQo06RnN+av4/GrH8lv9pNfv+NbZDEvryS6l9v
 AV0gTIc68cSk77le7RkNna+iZB1QnvsxgMpGtz4v9TjCmke4ZPduNuXh4Jl6yESb
 t/Bxli2IjcZenzM8iiYex3WronGtWHrraQKwtmrHDpCxTmBGvId4g3AlFoAAUENh
 u5hDtjYN2SmRfTT4J1N1y9EyoIQqXytV/jtiQqmkpxxCkwtAI8AEZuu4UrYCNogy
 xrhag3ZKPx0pqNYGbFPxZZq84GJwi6QODbBqhaqO/NqOSYrE83Vtn+ohlWUGWMhv
 KeEWwYDM0bCF3R2/ONn2xulU2dyUAg==
 =Q+YN
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2022-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-07-17

1) Add resiliency for lost completions for PTP TX port timestamp

2) Report Header-data split state via ethtool

3) Decouple HTB code from main regular TX code

* tag 'mlx5-updates-2022-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: CT: Remove warning of ignore_flow_level support for non PF
  net/mlx5e: Add resiliency for PTP TX port timestamp
  net/mlx5: Expose ts_cqe_metadata_size2wqe_counter
  net/mlx5e: HTB, move htb functions to a new file
  net/mlx5e: HTB, change functions name to follow convention
  net/mlx5e: HTB, remove priv from htb function calls
  net/mlx5e: HTB, hide and dynamically allocate mlx5e_htb structure
  net/mlx5e: HTB, move stats and max_sqs to priv
  net/mlx5e: HTB, move section comment to the right place
  net/mlx5e: HTB, move ids to selq_params struct
  net/mlx5e: HTB, reduce visibility of htb functions
  net/mlx5e: Fix mqprio_rl handling on devlink reload
  net/mlx5e: Report header-data split state through ethtool
====================

Link: https://lore.kernel.org/r/20220719203529.51151-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20 17:54:46 -07:00
Ido Schimmel e5ec6a2513 mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
mlxsw needs to distinguish nexthops with a gateway from connected
nexthops in order to write the former to the adjacency table of the
device. The check used to rely on the fact that nexthops with a gateway
have a 'link' scope whereas connected nexthops have a 'host' scope. This
is no longer correct after commit 747c143072 ("ip: fix dflt addr
selection for connected nexthop").

Fix that by instead checking the address family of the gateway IP. This
is a more direct way and also consistent with the IPv6 counterpart in
mlxsw_sp_rt6_is_gateway().

Cc: stable@vger.kernel.org
Fixes: 747c143072 ("ip: fix dflt addr selection for connected nexthop")
Fixes: 597cfe4fc3 ("nexthop: Add support for IPv4 nexthops")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 11:00:46 +01:00
Oleksandr Mazur 52323ef754 net: marvell: prestera: add phylink support
For SFP port prestera driver will use kernel
phylink infrastucture to configure port mode based on
the module that has beed inserted

Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Co-developed-by: Taras Chornyi <taras.chornyi@plvision.eu>
Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:24:40 +01:00
Kuniyuki Iwashima 3666f666e9 tcp: Fix data-races around sysctl knobs related to SYN option.
While reading these knobs, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - tcp_sack
  - tcp_window_scaling
  - tcp_timestamps

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima 8895a9c2ac ipv4: Fix data-races around sysctl_fib_multipath_hash_fields.
While reading sysctl_fib_multipath_hash_fields, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: ce5c9c20d3 ("ipv4: Add a sysctl to control multipath hash fields")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima 7998c12a08 ipv4: Fix data-races around sysctl_fib_multipath_hash_policy.
While reading sysctl_fib_multipath_hash_policy, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: bf4e0a3db9 ("net: ipv4: add support for ECMP hash policy choice")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Jakub Kicinski c2fe9ec397 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2022-07-18

This series contains updates to igc driver only.

Kurt Kanzenbach adds support for Qbv schedules where one queue stays open
in consecutive entries.

Sasha removes an unused define and field.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  igc: Remove forced_speed_duplex value
  igc: Remove MSI-X PBA Clear register
  igc: Lift TAPRIO schedule restriction
====================

Link: https://lore.kernel.org/r/20220718180109.4114540-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:45:05 -07:00
Jakub Kicinski 48ea8ea32d Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-18

This series contains updates to iavf driver only.

Przemyslaw fixes handling of multiple VLAN requests to account for
individual errors instead of rejecting them all. He removes incorrect
implementations of ETHTOOL_COALESCE_MAX_FRAMES and
ETHTOOL_COALESCE_MAX_FRAMES_IRQ.

He also corrects an issue with NULL pointer caused by improper handling of
dummy receive descriptors. Finally, he corrects debug prints reporting an
unknown state.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: Fix missing state logs
  iavf: Fix handling of dummy receive descriptors
  iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq
  iavf: Fix VLAN_V2 addition/rejection
====================

Link: https://lore.kernel.org/r/20220718174807.4113582-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:43:02 -07:00
Lorenzo Bianconi 53eb9b0456 net: ethernet: mtk_ppe: fix possible NULL pointer dereference in mtk_flow_get_wdma_info
odev pointer can be NULL in mtk_flow_offload_replace routine according
to the flower action rules. Fix possible NULL pointer dereference in
mtk_flow_get_wdma_info.

Fixes: a333215e10 ("net: ethernet: mtk_eth_soc: implement flow offloading to WED devices")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/4e1685bc4976e21e364055f6bee86261f8f9ee93.1658137753.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:27:18 -07:00
Edward Cree 1f17708b47 sfc: update MCDI protocol headers
Link: https://lore.kernel.org/r/cover.1657878101.git.ecree.xilinx@gmail.com
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 13:37:04 -07:00
Roi Dayan 22df2e9362 net/mlx5: CT: Remove warning of ignore_flow_level support for non PF
ignore_flow_level isn't supported for SFs, and so it causes
post_act and ct to warn about it per SF.
Apply the warning only for PF.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:54 -07:00
Aya Levin 58a518948f net/mlx5e: Add resiliency for PTP TX port timestamp
PTP TX port timestamp relies on receiving 2 CQEs for each outgoing
packet (WQE). The regular CQE has a less accurate timestamp than the
wire CQE. On link change, the wire CQE may get lost. Let the driver
detect and restore the relation between the CQEs, and re-sync after
timeout.

Add resiliency for this as follows: add id (producer counter)
into the WQE's metadata. This id will be received in the wire
CQE (in wqe_counter field). On handling the wire CQE, if there is no
match, replay the PTP application with the time-stamp from the regular
CQE and restore the sync between the CQEs and their SKBs. This patch
adds 2 ptp counters:
1) ptp_cq0_resync_event: number of times a mismatch was detected between
   the regular CQE and the wire CQE.
2) ptp_cq0_resync_cqe: total amount of missing wire CQEs.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:54 -07:00
Moshe Tal 462b005999 net/mlx5e: HTB, move htb functions to a new file
Move htb related functions and data to a separated file for better
encapsulation.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:53 -07:00
Moshe Tal 3685eed56f net/mlx5e: HTB, change functions name to follow convention
Following the change of the functions to be object like, change also
the names.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:52 -07:00
Moshe Tal 28df4a0117 net/mlx5e: HTB, remove priv from htb function calls
As a step to make htb self-contained replace the passing of priv as a
parameter to htb function calls with members in the htb struct.

Full decoupling the htb from priv will require more work, so for now
leave the priv as one of the members in the htb struct, to be replaced
by channels in a future commit.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:52 -07:00
Saeed Mahameed aaffda6b36 net/mlx5e: HTB, hide and dynamically allocate mlx5e_htb structure
Move structure mlx5e_htb from the main driver include file "en.h" to be
hidden in qos.c where the qos functionality is implemented, forward
declare it for the rest of the driver and allocate it dynamically upon
user demand only.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
2022-07-19 13:32:51 -07:00
Moshe Tal db83f24d89 net/mlx5e: HTB, move stats and max_sqs to priv
Preparation for dynamic allocation of the HTB struct.
The statistics should be preserved even when the struct is de-allocated.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:51 -07:00
Moshe Tal 66d9593648 net/mlx5e: HTB, move section comment to the right place
mlx5e_get_qos_sq is a part of the SQ lifecycle, so need be under the
title.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:50 -07:00
Moshe Tal 4f8d1d3adc net/mlx5e: HTB, move ids to selq_params struct
HTB id fields are needed for selecting queue. Moving them to the
selq_params struct will simplify synchronization between control flow
and mlx5e_select_queues and will keep the IDs in the hot cacheline of
mlx5e_selq_params.

Replace mlx5e_selq_prepare() with separate functions that change subsets
of parameters, while keeping the rest.

This also will be useful to hide mlx5e_htb structure from the rest of the
driver in a later patch in this series.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
2022-07-19 13:32:50 -07:00
Saeed Mahameed efe317997e net/mlx5e: HTB, reduce visibility of htb functions
No need to expose all htb tc functions to the main driver file,
expose only the master htb tc function mlx5e_htb_setup_tc()
which selects the internal "now static" function to call.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
2022-07-19 13:32:49 -07:00
Moshe Tal 0bb7228f70 net/mlx5e: Fix mqprio_rl handling on devlink reload
Keep mqprio_rl data to params and restore the configuration in case of
devlink reload.
Change the location of mqprio_rl resources cleanup so it will be done
also in reload flow.

Also, remove the rl pointer from the params, since this is dynamic object
and saved to priv.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:49 -07:00
Gal Pressman 07071e47da net/mlx5e: Report header-data split state through ethtool
HW-GRO (SHAMPO) packet merger scheme implies header-data split in the
driver, report it through the ethtool interface.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:48 -07:00
Hristo Venev d7241f679a be2net: Fix buffer overflow in be_get_module_eeprom
be_cmd_read_port_transceiver_data assumes that it is given a buffer that
is at least PAGE_DATA_LEN long, or twice that if the module supports SFF
8472. However, this is not always the case.

Fix this by passing the desired offset and length to
be_cmd_read_port_transceiver_data so that we only copy the bytes once.

Fixes: e36edd9d26 ("be2net: add ethtool "-m" option support")
Signed-off-by: Hristo Venev <hristo@venev.name>
Link: https://lore.kernel.org/r/20220716085134.6095-1-hristo@venev.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 11:51:16 +02:00
Maksym Glubokiy 3c6aca3333 net: prestera: acl: add support for 'police' action on egress
Propagate ingress/egress direction for 'police' action down to hardware.

Co-developed-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Link: https://lore.kernel.org/r/20220714083541.1973919-1-maksym.glubokiy@plvision.eu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 09:53:40 +02:00
Jakub Kicinski e22c88799f Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-07-15

This series contains updates to ice driver only.

Ani updates feature restriction for devices that don't support external
time stamping.

Zhuo Chen removes unnecessary call to pci_aer_clear_nonfatal_status().

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: Remove pci_aer_clear_nonfatal_status() call
  ice: Add EXTTS feature to the feature bitmap
====================

Link: https://lore.kernel.org/r/20220715214642.2968799-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:39:54 -07:00
Ben Dooks 6ee49d629d net: macb: fixup sparse warnings on __be16 ports
The port fields in the ethool flow structures are defined
to be __be16 types, so sparse is showing issues where these
are being passed to htons(). Fix these warnings by passing
them to be16_to_cpu() instead.

These are being used in netdev_dbg() so should only effect
anyone doing debug.

Fixes the following sparse warnings:

drivers/net/ethernet/cadence/macb_main.c:3366:9: warning: cast from restricted __be16
drivers/net/ethernet/cadence/macb_main.c:3366:9: warning: cast from restricted __be16
drivers/net/ethernet/cadence/macb_main.c:3366:9: warning: cast from restricted __be16
drivers/net/ethernet/cadence/macb_main.c:3419:25: warning: cast from restricted __be16
drivers/net/ethernet/cadence/macb_main.c:3419:25: warning: cast from restricted __be16
drivers/net/ethernet/cadence/macb_main.c:3419:25: warning: cast from restricted __be16
drivers/net/ethernet/cadence/macb_main.c:3419:25: warning: cast from restricted __be16

Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
Link: https://lore.kernel.org/r/20220715173009.526126-1-ben.dooks@sifive.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:39:22 -07:00
Maksym Glubokiy 71c47aa98c net: prestera: acl: fix code formatting
Make the code look better.

Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Link: https://lore.kernel.org/r/20220715103806.7108-1-maksym.glubokiy@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:39:06 -07:00
Wong Vee Khee da791bac10 net: stmmac: remove redunctant disable xPCS EEE call
Disable is done in stmmac_init_eee() on the event of MAC link down.
Since setting enable/disable EEE via ethtool will eventually trigger
a MAC down, removing this redunctant call in stmmac_ethtool.c to avoid
calling xpcs_config_eee() twice.

Fixes: d4aeaed80b ("net: stmmac: trigger PCS EEE to turn off on link down")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Link: https://lore.kernel.org/r/20220715122402.1017470-1-vee.khee.wong@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:15:45 -07:00
Piotr Skajewski 1e53834ce5 ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero
It is possible to disable VFs while the PF driver is processing requests
from the VF driver.  This can result in a panic.

BUG: unable to handle kernel paging request at 000000000000106c
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I      --------- -
Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020
RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe]
Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff
01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c
00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a
RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002
RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b
RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006
RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780
R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020
R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80
FS:  0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 ? ttwu_do_wakeup+0x19/0x140
 ? try_to_wake_up+0x1cd/0x550
 ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf]
 ixgbe_msix_other+0x17e/0x310 [ixgbe]
 __handle_irq_event_percpu+0x40/0x180
 handle_irq_event_percpu+0x30/0x80
 handle_irq_event+0x36/0x53
 handle_edge_irq+0x82/0x190
 handle_irq+0x1c/0x30
 do_IRQ+0x49/0xd0
 common_interrupt+0xf/0xf

This can be eventually be reproduced with the following script:

while :
do
    echo 63 > /sys/class/net/<devname>/device/sriov_numvfs
    sleep 1
    echo 0 > /sys/class/net/<devname>/device/sriov_numvfs
    sleep 1
done

Add lock when disabling SR-IOV to prevent process VF mailbox communication.

Fixes: d773d13106 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned")
Signed-off-by: Piotr Skajewski <piotrx.skajewski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220715214456.2968711-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:13:40 -07:00
Dawid Lukwinski f838a63369 i40e: Fix erroneous adapter reinitialization during recovery process
Fix an issue when driver incorrectly detects state
of recovery process and erroneously reinitializes interrupts,
which results in a kernel error and call trace message.

The issue was caused by a combination of two factors:
1. Assuming the EMP reset issued after completing
firmware recovery means the whole recovery process is complete.
2. Erroneous reinitialization of interrupt vector after detecting
the above mentioned EMP reset.

Fixes (1) by changing how recovery state change is detected
and (2) by adjusting the conditional expression to ensure using proper
interrupt reinitialization method, depending on the situation.

Fixes: 4ff0ee1af0 ("i40e: Introduce recovery mode support")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220715214542.2968762-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:13:22 -07:00
Tom Rix 3696c952da net: ethernet: mtk_eth_soc: fix off by one check of ARRAY_SIZE
In mtk_wed_tx_ring_setup(.., int idx, ..), idx is used as an index here
  struct mtk_wed_ring *ring = &dev->tx_ring[idx];

The bounds of idx are checked here
  BUG_ON(idx > ARRAY_SIZE(dev->tx_ring));

If idx is the size of the array, it will pass this check and overflow.
So change the check to >= .

Fixes: 804775dfc2 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)")
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220716214654.1540240-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:12:04 -07:00
Jiri Pirko 72a4c8c94e mlxsw: convert driver to use unlocked devlink API during init/fini
Prepare for devlink reload being called with devlink->lock held and
convert the mlxsw driver to use unlocked devlink API during init and
fini flows. Take devl_lock() in reload_down() and reload_up() ops in the
meantime before reload cmd is converted to take the lock itself.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:10:47 -07:00
Horatiu Vultur 675c807ae2 net: lan966x: Fix usage of lan966x->mac_lock when used by FDB
When the SW bridge was trying to add/remove entries to/from HW, the
access to HW was not protected by any lock. In this way, it was
possible to have race conditions.
Fix this by using the lan966x->mac_lock to protect parallel access to HW
for this cases.

Fixes: 25ee9561ec ("net: lan966x: More MAC table functionality")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:00:00 -07:00
Horatiu Vultur c192468436 net: lan966x: Fix usage of lan966x->mac_lock inside lan966x_mac_irq_handler
The problem with this spin lock is that it was just protecting the list
of the MAC entries in SW and not also the access to the MAC entries in HW.
Because the access to HW is indirect, then it could happen to have race
conditions.
For example when SW introduced an entry in MAC table and the irq mac is
trying to read something from the MAC.
Update such that also the access to MAC entries in HW is protected by
this lock.

Fixes: 5ccd66e01c ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:00:00 -07:00