WSL2-Linux-Kernel

Commit Graph

Author	SHA1	Message	Date
Netanel Belgazal	dd8427a78f	net/ena: change condition for host attribute configuration Move the host info config to be the first admin command that is executed. This change require the driver to remove the 'feature check' from host info configuration flow. The check is removed since the supported features bitmask field is retrieved only after calling ENA_ADMIN_DEVICE_ATTRIBUTES admin command. If set host info is not supported an error will be returned by the device. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	7102a18ac3	net/ena: change driver's default timeouts The timeouts were too agressive and sometimes cause false alarms. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	5add6e4a22	net/ena: reduce the severity of ena printouts Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	a8496eb813	net/ena: use READ_ONCE to access completion descriptors Completion descriptors are accessed from the driver and from the device. To avoid reading the old value, use READ_ONCE macro. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	b1669c9f5a	net/ena: use napi_complete_done() return value Do not unamsk interrupts if we are in busy poll mode. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	3f6159dbfc	net/ena: fix potential access to freed memory during device reset If the ena driver detects that the device is not behave as expected, it tries to reset the device. The reset flow calls ena_down, which will frees all the resources the driver allocates and then it will reset the device. This flow can cause memory corruption if the device is still writes to the driver's memory space. To overcome this potential race, move the reset before the device resources are freed. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	d81db24056	net/ena: refactor ena_get_stats64 to be atomic context safe ndo_get_stat64() can be called from atomic context, but the current implementation sends an admin command to retrieve the statistics from the device. This admin command can sleep. This patch re-factors the implementation of ena_get_stats64() to use the {rx,tx}bytes/count from the driver's inner counters, and to obtain the rx drop counter from the asynchronous keep alive (heart bit) event. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	22b331c9e0	net/ena: fix NULL dereference when removing the driver after device reset failed If for some reason the device stops responding, and the device reset failes to recover the device, the mmio register read data structure will not be reinitialized. On driver removal, the driver will also try to reset the device, but this time the mmio data structure will be NULL. To solve this issue, perform the device reset in the remove function only if the device is runnig. Crash log 54.240382] BUG: unable to handle kernel NULL pointer dereference at (null) [ 54.244186] IP: [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv] [ 54.244186] PGD 0 [ 54.244186] Oops: 0002 [#1] SMP [ 54.244186] Modules linked in: ena_drv(OE-) snd_hda_codec_generic kvm_intel kvm crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel aes_x86_64 snd_hda_controller lrw gf128mul cirrus glue_helper ablk_helper ttm snd_hda_codec drm_kms_helper cryptd snd_hwdep drm snd_pcm pvpanic snd_timer syscopyarea sysfillrect snd parport_pc sysimgblt serio_raw soundcore i2c_piix4 mac_hid lp parport psmouse floppy [ 54.244186] CPU: 5 PID: 1841 Comm: rmmod Tainted: G OE 3.16.0-031600-generic #201408031935 [ 54.244186] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 54.244186] task: ffff880135852880 ti: ffff8800bb640000 task.ti: ffff8800bb640000 [ 54.244186] RIP: 0010:[<ffffffffc067de5a>] [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv] [ 54.244186] RSP: 0018:ffff8800bb643d50 EFLAGS: 00010083 [ 54.244186] RAX: 000000000000deb0 RBX: 0000000000030d40 RCX: 0000000000000003 [ 54.244186] RDX: 0000000000000202 RSI: 0000000000000058 RDI: ffffc90000775104 [ 54.244186] RBP: ffff8800bb643d88 R08: 0000000000000000 R09: cf00000000000000 [ 54.244186] R10: 0000000fffffffe0 R11: 0000000000000001 R12: 0000000000000000 [ 54.244186] R13: ffffc90000765000 R14: ffffc90000775104 R15: 00007fca1fa98090 [ 54.244186] FS: 00007fca1f1bd740(0000) GS:ffff88013fd40000(0000) knlGS:0000000000000000 [ 54.244186] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 54.244186] CR2: 0000000000000000 CR3: 00000000b9cf6000 CR4: 00000000001406e0 [ 54.244186] Stack: [ 54.244186] 0000000000000202 0000005800000286 ffffc90000765000 ffffc90000765000 [ 54.244186] ffff880135f6b000 ffff8800b9360000 00007fca1fa98090 ffff8800bb643db8 [ 54.244186] ffffffffc0680b3d ffff8800b93608c0 ffffc90000765000 ffff880135f6b000 [ 54.244186] Call Trace: [ 54.244186] [<ffffffffc0680b3d>] ena_com_dev_reset+0x1d/0x1b0 [ena_drv] [ 54.244186] [<ffffffffc0678497>] ena_remove+0xa7/0x130 [ena_drv] [ 54.244186] [<ffffffff813d4df6>] pci_device_remove+0x46/0xc0 [ 54.244186] [<ffffffff814c3b7f>] __device_release_driver+0x7f/0xf0 [ 54.244186] [<ffffffff814c4738>] driver_detach+0xc8/0xd0 [ 54.244186] [<ffffffff814c3969>] bus_remove_driver+0x59/0xd0 [ 54.244186] [<ffffffff814c4fde>] driver_unregister+0x2e/0x60 [ 54.244186] [<ffffffff810f0a80>] ? show_refcnt+0x40/0x40 [ 54.244186] [<ffffffff813d4ec3>] pci_unregister_driver+0x23/0xa0 [ 54.244186] [<ffffffffc068413f>] ena_cleanup+0x10/0xed1 [ena_drv] [ 54.244186] [<ffffffff810f3a47>] SyS_delete_module+0x157/0x1e0 [ 54.244186] [<ffffffff81014fb7>] ? do_notify_resume+0xc7/0xd0 [ 54.244186] [<ffffffff81793fad>] system_call_fastpath+0x1a/0x1f [ 54.244186] Code: c3 4d 8d b5 04 01 01 00 4c 89 f7 e8 e1 5a 11 c1 48 89 45 c8 41 0f b7 85 00 01 01 00 8d 48 01 66 2d 52 21 66 41 89 8d 00 01 01 00 <66> 41 89 04 24 0f b7 45 d4 89 45 d0 89 c1 41 0f b7 85 00 01 01 [ 54.244186] RIP [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv] [ 54.244186] RSP <ffff8800bb643d50> [ 54.244186] CR2: 0000000000000000 [ 54.244186] ---[ end trace 18dd9889b6497810 ]--- Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	422e21e761	net/ena: fix RSS default hash configuration ENA default hash configures IPv4_frag hash twice instead of configure non-IP packets. The bug caused IPv4 fragmented packets to be calculated based on L2 source and destination address instead of L3 source and destination. IPv4 packets can reach to the wrong Rx queue. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	6e2de20ddc	net/ena: fix ethtool RSS flow configuration ena_flow_data_to_flow_hash and ena_flow_hash_to_flow_type treat the ena_flow_hash_to_flow_type enum as power of two values. Change the values of ena_admin_flow_hash_fields to be power of two values. This bug effect the ethtool set/get rxnfc. ethtool will report wrong values hash fields for get and will configure wrong hash fields in set. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	6a1ce2fb67	net/ena: fix queues number calculation The ENA driver tries to open a queue per vCPU. To determine how many vCPUs the instance have it uses num_possible_cpus() while it should have use num_online_cpus() instead. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
Netanel Belgazal	fdeea0ad87	net/ena: remove ntuple filter support from device feature list Remove NETIF_F_NTUPLE from netdev->features. The ENA device driver does not support ntuple filtering. Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 22:27:06 -05:00
David S. Miller	b66a8043d0	Merge branch 'enic-vxlan-offload' Govindarajulu Varadarajan says: ==================== enic: add vxlan offload support This series adds vxlan offload support for enic driver. The first patch adds vxlan devcmd for configuring vxland offload parameters. Second patch adds ndo_udp_tunnel_add/del and offload on rx path. There are to modes in which fw supports vxlan offload. mode 0: fcoe bit is set for encapsulated packet. fcoe_fc_crc_ok is set if checksum of csum is ok. This bit is or of ip_csum_ok and tcp_udp_csum_ok mode 2: BIT(0) in rss_hash is set if it is encapsulated packet. BIT(1) is set if outer_ip_csum_ok/ BIT(2) is set if outer_tcp_csum_ok Some hw supports only mode 0, some support mode 0 and 2. Driver gets the supported modes bitmap using get_supported_feature_ver devcmd and selects the highest mode both driver and fw supports. Third patch adds offload support on tx path by adding enic_features_check(). v2: Order local variable declarations from longest to shortest line, on all three patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:24:30 -05:00
Govindarajulu Varadarajan	9c744d1087	enic: add vxlan offload on tx path Define ndo_features_check. Hw supports offload only for ipv4 inner and ipv4 outer pkt. Code refactor for setting inner tcp pseudo csum. Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:24:29 -05:00
Govindarajulu Varadarajan	257e738238	enic: add udp_tunnel ndo for vxlan offload Defines enic_udp_tunnel_add/del for configuring vxlan tunnel offload. enic supports offload of only one ipv4/udp port. There are two modes that fw supports for vxlan offload. mode 0: fcoe bit is set for encapsulated packet. fcoe_fc_crc_ok is set if checksum of csum is ok. This bit is or of ip_csum_ok and tcp_udp_csum_ok mode 2: BIT(0) in rss_hash is set if it is encapsulated packet. BIT(1) is set if outer_ip_csum_ok/ BIT(2) is set if outer_tcp_csum_ok tcp_udp_csum_ok/ipv4_csum_ok is set if inner csum is OK. Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:24:29 -05:00
Govindarajulu Varadarajan	ca02917982	enic: add devcmds for vxlan offload This patch adds devcmds needed for vxlan offload. Implement 3 new devcmd overlay_offload_ctrl: enable/disable offload overlay_offload_cfg: update offload udp port number get_supported_feature_ver: get hw supported offload version. Each version has different bitmap for csum_ok/encap Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:24:29 -05:00
Andrew Lunn	c0e4dadb34	net: dsa: mv88e6xxx: Move forward declaration to where it is needed Move it out from the middle for the #defines to just before it is needed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:11:41 -05:00
Florian Fainelli	50f008e583	net: dsa: Fix duplicate object rule While adding switch.o to the list of DSA object files, we essentially duplicated the previous obj-y line and just added switch.o, remove the duplicate. Fixes: `f515f192ab` ("net: dsa: add switch notifier") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:11:09 -05:00
David S. Miller	165f1cc0b5	Merge branch 'qcom-emac-more-ethtool' Timur Tabi says: ==================== net: qcom/emac: add the last ethtool functions These two patches implement the remaining two ethtool functions that are of interest to the Qualcomm EMAC driver. These are the last patches that will be submitted for the 4.11 merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:09:20 -05:00
Timur Tabi	038b9404d4	net: qcom/emac: add ethtool support for setting ring parameters Implement the set_ringparam method, which allows the user to specify the size of the TX and RX descriptor rings. The values are constrained to the limits of the hardware. Since the driver does not use separate queues for mini or jumbo frames, attempts to set those values are rejected. If the interface is already running when the setting is changed, then the interface is reset. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:09:19 -05:00
Timur Tabi	c4e7beea21	net: qcom/emac: add ethtool support for reading hardware registers Implement the get_regs_len and get_regs ethtool methods. The driver returns the values of selected hardware registers. The make the register offsets known to emac_ethtool, the the register offset macros are all combined into one header file. They were inexplicably and arbitrarily split between two files. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 17:09:19 -05:00
Arnd Bergmann	15c2e10241	ARM: orion: remove unused wnr854t_switch_plat_data The other instances of this structure got removed along with the MDIO device change, but this one was left behind and needs to be removed as well: arch/arm/mach-orion5x/wnr854t-setup.c:109:44: error: 'wnr854t_switch_plat_data' defined but not used [-Werror=unused-variable] static struct dsa_platform_data __initdata wnr854t_switch_plat_data = { Fixes: `575e93f7b5` ("ARM: orion: Register DSA switch as a MDIO device") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:59:57 -05:00
David S. Miller	2cbf5b4212	Merge branch 'sctp-sender-stream-reconf-reset-add-streams' Xin Long says: ==================== sctp: add sender-side procedures for stream reconf asoc reset and add streams Patch 4/6 is to implement sender-side procedures for the SSN/TSN Reset Request Parameter described in rfc6525 section 5.1.4, patch 3/6 is ahead of it to define a function to make the request chunk for it. Patch 6/6 is to implement sender-side procedures for the Add Incoming and Outgoing Streams Request Parameter Request Parameter described in rfc6525 section 5.1.5 and 5.1.6, patch 5/6 is ahead of it to define a function to make the request chunk for it. Patch 2/6 is a fix to recover streams states when it fails to send request and Patch 1/6 is to drop some unncessary __packed from some old structures. v1->v2: - put these into a smaller group. - rename some temporary variables in the codes. - rename the titles of the commits and improve some changelogs. v2->v3: - re-split the patchset and make sure it has no dead codes for review. - move some codes into stream.c from socket.c. v3->v4: - add one more patch to fix a send reset stream request issue. - doing actual work only when request is sent successfully. - reduce some indents in sctp_send_add_streams. v4->v5: - close streams before sending request and recover them when sending fails in patch 1/5 and patch 3/5 v5->v6: - add patch 1/6 to drop some unncessary __packed from some old structures. - remove __packed from some new structures in patch 3/6 and 5/6. - define unsigned int outcnt and incnt to make codes smaller in patch 6/6. - use krealloc instead of kcalloc and remove ksize check in patch 6/6, as ksize check is acutally used in krealloc already. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:39 -05:00
Xin Long	242bd2d519	sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter This patch is to implement Sender-Side Procedures for the Add Outgoing and Incoming Streams Request Parameter described in rfc6525 section 5.1.5-5.1.6. It is also to add sockopt SCTP_ADD_STREAMS in rfc6525 section 6.3.4 for users. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:38 -05:00
Xin Long	78098117f8	sctp: add support for generating stream reconf add incoming/outgoing streams request chunk This patch is to define Add Incoming/Outgoing Streams Request Parameter described in rfc6525 section 4.5 and 4.6. They can be in one same chunk trunk as rfc6525 section 3.1-7 describes, so make them in one function. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:38 -05:00
Xin Long	a92ce1a42d	sctp: implement sender-side procedures for SSN/TSN Reset Request Parameter This patch is to implement Sender-Side Procedures for the SSN/TSN Reset Request Parameter descibed in rfc6525 section 5.1.4. It is also to add sockopt SCTP_RESET_ASSOC in rfc6525 section 6.3.3 for users. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:38 -05:00
Xin Long	c56480a1e9	sctp: add support for generating stream reconf ssn/tsn reset request chunk This patch is to define SSN/TSN Reset Request Parameter described in rfc6525 section 4.3. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:38 -05:00
Xin Long	119aecbae5	sctp: streams should be recovered when it fails to send request. Now when sending stream reset request, it closes the streams to block further xmit of data until this request is completed, then calls sctp_send_reconf to send the chunk. But if sctp_send_reconf returns err, and it doesn't recover the streams' states back, which means the request chunk would not be queued and sent, so the asoc will get stuck, streams are closed and no packet is even queued. This patch is to fix it by recovering the streams' states when it fails to send the request, it is also to fix a return value. Fixes: `7f9d68ac94` ("sctp: implement sender-side procedures for SSN Reset Request Parameter") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:38 -05:00
Xin Long	9faf1c0fd5	sctp: drop unnecessary __packed from some stream reconf structures commit `85c727b594` ("sctp: drop __packed from almost all SCTP structures") has removed __packed from almost all SCTP structures. But there still are three structures where it should be dropped. This patch is to remove it from some stream reconf structures. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:57:38 -05:00
David S. Miller	caa2858cd5	Merge branch 'sfc-more-encap-offloads' Edward Cree says: ==================== sfc: more encap offloads This patch series adds support for RX checksum offload of encapsulated packets. It also adds support for configuring the hardware's lists of UDP ports used for VXLAN and GENEVE encapsulation offloads. Since changing these lists causes the MC to reboot, the driver has been hardened against reboots, which used to be considered an exceptional occurrence but are now normal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:54 -05:00
Jon Cooper	e5fbd97764	sfc: configure UDP tunnel offload ports Implement ndo_udp_tunnel_{add,del} to update the NIC's list of VXLAN and GENEVE UDP ports. Also reset the port list to empty on driver load and on driver unload, with appropriate flag set on the unload case. These port numbers are used for RX inner checksum offload, and in future will also be used for TX inner checksum offload and encapsulated TSO. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:53 -05:00
Matthew Slattery	d4e85477cc	sfc: update mcdi_pcol definitions for MC_CMD_SET_TUNNEL_ENCAP_UDP_PORTS Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:53 -05:00
Jon Cooper	0ca2b46dbb	sfc: call mcdi_reboot_detected() when MC reboots during an MCDI command This function wasn't being called in this particular case when the MC reboots. This caused resource reallocations to not be handled properly and often ended up disabling the interface. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:53 -05:00
Jon Cooper	8a53140062	sfc: harden driver against MC resets during initial probe This is mainly to prepare for a future overlay networking patch that could cause an MC reset at probe time if the UDP tunnel port list is set immediately upon driver load. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:53 -05:00
Jon Cooper	da50ae2eae	sfc: set csum_level for encapsulated packets Set the csum_level for encapsulated packets where the encapsulation type, l3 class and l4 class are sets that need it. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:53 -05:00
Jon Cooper	a0ee354148	sfc: process RX event inner checksum flags Add support for RX checksum offload of encapsulated packets. This essentially just means paying attention to the inner checksum flags in the RX event, and if either checksum flag indicates a fail then don't tell the kernel that checksum offload was successful. Also, count these checksum errors and export the counts to ethtool -S. Test the most common "good" case of RX events with a single bitmask instead of a series of ifs. Move the more specific error checking in to a separate function for clarity, and don't use unlikely() there since we know at least one of the bits is bad. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-09 16:47:53 -05:00
Ido Schimmel	df6dd79be8	mlxsw: spectrum_router: Don't reflect LINKDOWN nexthops The kernel resolves the nexthops for a given route using FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a route with one of its nexthops being LINKDOWN. In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then we shouldn't reflect the nexthop to the device's table. Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD and reflect it to the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:43:59 -05:00
David S. Miller	d9e1661dab	Merge branch 'mlxsw-Reflect-nexthop-status-changes' Jiri Pirko says: ==================== mlxsw: Reflect nexthop status changes Ido says: When the kernel forwards IPv4 packets via multipath routes it doesn't consider nexthops that are dead or linkdown. For example, if the nexthop netdev is administratively down or doesn't have a carrier. Devices capable of offloading such multipath routes need to be made aware of changes in the reflected nexthops' status. Otherwise, the device might forward packets via non-functional nexthops, resulting in packet loss. This patchset aims to fix that. The first 11 patches deal with the necessary restructuring in the mlxsw driver, so that it's able to correctly add and remove nexthops from the device's adjacency table. The 12th patch adds the NH_{ADD,DEL} events to the FIB notification chain. These notifications are sent whenever the kernel decides to add or remove a nexthop from the forwarding plane. Finally, the last three patches add support for these events in the mlxsw driver, which is currently the only driver capable of offloading multipath routes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:19 -05:00
Ido Schimmel	9665b74562	mlxsw: spectrum_router: Flush resources when RIF is deleted When the last IP address is removed from a netdev, its RIF is deleted. However, if user didn't first remove neighbours and nexthops using this interface, then they would still be present in the device's tables. Therefore, whenever a RIF is deleted, make sure all the neighbours and nexthops (adjacency entries) using it are removed from the relevant tables as well. The action associated with any route using this RIF would be refreshed, most likely to trap. If the kernel decides to remove the route (f.e., because all the nexthops are now DEAD), then an event would be sent, causing the route to be removed from the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:19 -05:00
Ido Schimmel	ad178c8eef	mlxsw: spectrum_router: Reflect nexthop status changes When a packet hits a multipath route in the device's routing table, a hash is computed over its headers, which is then used to select the appropriate nexthop from the device's adjacency table. There are situations in which the kernel removes a nexthop from a multipath route (e.g., no carrier) and the device should do the same. Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from the device's adjacency table and refresh all the routes using the nexthop group. If all the nexthops of a multipath route are invalid, then any packet hitting the route would be trapped to the CPU for forwarding. If all the nexthops are DEAD, then the kernel would remove the route entirely. On the other hand, if all the nexthops are merely LINKDOWN, then the kernel would keep the route and forward any incoming packet using a different route. While the last case might sound like a problem, it's expected that a routing daemon running in user space would remove such a route from the FIB as it's dumped with the DEAD flag set. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	982acb9756	ipv4: fib: Notify about nexthop status changes When a multipath route is hit the kernel doesn't consider nexthops that are DEAD or LINKDOWN when IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN is set. Devices that offload multipath routes need to be made aware of nexthop status changes. Otherwise, the device will keep forwarding packets to non-functional nexthops. Add the FIB_EVENT_NH_{ADD,DEL} events to the fib notification chain, which notify capable devices when they should add or delete a nexthop from their tables. Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: David Ahern <dsa@cumulusnetworks.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	70ad35067c	mlxsw: spectrum_router: Use trap action only for some route types The device can have one of three actions associated with a route: 1) Remote - packets continue to the adjacency table 2) Local - packets continue to the neighbour table 3) Trap - packets continue to the CPU The first two actions can also trap packets to the CPU, but they do so using a different trap ID, which has a lower traffic class and less allotted bandwidth. We currently use the third action for both RTN_{LOCAL,BROADCAST} routes and RTN_UNICAST routes not pointing to the switch ports. However, packets that merely need to be forwarded by the switch are likely not control packets and can be therefore scheduled towards the CPU using a lower traffic class. Achieve the above by assigning the third action only to local and broadcast routes and have any other route use either of the first two actions, based on whether the route is gatewayed or not. This will also allow us to refresh routes using the local action and have them trap packets when their RIF is no longer valid following a NH_DEL event. One side effect of this patch is that we no longer give special treatment to multipath routes using both switch and non-switch ports towards their nexthops. If at least one of the nexthops can be resolved, then the device will forward the packets instead of trapping them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	4b41147751	mlxsw: spectrum_router: Determine offload status using generic function The previous patch introduced a generic function to determine whether a route should be offloaded or not. Make use of it here. In the future we're going to add more conditions to this test (e.g., whether TOS is non-zero), so it makes sense to centralize it instead of open coding it in a few places. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	013b20f953	mlxsw: spectrum_router: More accurately set offload flag We currently set the RTNH_F_OFFLOAD flag for all routes using remote action, but this isn't always correct. If none of the nexthops associated with a gatewayed route can be offloaded into the device, then any packet hitting it would be trapped to the CPU and forwarded by the kernel. Solve this by pushing the setting of the offload flag to after the route was programmed into the device, thereby allowing us to take all the parameters into account. This change will also help us further in the patchset, when we refresh routes following the reception of NH_{ADD,DEL} events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	a8c9701427	mlxsw: spectrum_router: Refactor nexthop init routine The nexthop init and de-init functions both have symmetric parts concerned with the reflection of the neighbour entry into the device's adjacency table, in case it's used by a gatewayed route. These sections of code also need to be called when a nexthop is marked as valid / invalid following NH_{ADD,DEL} events. Break these out into appropriate functions, so that they could be invoked following the reception of above events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	c8b030774f	mlxsw: spectrum_router: Remove FIB info from FIB entry struct After the previous changes, the FIB info is embedded in every nexthop group struct, which in turn is embedded in every FIB entry struct. We can therefore safely remove the FIB info from the entry struct. This has the added advantage of making the router-related structs more generic and suitable for use with IPv6 offloads. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	b8399a1e5a	mlxsw: spectrum_router: Store routes in a more generic way Up until now, the only FIB entries that were associated with a nexthop group were routes to remote networks where all the nexthop devices had a valid router interface (RIF). This is in contrast to the FIB code, where all the routes are associated with a FIB info. The same design choice needs to be applied to the driver's cache. Based on the NH_{ADD,DEL} events which will be added later in the patchset, we need to be able to change the action (forward / trap) associated with all the routes using the nexthop group. However, if we can't link between the nexthop and the routes using it, then the above is impossible. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	b3e8d1ebad	mlxsw: spectrum_router: Add gateway indication to nexthop group The next patch is going to generalize the way in which we store routes. Instead of attaching a nexthop group only to gatewayed routes, one will be attached to each route, in a similar way to the way the FIB code stores its routes. The above means that any function operating on a nexthop group cannot assume the group represents only gatewayed nexthops. One such function is the one that refreshes a nexthop group and updates the adjacency table following nexthop changes. For a nexthop group that doesn't represent any gateways this function would essentially be a NOP, but it would be useful if it did update the action associated with any route using it. This will allow us to later consolidate code paths when a nexthop changes following NH_{ADD,DEL} events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:16 -05:00
Ido Schimmel	d55409cb28	mlxsw: spectrum_router: Use nexthop's scope to set action type We currently use the scope of the FIB info to distinguish between a direct unicast route and a gatewayed one. However, the kernel is perfectly happy to configure a route with scope UNIVERSE to a directly connected network. Instead, we can rely on the first nexthop's scope to check if the route is gatewayed or not. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:16 -05:00
Ido Schimmel	c53b8e1b5a	mlxsw: spectrum_router: Store nexthops in a hash table Later in the patchset we'll add the NH_{ADD,DEL} events which will let us know when a nexthop is considered to be dead. Based on these events we need to be able to add or remove the nexthop from the device's tables. Therefore, store the private nexthop structs in a hash table and use the kernel's fib_nh struct as the key, so that we'll be able to easily find them when the events are received. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:16 -05:00

651056 commits (all branches)
