Граф коммитов

649397 Коммитов

Автор SHA1 Сообщение Дата
stephen hemminger b58a185801 netvsc: simplify get next send section
Use kernel for_each_clear_bit macro to simplify finding next
available send section.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:01 -05:00
Simon Xiao 6c80f3fc23 netvsc: report per-channel stats in ethtool statistics
Report packets and bytes transferred through a vmbus channel via ethtool.
This supersedes need for per-cpu statistics.

Example:
$ ethtool -S eth0
NIC statistics:
...
     tx_queue_0_packets: 3523179
     tx_queue_0_bytes: 505370920
     rx_queue_0_packets: 41430490
     rx_queue_0_bytes: 62714661254
     tx_queue_1_packets: 0
     tx_queue_1_bytes: 0
     rx_queue_1_packets: 0
     rx_queue_1_bytes: 0
...

Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Simon Xiao <sixiao@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:01 -05:00
stephen hemminger 793e395555 netvsc: account for packets/bytes transmitted after completion
Most drivers do not increment transmit statistics until after the
transmit is completed. This will also be necessary for BQL support.

Slight additional complexity because the netvsc driver aggregates
multiple packets into one transmit.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:01 -05:00
stephen hemminger 46b4f7f5d1 netvsc: eliminate per-device outstanding send counter
Since now keep track of per-queue outstanding sends, we can avoid
one atomic update by removing no longer needed per-device atomic.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:01 -05:00
stephen hemminger 2289f0aa70 netvsc: simplify rndis_filter_remove
All caller's already have pointer to netvsc_device so pass it.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:00 -05:00
stephen hemminger 2c7f83ca71 netvsc: don't pass void * to internal device_add
All the caller's/callee's know that the format of the device_add
parameter is a netvsc_device_info struct.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:00 -05:00
stephen hemminger dc54a08cd3 netvsc: optimize receive path
Do manual optimizations of receive path:
  - remove checks for impossible conditions (but keep checks
    for bad data from host)
  - pass argument down, rather than having callee recompute what
    is already known
  - remove indirection about receive buffer datalength
  - remove dependence on VLAN_TAG_PRESENCE
  - use _hot/_cold and likely/unlikely

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:29:00 -05:00
stephen hemminger b8b835a89b netvsc: group all per-channel state together
Put all the per-channel state together in one data struct.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:59 -05:00
stephen hemminger ceaaea0483 netvsc: remove unused variables
Fixes set but never used warnings

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:59 -05:00
stephen hemminger d8e18ee0fa netvsc: enhance transmit select_queue
The netvsc select queue function was missing many of the flow caching
features that exist in default tx queue selection. Add the same
logic to remember queue based on socket and implement two level
mapping (like RSS).

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:59 -05:00
stephen hemminger ff4a441990 netvsc: allow get/set of RSS indirection table
Allow setting receive indirection table. Also uses the system standard
for initialization.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:59 -05:00
stephen hemminger 2b01888d1b netvsc: allow more flexible setting of number of channels
This allows for number of channels to be managed in a manner similar
to existing hardware drivers. It also removes the restriction of
maximum 8 channels and allows as many as the host will allow.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:58 -05:00
stephen hemminger 962f3fee83 netvsc: add ethtool ops to get/set RSS key
For some cases it is useful to be able to change RSS key value.
For example, replacing RSS key with a symmetric hash.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:58 -05:00
stephen hemminger b5a5dc8dc8 netvsc: report rss field values
Report current components used in RSS hash.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:58 -05:00
stephen hemminger b448f4e892 netvsc: report number of rx queues in ethtool
Report actual number of receive queues to ethtool.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:58 -05:00
stephen hemminger 23312a3be9 netvsc: negotiate checksum and segmentation parameters
Redo how Hyper-V network driver negotiates offload features. Query the
host to determine offload settings, and use the result.

Also:
  * disable IPv4 header checksum offload (not used by Linux)
  * enable TSO only if host supports
  * enable UDP checksum offload if supported
  * don't advertise support for checksumming of non-IP protocols
  * adjust GSO maximum segment size
  * enable HIGHDMA

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:57 -05:00
stephen hemminger 0b307ebd68 netvsc: remove no longer needed receive staging buffers
The ring buffer mapping now handles the wraparound case
inside get_next_pkt_raw. Therefore it is not necessary to have an
additional special receive staging buffer.

See commit 1562edaed8c164ca5199 ("Drivers: hv: ring_buffer: count on
wrap around mappings")

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 16:28:57 -05:00
David S. Miller f2ceab0baf Merge branch 'mv88e6xxx-external-MDIO'
Andrew Lunn says:

====================
External MDIO support for mv88e6xxx

The mv88e6390 family of switches has two MDIO busses, one internal to
the switch and a second one for external usage. Older generations of switches
have a single MDIO bus, which is used both internally and externally.

Refactor the existing MDIO driver code to allow for multiple MDIO
busses, and implement the second MDIO bus on mv88e6390.

This is a rewrite of a patch previously submitted as part of "Batch
3". It has been broken up into 5 smaller patches. A compatible string
is now used in the device tree to indicate the external MDIO bus.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:33:52 -05:00
Andrew Lunn c61a6a71e7 net: dsa: mv88e6xxx: Implement the 6390 external MDIO bus
With all the infrastructure in place, implement access to the external
MDIO bus on the 6390 family.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:33:51 -05:00
Andrew Lunn a3c53be55c net: dsa: mv88e6xxx: Support multiple MDIO busses
The mv88e6390 has multiple MDIO busses. Generalize the parsing of the
device tree to support multiple mdio nodes. The external mdio bus has
a compatible strings to indicate it is external.

Keep a linked list of busses, placing the external mdio bus at the
tail of the list. When within the driver an mdio bus is needed,
e.g. for EEE or SERDES, use the head of the list which should be the
internal bus.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:33:51 -05:00
Andrew Lunn 0dd12d54c4 net: dsa: mv88e6xxx: Add mdio private structure
Have the MDIO bus driver code allocate a private structure and make
the chip a member of it. This will allow us to add further members in
the future.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:33:51 -05:00
Andrew Lunn ee26a2284b net: dsa: mv88e6xxx: Pass mii_bus to all PHY operations
In preparation for supporting multiple MDIO busses, pass the mii_bus
structure to all PHY operations. It will in future then be clear on
which MDIO bus the operation should be performed.

For reads/write from phylib, the mii_bus is readily available. However
some internal code also access the PHY, e.g. for EEE and SERDES. Make
this code use the one and only currently available MDIO bus.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:33:50 -05:00
Andrew Lunn efb3e74da2 net: dsa: mv88e6xxx: Abstract mv88e6165 PHY operations
The mv88e6165 family has the internal PHYs mapped directly onto the
SMI register space as the switch. So the registers can be read
directly. Put a wrapper around this, in preparation for changing the
signature in order to support the external MDIO bus of the 6390.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:33:50 -05:00
Colin Ian King 1464086939 net: sctp: fix array overrun read on sctp_timer_tbl
Table sctp_timer_tbl is missing a TIMEOUT_RECONF string so
add this in. Also compare timeout with the size of the array
sctp_timer_tbl rather than SCTP_EVENT_TIMEOUT_MAX.  Also add
a build time check that SCTP_EVENT_TIMEOUT_MAX is correct
so we don't ever get this kind of mismatch between the table
and SCTP_EVENT_TIMEOUT_MAX in the future.

Kudos to Marcelo Ricardo Leitner for spotting the missing string
and suggesting the build time sanity check.

Fixes CoverityScan CID#1397639 ("Out-of-bounds read")

Fixes: 7b9438de0c ("sctp: add stream reconf timer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:24:35 -05:00
David S. Miller 7110fe471e Merge branch 'aquantia'
David VomLehn says:

====================
net: ethernet: aquantia: Add AQtion 2.5/5 GB NIC driver

This series introduces the AQtion NIC driver for the aQuantia
AQC107/AQC108 network devices.

v1: Initial version
v2: o Make necessary drivers/net/ethernet changes to integrate software
    o Drop intermediate atlantic directory
    o Remove Makefile things only appropriate to out of tree module
      building
v3: o Move changes to drivers/net/ethernet/{Kconfig,Makefile} to the last
      patch to ensure clean bisection.
    o Removed inline attribute aq_hw_write_req() as it was defined in
      only one .c file.
    o #included pci.h in aq_common.h to get struct pci definition.
    o Modified code to unlock based execution flow rather than using a
      flag.
    o Made a number of functions that were only used in a single file
      static.
    o Cleaned up error and return code handling in various places.
    o Remove AQ_CFG_IP_ALIGN definition.
    o Other minor code clean up.
v4: o Using do_div for 64 bit division.
    o Modified NIC statistics code.
    o Using build_skb instead netdev_alloc_skb for single fragment packets.
    o Removed extra aq_nic.o from Makefile
v5: o Removed extra newline at the end of the files.
v6: o Removed unnecessary cast from void*.
    o Reworked strings array for ethtool statistics.
    o Added stringset == ETH_SS_STATS checking.
    o AQ_OBJ_HEADER replaced to aq_obj_header_s struct.
    o AQ_OBJ_SET/TST/CLR macroses replaced to inline functions.
    o Driver sources placed in to atlantic directory.
    o Fixed compilation warnings (Make W=1)
    o Added firmware version checking.
    o Code cleaning.
v7  o Removed unnecessary cast from memory allocation function (aq_ring.c).
v8  o Switched to using kcalloc instead kzalloc.
    o Now provide bus_info for ethtool
    o Used div() to avoid __bad_udelay build error.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:42 -05:00
David VomLehn aa13f7cedd net: ethernet: aquantia: Integrate AQtion 2.5/5 GB NIC driver
Modify the drivers/net/ethernet/{Makefile,Kconfig} file to make them a
part of the network drivers build.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:41 -05:00
David VomLehn ee6d6d0055 net: ethernet: aquantia: Receive side scaling
Add definitions that support receive side scaling.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:41 -05:00
David VomLehn c5760d03d4 net: ethernet: aquantia: Ethtool support
Add the driver interfaces required for support by the ethtool utility.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:41 -05:00
David VomLehn 753f4783be net: ethernet: aquantia: Hardware interface and utility functions
Add functions to interface with the hardware and some utility functions.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:40 -05:00
David VomLehn 98c4c20142 net: ethernet: aquantia: Atlantic hardware abstraction layer
Add common functions for Atlantic hardware abstraction layer.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:40 -05:00
David VomLehn a4d36e20d0 net: ethernet: aquantia: PCI operations
Add functions that handle the PCI bus interface.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:40 -05:00
David VomLehn 970a2e9864 net: ethernet: aquantia: Vector operations
Add functions to manululate the vector of receive and transmit rings.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel.Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:40 -05:00
David VomLehn bab6de8fd1 net: ethernet: aquantia: Atlantic A0 and B0 specific functions.
Add Atlantic A0 and B0 specific functions.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:39 -05:00
David VomLehn 97bde5c4f9 net: ethernet: aquantia: Support for NIC-specific code
Add support for code specific to the Atlantic NIC.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:39 -05:00
David VomLehn ef8115356a net: ethernet: aquantia: Low-level hardware interfaces
Add definitions of functions that interface directly with the hardware.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel.Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:39 -05:00
David VomLehn 018423e90b net: ethernet: aquantia: Add ring support code
Add code to support the transmit and receive ring buffers.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:39 -05:00
David VomLehn 3a35780f31 net: ethernet: aquantia: Common functions and definitions
Add files containing the functions and definitions used in common in
different functional areas.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:38 -05:00
David VomLehn 5015024ddf net: ethernet: aquantia: Make and configuration files.
Patches to create the make and configuration files.

Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:03:38 -05:00
Florian Fainelli 82272db84d net: dsa: Drop WARN() in tag_brcm.c
We may be able to see invalid Broadcom tags when the hardware and drivers are
misconfigured, or just while exercising the error path. Instead of flooding
the console with messages, flat out drop the packet.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 15:00:22 -05:00
Stephen Boyd 3ebe8344eb net: ks8851: Drop eeprom_size structure member
After commit 51b7b1c34e (KSZ8851-SNL: Add ethtool support for
EEPROM via eeprom_93cx6, 2011-11-21) this structure member is
unused. Delete it.

Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:56:44 -05:00
David S. Miller bef4e179b0 Merge branch 'bpf-misc'
Daniel Borkmann says:

====================
Misc BPF improvements

This series adds various misc improvements to BPF, f.e. allowing
skb_load_bytes() helper to be used with filter/reuseport programs
to facilitate programming, test cases for program tag, etc. For
details, please see individual patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:46:07 -05:00
Daniel Borkmann 3fadc80115 bpf: enable verifier to better track const alu ops
William reported couple of issues in relation to direct packet
access. Typical scheme is to check for data + [off] <= data_end,
where [off] can be either immediate or coming from a tracked
register that contains an immediate, depending on the branch, we
can then access the data. However, in case of calculating [off]
for either the mentioned test itself or for access after the test
in a more "complex" way, then the verifier will stop tracking the
CONST_IMM marked register and will mark it as UNKNOWN_VALUE one.

Adding that UNKNOWN_VALUE typed register to a pkt() marked
register, the verifier then bails out in check_packet_ptr_add()
as it finds the registers imm value below 48. In the first below
example, that is due to evaluate_reg_imm_alu() not handling right
shifts and thus marking the register as UNKNOWN_VALUE via helper
__mark_reg_unknown_value() that resets imm to 0.

In the second case the same happens at the time when r4 is set
to r4 &= r5, where it transitions to UNKNOWN_VALUE from
evaluate_reg_imm_alu(). Later on r4 we shift right by 3 inside
evaluate_reg_alu(), where the register's imm turns into 3. That
is, for registers with type UNKNOWN_VALUE, imm of 0 means that
we don't know what value the register has, and for imm > 0 it
means that the value has [imm] upper zero bits. F.e. when shifting
an UNKNOWN_VALUE register by 3 to the right, no matter what value
it had, we know that the 3 upper most bits must be zero now.
This is to make sure that ALU operations with unknown registers
don't overflow. Meaning, once we know that we have more than 48
upper zero bits, or, in other words cannot go beyond 0xffff offset
with ALU ops, such an addition will track the target register
as a new pkt() register with a new id, but 0 offset and 0 range,
so for that a new data/data_end test will be required. Is the source
register a CONST_IMM one that is to be added to the pkt() register,
or the source instruction is an add instruction with immediate
value, then it will get added if it stays within max 0xffff bounds.
>From there, pkt() type, can be accessed should reg->off + imm be
within the access range of pkt().

  [...]
  from 28 to 30: R0=imm1,min_value=1,max_value=1
    R1=pkt(id=0,off=0,r=22) R2=pkt_end
    R3=imm144,min_value=144,max_value=144
    R4=imm0,min_value=0,max_value=0
    R5=inv48,min_value=2054,max_value=2054 R10=fp
  30: (bf) r5 = r3
  31: (07) r5 += 23
  32: (77) r5 >>= 3
  33: (bf) r6 = r1
  34: (0f) r6 += r5
  cannot add integer value with 0 upper zero bits to ptr_to_packet

  [...]
  from 52 to 80: R0=imm1,min_value=1,max_value=1
    R1=pkt(id=0,off=0,r=34) R2=pkt_end R3=inv
    R4=imm272 R5=inv56,min_value=17,max_value=17
    R6=pkt(id=0,off=26,r=34) R10=fp
  80: (07) r4 += 71
  81: (18) r5 = 0xfffffff8
  83: (5f) r4 &= r5
  84: (77) r4 >>= 3
  85: (0f) r1 += r4
  cannot add integer value with 3 upper zero bits to ptr_to_packet

Thus to get above use-cases working, evaluate_reg_imm_alu() has
been extended for further ALU ops. This is fine, because we only
operate strictly within realm of CONST_IMM types, so here we don't
care about overflows as they will happen in the simulated but also
real execution and interaction with pkt() in check_packet_ptr_add()
will check actual imm value once added to pkt(), but it's irrelevant
before.

With regards to 06c1c04972 ("bpf: allow helpers access to variable
memory") that works on UNKNOWN_VALUE registers, the verifier becomes
now a bit smarter as it can better resolve ALU ops, so we need to
adapt two test cases there, as min/max bound tracking only becomes
necessary when registers were spilled to stack. So while mask was
set before to track upper bound for UNKNOWN_VALUE case, it's now
resolved directly as CONST_IMM, and such contructs are only necessary
when f.e. registers are spilled.

For commit 6b17387307 ("bpf: recognize 64bit immediate loads as
consts") that initially enabled dw load tracking only for nfp jit/
analyzer, I did couple of tests on large, complex programs and we
don't increase complexity badly (my tests were in ~3% range on avg).
I've added a couple of tests similar to affected code above, and
it works fine with verifier now.

Reported-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Gianluca Borello <g.borello@gmail.com>
Cc: William Tu <u9012063@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:46:06 -05:00
Daniel Borkmann 62b6466026 bpf: add prog tag test case to bpf selftests
Add the test case used to compare the results from fdinfo with
af_alg's output on the tag. Tests are from min to max sized
programs, with and without maps included.

  # ./test_tag
  test_tag: OK (40945 tests)

Tested on x86_64 and s390x.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:46:06 -05:00
Daniel Borkmann d1b662adcd bpf: allow option for setting bpf_l4_csum_replace from scratch
When programs need to calculate the csum from scratch for small UDP
packets and use bpf_l4_csum_replace() to feed the result from helpers
like bpf_csum_diff(), then we need a flag besides BPF_F_MARK_MANGLED_0
that would ignore the case of current csum being 0, and which would
still allow for the helper to set the csum and transform when needed
to CSUM_MANGLED_0.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:46:06 -05:00
Daniel Borkmann 2492d3b867 bpf: enable load bytes helper for filter/reuseport progs
BPF_PROG_TYPE_SOCKET_FILTER are used in various facilities such as
for SO_REUSEPORT and packet fanout demuxing, packet filtering, kcm,
etc, and yet the only facility they can use is BPF_LD with {BPF_ABS,
BPF_IND} for single byte/half/word access.

Direct packet access is only restricted to tc programs right now,
but we can still facilitate usage by allowing skb_load_bytes() helper
added back then in 05c74e5e53 ("bpf: add bpf_skb_load_bytes helper")
that calls skb_header_pointer() similarly to bpf_load_pointer(), but
for stack buffers with larger access size.

Name the previous sk_filter_func_proto() as bpf_base_func_proto()
since this is used everywhere else as well, similarly for the ctx
converter, that is, bpf_convert_ctx_access().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:46:05 -05:00
Daniel Borkmann 4faf940dd8 bpf: simplify __is_valid_access test on cb
The __is_valid_access() test for cb[] from 62c7989b24 ("bpf: allow
b/h/w/dw access for bpf's cb in ctx") was done unnecessarily complex,
we can just simplify it the same way as recent fix from 2d071c643f
("bpf, trace: make ctx access checks more robust") did. Overflow can
never happen as size is 1/2/4/8 depending on access.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:46:05 -05:00
Arnd Bergmann 187024144c phy: marvell: remove conflicting initializer
One line was apparently pasted incorrectly during a new feature patch:

drivers/net/phy/marvell.c:2090:15: error: initialized field overwritten [-Werror=override-init]
   .features = PHY_GBIT_FEATURES,

I'm removing the extraneous line here to avoid the W=1 warning and restore
the previous flags value, and I'm slightly reordering the lines for consistency
to make it less likely to happen again in the future. The ordering in the
array is still not the same as in the structure definition, instead I picked
the order that is most common in this file and that seems to make more sense
here.

Fixes: 0b04680fda ("phy: marvell: Add support for temperature sensor")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:08:46 -05:00
Phil Sutter e1636836e0 net: dummy: Introduce dummy virtual functions
The idea for this was born when testing VF support in iproute2 which was
impeded by hardware requirements. In fact, not every VF-capable hardware
driver implements all netdev ops, so testing the interface is still hard
to do even with a well-sorted hardware shelf.

To overcome this and allow for testing the user-kernel interface, this
patch allows to turn dummy into a PF with a configurable amount of VFs.

Since my patch series 'bus-agnostic-num-vf' has been accepted,
implementing the required interfaces is pretty straightforward: Iff
'num_vfs' module parameter was given a value >0, a dummy bus type is
being registered which implements the 'num_vf()' callback. Additionally,
a dummy parent device common to all dummy devices is registered which
sits on the above dummy bus.

Joint work with Sabrina Dubroca.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 14:07:22 -05:00
Philippe Reynes 8b86b2c1b8 net: broadcom: bnx2x: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 13:49:19 -05:00
David S. Miller 36f877804c Merge branch 'packet-sampling-offload'
Jiri Pirko says:

====================
Add support for offloading packet-sampling

Yotam says:

The first patch introduces the psample module, a netlink channel dedicated
to packet sampling implemented using generic netlink. This module provides
a generic way for kernel modules to sample packets, while not being tied
to any specific subsystem like NFLOG.

The second patch adds the sample tc action, which uses psample to randomly
sample packets that match a classifier. The user can configure the psample
group number, the sampling rate and the packet's truncation (to save
kernel-user traffic).

The last two patches add the support for offloading the matchall-sample
tc command in the mlxsw driver, for ingress qdiscs.

An example for psample usage can be found in the libpsample project at:
https://github.com/Mellanox/libpsample

v1->v2:
- Reword first patch's commit message
- Fix typo in comment in second patch
- Change order of tc_sample uapi enum to match convention
- Rename act_sample action callback tcf_sample -> tcf_sample_act
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24 13:44:29 -05:00