Alternatively one could free the skb, OTOH I don't think this test is
useful so just remove it.
Cc: <linux-rdma@vger.kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Alexei Starovoitov says:
====================
bpf: fix several bugs
First two patches address bugs found by Jann Horn.
Last patch is a minor samples fix spotted during the testing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
llvm cannot always recognize memset as builtin function and optimize
it away, so just delete it. It was a leftover from testing
of bpf_perf_event_output() with large data structures.
Fixes: 39111695b1 ("samples: bpf: add bpf_perf_event_output example")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 35578d7984 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
introduced clever way to check bpf_helper<->map_type compatibility.
Later on commit a43eec3042 ("bpf: introduce bpf_perf_event_output() helper") adjusted
the logic and inadvertently broke it.
Get rid of the clever bool compare and go back to two-way check
from map and from helper perspective.
Fixes: a43eec3042 ("bpf: introduce bpf_perf_event_output() helper")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a system with >32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK,
the malicious application may overflow 32-bit bpf program refcnt.
It's also possible to overflow map refcnt on 1Tb system.
Impose 32k hard limit which means that the same bpf program or
map cannot be shared by more than 32k processes.
Fixes: 1be7f75d16 ("bpf: enable non-root eBPF programs")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Rivshin says:
====================
drivers: net: cpsw: phy-handle fixes
This series fixes a number of related issues around using phy-handle
properties in cpsw emac nodes.
Patch 1 fixes a bug if more than one slave is used, and either
slave uses the phy-handle property in the devicetree.
Patch 2 fixes a NULL pointer dereference which can occur if a
phy-handle property is used and of_phy_connect() return NULL,
such as with a bad devicetree.
Patch 3 fixes an issue where the phy-mode property would be ignored
if a phy-handle property was used. This also fixes a bogus error
message that would be emitted.
Patch 4 fixes makes the binding documentation more explicit that
exactly one PHY property should be used, and also marks phy_id as
deprecated.
Patch 5 cleans up the fixed-link case to work like the now-fixed
phy-handle case.
I have tested on the following hardware configurations:
- (EVMSK) dual emac, phy_id property in both slaves
- (EVMSK) dual emac, phy-handle property in both slaves
- (EVMSK) a bad phy-handle property pointing to &mmc1
- (EVMSK) phy_id property with incorrect PHY address
- (BeagleBoneBlack) single emac, phy_id property
- (custom) single emac, fixed-link subnode
Andrew Goodbody reported testing v2 on a board that doesn't use
dual_emac mode, but with 2 PHYs using phy-handle properties [1].
Nicolas Chauvet reported testing v2 on an HP t410 (dm8148).
Markus Brunner reported testing v1 on the following [2]:
- emac0 with phy_id and emac1 with fixed phy
- emac0 with phy-handle and emac1 with fixed phy
- emac0 with fixed phy and emac1 with fixed phy
[1] https://lkml.org/lkml/2016/4/22/537
[2] http://www.spinics.net/lists/netdev/msg357890.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a fixed-link DT subnode is used, the phy_device was looked up so
that a PHY ID string could be constructed and passed to phy_connect().
This is not necessary, as the device_node can be passed directly to
of_phy_connect() instead. This reuses the same codepath as if the
phy-handle DT property was used.
Signed-off-by: David Rivshin <drivshin@allworx.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy-handle, phy_id, and fixed-link properties are mutually exclusive,
and only one need be specified. Make this clear in the binding doc.
Also mark the phy_id property as deprecated, as phy-handle should be
used instead.
Signed-off-by: David Rivshin <drivshin@allworx.com>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy-mode emac property was only being processed in the phy_id
or fixed-link cases. However if phy-handle was specified instead,
an error message would complain about the lack of phy_id or
fixed-link, and then jump past the of_get_phy_mode(). This would
result in the PHY mode defaulting to MII, regardless of what the
devicetree specified.
Fixes: 9e42f71526 ("drivers: net: cpsw: add phy-handle parsing")
Signed-off-by: David Rivshin <drivshin@allworx.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an emac node has a phy-handle property that points to something
which is not a phy, then a segmentation fault will occur when the
interface is brought up. This is because while phy_connect() will
return ERR_PTR() on failure, of_phy_connect() will return NULL.
The common error check uses IS_ERR(), and so missed when
of_phy_connect() fails. The NULL pointer is then dereferenced.
Also, the common error message referenced slave->data->phy_id,
which would be empty in the case of phy-handle. Instead, use the
name of the device_node as a useful identifier. And in the phy_id
case add the error code for completeness.
Fixes: 9e42f71526 ("drivers: net: cpsw: add phy-handle parsing")
Signed-off-by: David Rivshin <drivshin@allworx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9e42f71526 ("drivers: net: cpsw: add
phy-handle parsing") saved the "phy-handle" phandle into a new cpsw_priv
field. However, phy connections are per-slave, so the phy_node field should
be in cpsw_slave_data rather than cpsw_priv.
This would go unnoticed in a single emac configuration. But in dual_emac
mode, the last "phy-handle" property parsed for either slave would be used
by both of them, causing them both to refer to the same phy_device.
Fixes: 9e42f71526 ("drivers: net: cpsw: add phy-handle parsing")
Signed-off-by: David Rivshin <drivshin@allworx.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When newlink creation fails at device-registration, the port->count
is decremented twice. Francesco Ruggeri (fruggeri@arista.com) found
this issue in Macvlan and the same exists in IPvlan driver too.
While fixing this issue I noticed another issue of missing unregister
in case of failure, so adding it to the fix which is similar to the
macvlan fix by Francesco in commit 3083796075 ("macvlan: fix failure
during registration v3")
Reported-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Eric Dumazet <edumazet@google.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pch_gbe_tx_ring.tx_lock is only used in the hard_xmit handler and
in the transmit completion reaper called from NAPI context.
Compile-tested only. Potential victims Cced.
Someone more knowledgeable may check if pch_gbe_tx_queue could
have some use for a mmiowb.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Andy Cress <andy.cress@us.kontron.com>
Cc: bryan@fossetcon.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch overloads the DSA master netdev, aka CPU Ethernet MAC to also
include switch-side statistics, which is useful for debugging purposes,
when the switch is not properly connected to the Ethernet MAC (duplex
mismatch, (RG)MII electrical issues etc.).
We accomplish this by retaining the original copy of the master netdev's
ethtool_ops, and just overload the 3 operations we care about:
get_sset_count, get_strings and get_ethtool_stats so as to intercept
these calls and call into the original master_netdev ethtool_ops, plus
our own.
We take this approach as opposed to providing a set of DSA helper
functions that would retrive the CPU port's statistics, because the
entire purpose of DSA is to allow unmodified Ethernet MAC drivers to be
used as CPU conduit interfaces, therefore, statistics overlay in such
drivers would simply not scale.
The new ethtool -S <iface> output would therefore look like this now:
<iface> statistics
p<2 digits cpu port number>_<switch MIB counter names>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP prequeue goal is to defer processing of incoming packets
to user space thread currently blocked in a recvmsg() system call.
Intent is to spend less time processing these packets on behalf
of softirq handler, as softirq handler is unfair to normal process
scheduler decisions, as it might interrupt threads that do not
even use networking.
Current prequeue implementation has following issues :
1) It only checks size of the prequeue against sk_rcvbuf
It was fine 15 years ago when sk_rcvbuf was in the 64KB vicinity.
But we now have ~8MB values to cope with modern networking needs.
We have to add sk_rmem_alloc in the equation, since out of order
packets can definitely use up to sk_rcvbuf memory themselves.
2) Even with a fixed memory truesize check, prequeue can be filled
by thousands of packets. When prequeue needs to be flushed, either
from sofirq context (in tcp_prequeue() or timer code), or process
context (in tcp_prequeue_process()), this adds a latency spike
which is often not desirable.
I added a fixed limit of 32 packets, as this translated to a max
flush time of 60 us on my test hosts.
Also note that all packets in prequeue are not accounted for tcp_mem,
since they are not charged against sk_forward_alloc at this point.
This is probably not a big deal.
Note that this might increase LINUX_MIB_TCPPREQUEUEDROPPED counts,
which is misnamed, as packets are not dropped at all, but rather pushed
to the stack (where they can be either consumed or dropped)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The collect metadata mode does not support GUE nor FOU. This might be
implemented later; until then, we should reject such config.
I think this is okay to be changed. It's unlikely anyone has such
configuration (as it doesn't work anyway) and we may need a way to
distinguish whether it's supported or not by the kernel later.
For backwards compatibility with iproute2, it's not possible to just check
the attribute presence (iproute2 always includes the attribute), the actual
value has to be checked, too.
Fixes: 2e15ea390e ("ip_gre: Add support to collect tunnel metadata.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petko Manolov says:
====================
pegasus: correct buffer & packet sizes
As noticed by Lincoln Ramsay <a1291762@gmail.com> some old (usb 1.1) Pegasus
based devices may actually return more bytes than the specified in the datasheet
amount. That would not be a problem if the allocated space for the SKB was
equal to the parameter passed to usb_fill_bulk_urb(). Some poor bugger (i
really hope it was not me, but 'git blame' is useless in this case, so anyway)
decided to add '+ 8' to the buffer length parameter. Sometimes the usb transfer
overflows and corrupts the socket structure, leading to kernel panic.
The above doesn't seem to happen for newer (Pegasus2 based) devices which did
help this bug to hide for so long.
The new default is to not include the CRC at the end of each received package.
So far CRC has been ignored which makes no sense to do it in a first place.
The patch is against v4.6-rc5 and was tested on ADM8515 device by transferring
multiple gigabytes of data over a couple of days without any complaints from the
kernel. Please apply it to whatever net tree you deem fit.
Changes since v1:
- split the patch in two parts;
- corrected the subject lines;
Changes since v2:
- do not append CRC by default (based on a discussion with Johannes Berg);
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The default Pegasus setup was to append the status and CRC at the end of each
received packet. The status bits are used to update various stats, but CRC has
been ignored. The new default is to not append CRC at the end of RX packets.
Signed-off-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
usb_fill_bulk_urb() receives buffer length parameter 8 bytes larger
than what's allocated by alloc_skb(); This seems to be a problem with
older (pegasus usb-1.1) devices, which may silently return more data
than the maximal packet length.
Reported-by: Lincoln Ramsay <a1291762@gmail.com>
Signed-off-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mac80211 (which will be the first user of the
fq.h) recently started to support software A-MSDU
aggregation. It glues skbuffs together into a
single one so the backlog accounting needs to be
more fine-grained.
To avoid backlog sorting logic duplication split
it up for re-use.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc says:
====================
gre: fix lwtunnel support
This patchset fixes a few bugs in ipgre metadata mode implementation.
As an example, in this setup:
ip a a 192.168.1.1/24 dev eth0
ip l a gre1 type gre external
ip l s gre1 up
ip a a 192.168.99.1/24 dev gre1
ip r a 192.168.99.2/32 encap ip dst 192.168.1.2 ttl 10 dev gre1
ping 192.168.99.2
the traffic does not go through before this patchset and does as expected
with it applied.
v3: Back to v1 in order not to break existing users. Dropped patch 3, will
be fixed in iproute2 instead.
v2: Rejecting invalid configuration, added patch 3, dropped patch for
ETH_P_TEB (will target net-next).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In ipgre (i.e. not gretap) + collect metadata mode, the skb was assumed to
contain Ethernet header and was encapsulated as ETH_P_TEB. This is not the
case, the interface is ARPHRD_IPGRE and the protocol to be used for
encapsulation is skb->protocol.
Fixes: 2e15ea390e ("ip_gre: Add support to collect tunnel metadata.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ipgre mode (i.e. not gretap) with collect metadata flag set, the tunnel
is incorrectly assumed to be mGRE in NBMA mode (see commit 6a5f44d7a0).
This is not the case, we're controlling the encapsulation addresses by
lwtunnel metadata. And anyway, assigning dev->header_ops in collect metadata
mode does not make sense.
Although it would be more user firendly to reject requests that specify
both the collect metadata flag and a remote/local IP address, this would
break current users of gretap or introduce ugly code and differences in
handling ipgre and gretap configuration. Keep the current behavior of
remote/local IP address being ignored in such case.
v3: Back to v1, added explanation paragraph.
v2: Reject configuration specifying both remote/local address and collect
metadata flag.
Fixes: 2e15ea390e ("ip_gre: Add support to collect tunnel metadata.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(root user triggerable) error case.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJXIH65AAoJEGt7eEactAAdZmwP/R2UAHltBlYhCEMqcM+8VhPD
VDB3LFTYhOVUtVfFwqAzEoxPDjnGyGgZcjO5RxyCZLokm71KbbHAp3h3GnCVQCHd
dnRej6RD+Kl6n0EoTPCLy7ZAjSjpGBWOTy6MEgrAQnTtL+Q7nUch+z5DXIafTg/w
MOYke/WfD1jHbq2eGHu6HkbY3IUwoSKaEoA8qN20ieJRU7jsaG29RiAvBot2IVTI
g3hTL4FPzwSL5XM0qkoxDLPYA5Mo36Cb5sZ9AjkQCaqP/EemOoFxILGWUyi+17nd
zdF3zZB9lj+CdR+0IbjTjz8b457u1g/JW4dLl+iRqv7clynm3gmz7LivVhBcHogx
usg0hW9tDeZ5wzHj8v+e+C+RqyxtgHxVvYtt8Jh6bTqS8aMO8hor7qPFOcpJPmyz
ZbXThJnsvfaYoWAcvIXUa3Q2kwz2myVLDhlQBgwSi5TzgTDqb2GlYnGhvKOnB5cz
6JL3mZt2vi0Yvb7Lk9YzeEYs5cZq4DFVfx3nHgaVwDZ5GPoUTgIjgSnVLFC/J80a
r09Wmigtjy5qftw6w4pSUcf/Fj4L0BD+GhAZ+Hs5xhjUnlAxTlHxDW0bJ/kMmFzy
9B91YSswBc+3IGdjnsN+bNZ6T0XuvapOLRkC1V8fJDrtyy1Tel/LL9Giy/V13XLr
8vgETcgFJyQ3jnkzyRdr
=EWBn
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2016-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just a single fix, for a per-CPU memory leak in a
(root user triggerable) error case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This is never called with a NULL "buf" and anyway, we dereference 's' on
the lines before so it would Oops before we reach the check.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 13a56b44 ("at803x: Add support for hardware reset") added a
work-around for a hardware bug on the AT8030. However, the work-around
was being called for all 803x PHYs, even those that don't need it.
Function at803x_link_change_notify() checks to make sure that it only
resets the PHY on the 8030, but it makes more sense to not call that
function at all if it isn't needed.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct mlx5e_channel_param is a large structure that is allocated
on the stack of mlx5e_open_channels, and with a recent change
it has grown beyond the warning size for the maximum stack
that a single function should use:
mellanox/mlx5/core/en_main.c: In function 'mlx5e_open_channels':
mellanox/mlx5/core/en_main.c:1325:1: error: the frame size of 1072 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
The function is already using dynamic allocation and is not in
a fast path, so the easiest workaround is to use another kzalloc
for allocating the channel parameters.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: d3c9bc2743 ("net/mlx5e: Added ICO SQs")
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) check skb size to avoid reading beyond its border when delivering
payloads, by Sven Eckelmann
2) initialize last_seen time in neigh_node object to prevent cleanup
routine from accidentally purge it, by Marek Lindner
3) release "recently added" slave interfaces upon virtual/batman
interface shutdown, by Sven Eckelmann
4) properly decrease router object reference counter upon routing table
update, by Sven Eckelmann
5) release queue slots when purging OGM packets of deactivating slave
interface, by Linus Lüssing
Patch 2 and 3 have no "Fixes:" tag because the offending commits date
back to when batman-adv was not yet officially in the net tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXHt+MAAoJEJ4aZjxxc6bKMEgP/1DZWgQpHs5IM8yW7IQx8CQO
iMkpwfnJcRSOnADC/Z2GtIcz1Df2r+NZcqf5xMMF2CL0xlks024qTHoqeV7Poyel
DmzzETbQFWgdFD22RI70h25T4Yb400PP0saL2TbcVec6CiM57YN3cPbhjZvqzN32
bCIa38kwAGXvNqRzcy5WjDF/rllAoJZ0s055z+kY8WuVOmvOEor+FDmWFr0D8ioP
/utVP9ACA3YHZ39DMDFDsyBp6nMZOgHjpJVfmcubFULHmKvYQ0zMpgX19IVoMsJ6
HEtz9fKN4KPgAFbbPcU0GLg4srsNFmEbTB7Bqhqods+ZYN60M4Z0kexqYz1XuItH
atISvCIe14xHdT6gW32N707yK30DxUKIEpEg5wMXhE+1m041NfrfrcvaEXSLco6d
txsQzd1R4T5ry3V1YXv4znSVPmHvd84ykKrklQZgPA09QIPCCDb7Olp8Lj6mMsmc
OuEYLOfAoOD/KZcRUzY6kWpMRfOJLLXUgwcfSEES8MCaaBGD91YZyrSuHvixmo5V
24JTp0D/X/rkkQjI3a2Pf0dhvdGHAk1g6mElddo86a0UpRbm3qshquAPf+U8QcU0
Kt4rpN9dtOA8yTpnvxG2r04T32yzQQQNIqRGEnugokaJUECFF0mugxENTqcw2vux
uNxMhl36A21czA9s/Iu7
=o059
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Antonio Quartulli says:
====================
In this patchset you can find the following fixes:
1) check skb size to avoid reading beyond its border when delivering
payloads, by Sven Eckelmann
2) initialize last_seen time in neigh_node object to prevent cleanup
routine from accidentally purge it, by Marek Lindner
3) release "recently added" slave interfaces upon virtual/batman
interface shutdown, by Sven Eckelmann
4) properly decrease router object reference counter upon routing table
update, by Sven Eckelmann
5) release queue slots when purging OGM packets of deactivating slave
interface, by Linus Lüssing
Patch 2 and 3 have no "Fixes:" tag because the offending commits date
back to when batman-adv was not yet officially in the net tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to calculate rps hash if it was not enabled. So this
patch export rps_needed and check it before trying to get rps
hash. Tests (using pktgen to inject packets to guest) shows this can
improve pps about 13% (when rps is disabled).
Before:
~1150000 pps
After:
~1300000 pps
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
----
Changes from V1:
- Fix build when CONFIG_RPS is not set
Signed-off-by: David S. Miller <davem@davemloft.net>
The size allocated for target->hwinfo and the number of bytes copied in it
should be consistent.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
At forced 100 Full & Half duplex mode, chip may fail to set mode correctly
when cable is switched between long(~50+m) and short one.
As workaround, set to 10 before setting to 100 at forced 100 F/H mode.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix rx_bytes, tx_bytes and tx_frames error in netdev.stats.
- rx_bytes counted bytes excluding size of struct ethhdr.
- tx_packets didn't count multiple packets in a single urb
- tx_bytes included 8 bytes of extra commands.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error return err is not initialized and there is a possibility
that err is not assigned causing mv88e6xxx_port_bridge_join to
return a garbage error return status. Fix this by initializing err
to 0.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin KaFai Lau says:
====================
tcp: Make use of MSG_EOR in tcp_sendmsg
v4:
~ Do not set eor bit in do_tcp_sendpages() since there is
no way to pass MSG_EOR from the userland now.
~ Avoid rmw by testing MSG_EOR first in tcp_sendmsg().
~ Move TCP_SKB_CB(skb)->eor test to a new helper
tcp_skb_can_collapse_to() (suggested by Soheil).
~ Add some packetdrill tests.
v3:
~ Separate EOR marking from the SKBTX_ANY_TSTAMP logic.
~ Move the eor bit test back to the loop in tcp_sendmsg and
tcp_sendpage because there could be >1 threads doing
sendmsg.
~ Thanks to Eric Dumazet's suggestions on v2.
~ The TCP timestamp bug fixes are separated into other threads.
v2:
~ Rework based on the recent work
"add TX timestamping via cmsg" by
Soheil Hassas Yeganeh <soheil.kdev@gmail.com>
~ This version takes the MSG_EOR bit as a signal of
end-of-response-message and leave the selective
timestamping job to the cmsg
~ Changes based on the v1 feedback (like avoid
unlikely check in a loop and adding tcp_sendpage
support)
~ The first 3 patches are bug fixes. The fixes in this
series depend on the newly introduced txstamp_ack in
net-next. I will make relevant patches against net after
getting some feedback.
~ The test results are based on the recently posted net fix:
"tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks"
One potential use case is to use MSG_EOR with
SOF_TIMESTAMPING_TX_ACK to get a more accurate
TCP ack timestamping on application protocol with
multiple outgoing response messages (e.g. HTTP2).
One of our use case is at the webserver. The webserver tracks
the HTTP2 response latency by measuring when the webserver sends
the first byte to the socket till the TCP ACK of the last byte
is received. In the cases where we don't have client side
measurement, measuring from the server side is the only option.
In the cases we have the client side measurement, the server side
data can also be used to justify/cross-check-with the client
side data.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds an eor bit to the TCP_SKB_CB. When MSG_EOR
is passed to tcp_sendmsg, the eor bit will be set at the skb
containing the last byte of the userland's msg. The eor bit
will prevent data from appending to that skb in the future.
The change in do_tcp_sendpages is to honor the eor set
during the previous tcp_sendmsg(MSG_EOR) call.
This patch handles the tcp_sendmsg case. The followup patches
will handle other skb coalescing and fragment cases.
One potential use case is to use MSG_EOR with
SOF_TIMESTAMPING_TX_ACK to get a more accurate
TCP ack timestamping on application protocol with
multiple outgoing response messages (e.g. HTTP2).
Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
0.200 write(4, ..., 14600) = 14600
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 > . 1:7301(7300) ack 1
0.200 > P. 7301:14601(7300) ack 1
0.300 < . 1:1(0) ack 14601 win 257
0.300 > P. 14601:15331(730) ack 1
0.300 > P. 15331:16061(730) ack 1
0.400 < . 1:1(0) ack 16061 win 257
0.400 close(4) = 0
0.400 > F. 16061:16061(0) ack 1
0.400 < F. 1:1(0) ack 16062 win 257
0.400 > . 16062:16062(0) ack 2
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Soheil Hassas Yeganeh says:
====================
tcp: simplify ack tx timestamps
v2:
- Fully remove SKBTX_ACK_TSTAMP, as suggested by Willem de Bruijn.
This patch series aims at removing redundant checks and fields
for ack timestamps for TCP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The SKBTX_ACK_TSTAMP flag is set in skb_shinfo->tx_flags when
the timestamp of the TCP acknowledgement should be reported on
error queue. Since accessing skb_shinfo is likely to incur a
cache-line miss at the time of receiving the ack, the
txstamp_ack bit was added in tcp_skb_cb, which is set iff
the SKBTX_ACK_TSTAMP flag is set for an skb. This makes
SKBTX_ACK_TSTAMP flag redundant.
Remove the SKBTX_ACK_TSTAMP and instead use the txstamp_ack bit
everywhere.
Note that this frees one bit in shinfo->tx_flags.
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the redundant check for sk->sk_tsflags in tcp_tx_timestamp.
tcp_tx_timestamp() receives the tsflags as a parameter. As a
result the "sk->sk_tsflags || tsflags" is redundant, since
tsflags already includes sk->sk_tsflags plus overrides from
control messages.
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix casting in net_gso_ok. Otherwise the shift on
gso_type << NETIF_F_GSO_SHIFT may hit the 32th bit and make it look like
a INT_MIN, which is then promoted from signed to uint64 which is
0xffffffff80000000, resulting in wrong behavior when it is and'ed with
the feature itself, as in:
This test app:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char **argv)
{
uint64_t feature1;
uint64_t feature2;
int gso_type = 1 << 15;
feature1 = gso_type << 16;
feature2 = (uint64_t)gso_type << 16;
printf("%lx %lx\n", feature1, feature2);
return 0;
}
Gives:
ffffffff80000000 80000000
So that this:
return (features & feature) == feature;
Actually works on more bits than expected and invalid ones.
Fix is to promote it earlier.
Issue noted while rebasing SCTP GSO patch but posting separetely as
someone else may experience this meanwhile.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When configured in fixed link, the DaVinci emac driver sets the
priv->phydev to NULL and further ioctl calls to the phy_mii_ioctl()
causes the kernel to crash.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Fixes: 1bb6aa56bb ("net: davinci_emac: Add support for fixed-link PHY")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ath9k
* fix a couple release old throughput regression on ar9281
iwlwifi
* add new device IDs for 8265
* fix a NULL pointer dereference when paging firmware asserts
* remove a WARNING on gscan capabilities
* fix MODULE_FIRMWARE for 8260
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJXHjlZAAoJEG4XJFUm622brcgIAJ127uU9S/X3oM2EK76TdbV+
yZnp8Te57bK+ypXRP7YMaL7gevR3q9TEzQkY6ZQ0iMCZenQouxB/Wg9xBZjJ/ny0
dlVtmwM2sSvYdieTSftKGWLg3FH0FaJ1VXCJirLwrbs4Hbi2FqcGKMrLxflpqgW2
m1FkTZzCbjokaHZgoWJmlDM/UbPtTz8BJgLICRd7hJ4jJ2Mi2k32LCSxsXn0GbUG
naS6bDUGBcKjly9PKomY9y+rj897lm/m3rGKFJPkX/T7Klb1UMBLCj8NQIyF+/AX
vu1fLxMwlFyXnmRJAFPyj9T2Vok3aDjRuA9xck3MufxOFjOhEGJiPjZV3wNNFj8=
=aDY4
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-for-davem-2016-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.6
ath9k
* fix a couple release old throughput regression on ar9281
iwlwifi
* add new device IDs for 8265
* fix a NULL pointer dereference when paging firmware asserts
* remove a WARNING on gscan capabilities
* fix MODULE_FIRMWARE for 8260
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add myself and Edward Cree as maintainers.
Remove Shradha Shah, who is on extended leave.
Cc: David S. Miller <davem@davemloft.net>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Shradha Shah <sshah@solarflare.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When certain firmware variants are selected (via the sfboot utility) the
SFC7000 and SFC8000 series NICs don't support RSS. The driver still
tries (and fails) to insert filters with the RSS flag, and the NIC fails
to pass traffic.
When the firmware reports RSS_LIMITED suppress allocating a default RSS
context. The absence of an RSS context is picked up in filter insertion
and RSS flags are discarded.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
napi_disable() can not be called with bh disabled, move locking just
around myri10ge_ss_lock_napi() .
Patches fixes following bug:
[ 114.278378] BUG: sleeping function called from invalid context at net/core/dev.c:4383
<snip>
[ 114.313712] Call Trace:
[ 114.314943] [<ffffffff817010ce>] dump_stack+0x19/0x1b
[ 114.317673] [<ffffffff810ce7f3>] __might_sleep+0x173/0x230
[ 114.320566] [<ffffffff815b3117>] napi_disable+0x27/0x90
[ 114.323254] [<ffffffffa01e437f>] myri10ge_close+0xbf/0x3f0 [myri10ge]
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Hyong-Youb Kim <hykim@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before
the division, potentially making the pitch larger than it should
be.
Since the original intention is to do a div-round-up, just use
the macro instead.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Instead of calling vmw_cmd_ok, call vmw_cmd_dx_cid_check to
validate the context id for query commands.
Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>