Граф коммитов

636794 Коммитов

Автор SHA1 Сообщение Дата
Jérémy Lefaure e1465d125d mm, thp: propagation of conditional compilation in khugepaged.c
Commit b46e756f5e ("thp: extract khugepaged from mm/huge_memory.c")
moved code from huge_memory.c to khugepaged.c.  Some of this code should
be compiled only when CONFIG_SYSFS is enabled but the condition around
this code was not moved into khugepaged.c.

The result is a compilation error when CONFIG_SYSFS is disabled:

  mm/built-in.o: In function `khugepaged_defrag_store': khugepaged.c:(.text+0x2d095): undefined reference to `single_hugepage_flag_store'
  mm/built-in.o: In function `khugepaged_defrag_show': khugepaged.c:(.text+0x2d0ab): undefined reference to `single_hugepage_flag_show'

This commit adds the #ifdef CONFIG_SYSFS around the code related to
sysfs.

Link: http://lkml.kernel.org/r/20161114203448.24197-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 16:32:52 -08:00
Linus Torvalds f513581c35 Two small fixes for MIPI PLLs on sunxi devices and a build fix
for a Broadcom clk driver having unmet dependencies.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJYNhi4AAoJEK0CiJfG5JUlRc8P+wYZRyhm/rCVpMnqCk/2ZaZU
 Oe9hJQe6x85LEeh3+Q/IdgxGYQ3nDklNRtmFQJ62wkCv7WrrIXO7zLvaTJ2JjsGd
 RU8tZTc9v1IsBhZhKlqv8fx3lvTCGf8aCeHg/vGczDfbkwHEwdsgFjwD9115WTdR
 6UenybklZmU+p9k0Th/DrqYN4xVIlbr+So9dy+KbaRdM9ZUvg3pTb2wX3KH7V0qG
 Wytz90FQ9WZChjxR8kQRLyBsJXtgNGgLDe7KTraYUizSE0Do/oVGM/FhQWEWnmJ4
 i1WgkS5Bve9AoOdWWkLPlSz295OwAbW2uQ+U8KYEYsalT2eSM/ppTnX0Mo5CopKh
 vwSjNL/kWTAtwc5majTf8fuDworX6QhCHo9FseF38D/xDYPSiXZsaCYCENY7Hjpt
 ggiFe6KkdSWgMPcns+vvgRRVcNQT+I2kKPUGN2IDfu5r/brHWJvbstnqsmHq75gB
 qlNIVcq5o69B4jr6Qumh/unI9JMER/zo7m9PLVG6LVGU0SqumzAU3XDN4Ifk5dea
 /xsQF1H6M9d5vulEAqToxG5dl3dmIXndLnAYuHHQoUzVsFSxfLfTKxlksZsOgyza
 xdbpzQ3NNZAvfTTxjpsNKDlcrik4Je9bbsXypsvt3QvpLGAz18yirxAr00pmvdbV
 R6RO12nye3q94+8u4uPt
 =a/l6
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Two small fixes for MIPI PLLs on sunxi devices and a build fix for a
  Broadcom clk driver having unmet dependencies"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: bcm: Fix unmet Kconfig dependencies for CLK_BCM_63XX
  clk: sunxi-ng: enable so-said LDOs for A33 SoC's pll-mipi clock
  clk: sunxi-ng: sun6i-a31: Enable PLL-MIPI LDOs when ungating it
2016-11-30 15:15:49 -08:00
Jeremy Linton 4c9456df88 arm64: dts: juno: Correct PCI IO window
The PCIe root complex on Juno translates the MMIO mapped
at 0x5f800000 to the PIO address range starting at 0
(which is common because PIO addresses are generally < 64k).
Correct the DT to reflect this.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-11-30 23:49:16 +01:00
Jason Wang aa196eed3d macvtap: handle ubuf refcount correctly when meet errors
We trigger uarg->callback() immediately after we decide do datacopy
even if caller want to do zerocopy. This will cause the callback
(vhost_net_zerocopy_callback) decrease the refcount. But when we meet
an error afterwards, the error handling in vhost handle_tx() will try
to decrease it again. This is wrong and fix this by delay the
uarg->callback() until we're sure there's no errors.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 15:06:02 -05:00
Jason Wang af1cc7a2b8 tun: handle ubuf refcount correctly when meet errors
We trigger uarg->callback() immediately after we decide do datacopy
even if caller want to do zerocopy. This will cause the callback
(vhost_net_zerocopy_callback) decrease the refcount. But when we meet
an error afterwards, the error handling in vhost handle_tx() will try
to decrease it again. This is wrong and fix this by delay the
uarg->callback() until we're sure there's no errors.

Reported-by: wangyunjian <wangyunjian@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 15:06:01 -05:00
Gao Feng 8f679ed88f driver: ipvlan: Remove useless member mtu_adj of struct ipvl_dev
The mtu_adj is initialized to zero when alloc mem, there is no any
assignment to mtu_adj. It is only used in ipvlan_adjust_mtu as one
right value.
So it is useless member of struct ipvl_dev, then remove it.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 15:01:32 -05:00
Grygorii Strashko 4ccfd6383a net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during resume
netif_set_real_num_tx/rx_queues() are required to be called with rtnl_lock
taken, otherwise ASSERT_RTNL() warning will be triggered - which happens
now during System resume from suspend:
cpsw_resume()
|- cpsw_ndo_open()
  |- netif_set_real_num_tx/rx_queues()
     |- ASSERT_RTNL();

Hence, fix it by surrounding cpsw_ndo_open() by rtnl_lock/unlock() calls.

Cc: Dave Gerlach <d-gerlach@ti.com>
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Fixes: commit e05107e6b7 ("net: ethernet: ti: cpsw: add multi queue support")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:59:08 -05:00
Souptick Joarder fec668d36d ethernet :mellanox :mlx5: Replace pci_pool_alloc by pci_pool_zalloc
In alloc_cmd_box(), pci_pool_alloc() followed by memset will be
replaced by pci_pool_zalloc()

Signed-off-by: Souptick joarder <jrdr.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:56:37 -05:00
Souptick Joarder 77d1337bf6 ethernet :mellanox :mlx4: Replace pci_pool_alloc by pci_pool_zalloc
In mlx4_alloc_cmd_mailbox(), pci_pool_alloc() followed by memset will be
replaced by pci_pool_zalloc()

Signed-off-by: Souptick joarder <jrdr.linux@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:56:36 -05:00
Lorenzo Colitti d109e61bfe net: ipv4: Don't crash if passing a null sk to ip_rt_update_pmtu.
Commit e2d118a1cb ("net: inet: Support UID-based routing in IP
protocols.") made __build_flow_key call sock_net(sk) to determine
the network namespace of the passed-in socket. This crashes if sk
is NULL.

Fix this by getting the network namespace from the skb instead.

Fixes: e2d118a1cb ("net: inet: Support UID-based routing in IP protocols.")
Reported-by: Erez Shitrit <erezsh@dev.mellanox.co.il>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:53:59 -05:00
Linus Torvalds e2b588ab60 pwm: Fixes for v4.9
This contains two one-line fixes for issues that were introduced in
 v4.9-rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJYPx27GRx0aGllcnJ5LnJlZGluZ0BnbWFpbC5jb20ACgkQ3SOs
 138+s6EcRg//bcdAO9u28pAe9FnL/ZRTtnDBW+Xwd6kFijaIDfu5ZRJTz7ZDtIlc
 JxMc6NOqvbxL3pJnxxzZSlLPwI7Su8gwFOaB+FrHhsNn5ocT5izzrBeHQTRgMFN+
 ziJBWMphwFU0y08ie0soiGK/B10X/HfzgzSKnafQc5i06GlpRKft32Mdqq/PgO5z
 qmKhscT3b6vvPKreJ5lNjlwP79N+J151bRrcQmV9Uh4oBW1OGVJoHFhaRDAHYmmF
 bnS9Y1iX4gwtjIUEDwzFEpf3/TJ0cjBe06mW9I9OtTe22Jzo2Rc2Hyi0+JMrKrUF
 sVmP3R07qVVgIe1QJcrvIxaGlDO9ck1HS1PycnYgAb6naonLLXNsrGFn9N7a+tQ1
 rEcwA8i2c7jKBNbZ/xL5qDE60WEZY9px8gQuu8Z3+T16qpd2BYBVwlJOxDYf6Dfe
 qKvOsNg1d5x7pye5GotGhbddCrSstahMfK+NlzMH463P/5/sdymHwRNnjAL9rM0X
 +6AHPC5aRJvgvbf4ku5hO0AI+cPU0y0T7S5dhoFKW4Mfl7F3N+bcmiE0xFiNnXrU
 e9/VsdPyhVDxx4Qfpd3vMudYApoE2dvae0Eyn37hbq0Y9F5ZVE1KAwhcfHU3I4BN
 jbQAozOFogHaJoIjRudqXo3TmZkKZ9pI8/CJ9AfDHNpJH4bdLv3sF+Y=
 =Yr7I
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm fixes from Thierry Reding:
 "This contains two one-line fixes for issues that were introduced in
  v4.9-rc1"

* tag 'pwm/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: Fix device reference leak
  pwm: meson: Add missing spin_lock_init()
2016-11-30 11:53:50 -08:00
Josef Bacik e95489010b bpf: add test for the verifier equal logic bug
This is a test to verify that

bpf: fix states equal logic for varlen access

actually fixed the problem.  The problem was if the register we added to our map
register was UNKNOWN in both the false and true branches and the only thing that
changed was the range then we'd incorrectly assume that the true branch was
valid, which it really wasnt.  This tests this case and properly fails without
my fix in place and passes with it in place.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:51:54 -05:00
Josef Bacik e2d2afe15e bpf: fix states equal logic for varlen access
If we have a branch that looks something like this

int foo = map->value;
if (condition) {
  foo += blah;
} else {
  foo = bar;
}
map->array[foo] = baz;

We will incorrectly assume that the !condition branch is equal to the condition
branch as the register for foo will be UNKNOWN_VALUE in both cases.  We need to
adjust this logic to only do this if we didn't do a varlen access after we
processed the !condition branch, otherwise we have different ranges and need to
check the other branch as well.

Fixes: 484611357c ("bpf: allow access into map value arrays")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:50:52 -05:00
Hongxu Jia 17a49cd549 netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
Since 09d9686047 ("netfilter: x_tables: do compat validation via
translate_table"), it used compatr structure to assign newinfo
structure.  In translate_compat_table of ip_tables.c and ip6_tables.c,
it used compatr->hook_entry to replace info->hook_entry and
compatr->underflow to replace info->underflow, but not do the same
replacement in arp_tables.c.

It caused invoking 32-bit "arptbale -P INPUT ACCEPT" failed in 64bit
kernel.
--------------------------------------
root@qemux86-64:~# arptables -P INPUT ACCEPT
root@qemux86-64:~# arptables -P INPUT ACCEPT
ERROR: Policy for `INPUT' offset 448 != underflow 0
arptables: Incompatible with this kernel
--------------------------------------

Fixes: 09d9686047 ("netfilter: x_tables: do compat validation via translate_table")
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-30 20:50:23 +01:00
David S. Miller ddf952a1e4 Merge branch 'cpsw-per-channel-shaping'
Ivan Khoronzhuk says:

====================
cpsw: add per channel shaper configuration

This series is intended to allow user to set rate for per channel
shapers at cpdma level. This patchset doesn't have impact on performance.
The rate can be set with:

echo 100 > /sys/class/net/ethX/queues/tx-0/tx_maxrate

Tested on am572xx
Based on net-next/master
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:37:15 -05:00
Ivan Khoronzhuk 8feb0a1965 net: ethernet: ti: cpsw: split tx budget according between channels
Split device budget between channels according to channel rate.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:37:14 -05:00
Ivan Khoronzhuk 342934a558 net: ethernet: ti: cpsw: optimize end of poll cycle
Check budget fullness only after it's updated and update
channel mask only once to keep budget balance between channels.
It's also needed for farther changes.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:37:14 -05:00
Ivan Khoronzhuk 83fcad0c98 net: ethernet: ti: cpsw: add .ndo to set per-queue rate
This patch allows to rate limit queues tx queues for cpsw interface.
The rate is set in absolute Mb/s units and cannot be more a speed
an interface is connected with.

The rate for a tx queue can be tested with:

ethtool -L eth0 rx 4 tx 4

echo 100 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
echo 200 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
echo 50 > /sys/class/net/eth0/queues/tx-2/tx_maxrate
echo 30 > /sys/class/net/eth0/queues/tx-3/tx_maxrate

tc qdisc add dev eth0 root handle 1: multiq

tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip\
dport 5001 0xffff action skbedit queue_mapping 0

tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip\
dport 5002 0xffff action skbedit queue_mapping 1

tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip\
dport 5003 0xffff action skbedit queue_mapping 2

tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip\
dport 5004 0xffff action skbedit queue_mapping 3

iperf -c 192.168.2.1 -b 110M -p 5001 -f m -t 60
iperf -c 192.168.2.1 -b 215M -p 5002 -f m -t 60
iperf -c 192.168.2.1 -b 55M -p 5003 -f m -t 60
iperf -c 192.168.2.1 -b 32M -p 5004 -f m -t 60

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:37:14 -05:00
Ivan Khoronzhuk 8f32b90981 net: ethernet: ti: davinci_cpdma: add set rate for a channel
The cpdma has 8 rate limited tx channels. This patch adds
ability for cpdma driver to use 8 tx h/w shapers. If at least one
channel is not rate limited then it must have higher number, this
is because the rate limited channels have to have higher priority
then not rate limited channels. The channel priority is set in low-hi
direction already, so that when a new channel is added with ethtool
and it doesn't have rate yet, it cannot affect on rate limited
channels. It can be useful for TSN streams and just in cases when
h/w rate limited channels are needed.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:37:13 -05:00
Ivan Khoronzhuk 0fc6432cc7 net: ethernet: ti: davinci_cpdma: add weight function for channels
The weight of a channel is needed to split descriptors between
channels. The weight can depend on maximum rate of channels, maximum
rate of an interface or other reasons. The channel weight is in
percentage and is independent for rx and tx channels.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:37:13 -05:00
David S. Miller 0fcba2894c wireless-drivers fixes for 4.9
mwifiex
 
 * properly terminate SSIDs so that uninitalised memory is not printed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJYPZM9AAoJEG4XJFUm622bnTwH/j7KbWbTLE2H9abYJxne3sWQ
 FOCrGNICgG9HYyLn33k7+dCBHGYa1f5qWO7dIeWhe6LEtXqrWBsxUFZbMrJVU7Te
 sDr3s2364iIPhdLtYl5mM7M75Y2h2pt1XhpErmldCpFpnYad5vZEbIR1n96F3cz6
 0ft6iUJpd3bf+KWxDUc707Vln42optvbcp7gjF+6mdShb0jlFkV9eOa85aJH6v38
 5kKPhLfiv1Qs1sZXPrWc2oQUIc0LDY19sXtw/5DTLe4+r6ybsKlF1o4+b2yOVeiu
 nrm1F/2D/829w3+4iYE63wACPGvyVaKYROtYgquyYkrI+6xyh1fmnout6SiwLe8=
 =oEuI
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2016-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.9

mwifiex

* properly terminate SSIDs so that uninitalised memory is not printed
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:33:44 -05:00
David S. Miller 348bfec21f Merge branch 'qed-XDP-support'
Yuval Mintz says:

====================
qed*: Add XDP support

This patch series is intended to add XDP to the qede driver, although
it contains quite a bit of cleanups, refactorings and infrastructure
changes as well.

The content of this series can be roughly divided into:

 - Datapath improvements - mostly focused on having the datapath utilize
parameters which can be more tightly contained in cachelines.
Patches #1, #2, #8, #9 belong to this group.

 - Refactoring - done mostly in favour of XDP. Patches #3, #4, #5, #9.

 - Infrastructure changes - done in favour of XDP. Paches #6 and #7 belong
to this category [#7 being by far the biggest patch in the series].

 - Actual XDP support - last two patches [#10, #11].
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:06 -05:00
Mintz, Yuval cb6aeb0792 qede: Add support for XDP_TX
Add support for forwarding via XDP. Once the eBPF is attached,
driver would allocate & configure a designated transmission queue
meant solely for forwarding packets. Said queue would share the
receive-queue's interrupt line, and would have it's own Tx statistics.

Infrastructure changes required for this [spread-out through the code]:
 - Determine the DMA direction of the receive buffers based on the presence
of the eBPF program.
 - Turn the sw Tx ring into a union, as regular/XDP queues have different
needs for releasing resources after completion [regular requires the SKB,
XDP requires the transmitted page].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:05 -05:00
Mintz, Yuval 496e051709 qede: Add basic XDP support
Add support for the ndo_xdp callback. This patch would support XDP_PASS,
XDP_DROP and XDP_ABORTED commands.

This also adds a per Rx queue statistic which counts number of packets
which didn't reach the stack [due to XDP].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:05 -05:00
Mintz, Yuval 9eb22357d5 qede: Better utilize the qede_[rt]x_queue
Improve the cacheline usage of both queues by reordering -
This reduces the cachelines required for egress datapath processing
from 3 to 2 and those required by ingress datapath processing by 2.

It also changes a couple of datapath related functions that currently
require either the fastpath or the qede_dev, changing them to be based
on the tx/rx queue instead.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:05 -05:00
Mintz, Yuval 8a47253065 qede: Don't check netdevice for rx-hash
Receive-hashing is a fixed feature, so there's no need to check
during the ingress datapath whether it's set or not.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:04 -05:00
Mintz, Yuval 3da7a37ae6 qed*: Handle-based L2-queues.
The driver needs to maintain several FW/HW-indices for each one of
its queues. Currently, that mapping is done by the QED where it uses
an rx/tx array of so-called hw-cids, populating them whenever a new
queue is opened and clearing them upon destruction of said queues.

This maintenance is far from ideal - there's no real reason why
QED needs to maintain such a data-structure. It becomes even worse
when considering the fact that the PF's queues and its child VFs' queues
are all mapped into the same data-structure.
As a by-product, the set of parameters an interface needs to supply for
queue APIs is non-trivial, and some of the variables in the API
structures have different meaning depending on their exact place
in the configuration flow.

This patch re-organizes the way L2 queues are configured and maintained.
In short:
  - Required parameters for queue init are now well-defined.
  - Qed would allocate a queue-cid based on parameters.
    Upon initialization success, it would return a handle to caller.
  - Queue-handle would be maintained by entity requesting queue-init,
    not necessarily qed.
  - All further queue-APIs [update, destroy] would use the opaque
    handle as reference for the queue instead of various indices.

The possible owners of such handles:
  - PF queues [qede] - complete handles based on provided configuration.
  - VF queues [qede] - fw-context-less handles, containing only relative
    information; Only the PF-side would need the absolute indices
    for configuration, so they're omitted here.
  - VF queues [qed, PF-side] - complete handles based on VF initialization.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:04 -05:00
Mintz, Yuval 567b3c127a qede: Revise state locking scheme
As qede utilizes an internal-reload sequence as result of various
configuration changes, the netif state wouldn't always accurately describe
the status of the configuration.
To compensate, we're storing an internal state of the device, which should
only be accessed under the qede_lock.

This patch fixes and improves several state/lock interactions:
  - The internal state should only be checked while locked.
  - While holding lock, it's preferable to check state rather than
    the netdevice's state.
  - The reload sequence is not 'atomic' - unload and subsequent load
    are not in the same critical section.

This also add the 'locked' variant for the reload, which would later be
used by XDP - useful in the case where the correct sequence is 'lock,
check state and re-configure if good', instead of allowing the reload
itself to make the decision regarding the configurability of the device.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:04 -05:00
Mintz, Yuval f4fad34c0e qede: Refactor data-path Rx flow
Driver's NAPI poll is using a long sequence for processing ingress
packets, and it's going to get even longer once we do XDP.
Break down the main loop into a series of sub-functions to allow
better readability of the function.

While we're at it, correct the accounting of the NAPI budget -
currently we're counting only packets passed to the stack against
the budget, even in case those are actually aggregations.
After refactoring every CQE processed would be counted against the budget.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:03 -05:00
Mintz, Yuval 4dbcd64002 qede: Refactor statistics gathering
Refactor logic for gathering statistics into a per-queue function.
This improves readability of the driver statistics' flows.

In addition, this would be required by the XDP forwarding queues
[as we'll need the Txq statistics gathering methods for those as well].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:03 -05:00
Mintz, Yuval 80439a1704 qede: Remove 'num_tc'.
Driver currently doesn't support multi-CoS, but it contains logic
where multiple transmission queues could be theoretically manipulated.
No point in maintaining the infrastructure at the moment.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:03 -05:00
Mintz, Yuval 6d937acfb3 qed: Optimize qed_chain datapath usage
The chain structure and functions are widely used by the qed* modules,
both for configuration and datapath.
E.g., qede's Tx has one such chain and its Rx has two.

Currently, the strucutre's fields which are required for datapath
related functions [produce/consume] are intertwined with fields which
are required only for configuration purposes [init/destroy/etc.].

This patch re-arranges the chain structure so that all the fields which
are required for datapath usage could reside in a single cacheline instead
of the two which are required today.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:02 -05:00
Mintz, Yuval 01e23015a9 qede: Optimize aggregation information size
Driver needs to maintain a structure per-each concurrent possible
open aggregation, but the structure storing that metadata is far from
being optimized - biggest waste in it is that there are 2 buffer metadata,
one for a replacement buffer when the aggregation begins and the other for
holding the first aggregation's buffer after it begins [as firmware might
still update it]. Those 2 can safely be united into a single metadata
structure.

struct qede_agg_info changes the following:

	/* size: 120, cachelines: 2, members: 9 */
	/* sum members: 114, holes: 1, sum holes: 4 */
	/* padding: 2 */
	/* paddings: 2, sum paddings: 8 */
	/* last cacheline: 56 bytes */
 -->
	/* size: 48, cachelines: 1, members: 9 */
	/* paddings: 1, sum paddings: 4 */
	/* last cacheline: 48 bytes */

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:02 -05:00
Tobias Klauser f54b8cd6ef ehea: Remove unnecessary memset of stats in netdev private data
The memory for netdev private data is allocated using kzalloc/vzalloc in
alloc_netdev_mqs, thus there is no need to zero the stats portion of it
again in the driver's probe function.

In any case, the size for the memset is wrong as the stats member is of
type rtnl_link_stats64, not net_device_stats.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:26:26 -05:00
David S. Miller 7752f72748 Merge branch 'l2tp-fixes'
Guillaume Nault says:

====================
l2tp: fixes for l2tp_ip and l2tp_ip6 socket handling

This series addresses problems found while working on commit 32c231164b
("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").

The first three patches fix races in socket's connect, recv and bind
operations. The last two ones fix scenarios where l2tp fails to
correctly lookup its userspace sockets.

Apart from the last patch, which is l2tp_ip6 specific, every patch
fixes the same problem in the L2TP IPv4 and IPv6 code.

All problems fixed by this series exist since the creation of the
l2tp_ip and l2tp_ip6 modules.

Changes since v1:
  * Patch #3: fix possible uninitialised use of 'ret' in l2tp_ip_bind().
====================

Acked-by: James Chapman <jchapman@katalix.com>
2016-11-30 14:14:09 -05:00
Guillaume Nault 31e2f21fb3 l2tp: fix address test in __l2tp_ip6_bind_lookup()
The '!(addr && ipv6_addr_equal(addr, laddr))' part of the conditional
matches if addr is NULL or if addr != laddr.
But the intend of __l2tp_ip6_bind_lookup() is to find a sockets with
the same address, so the ipv6_addr_equal() condition needs to be
inverted.

For better clarity and consistency with the rest of the expression, the
(!X || X == Y) notation is used instead of !(X && X != Y).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:08 -05:00
Guillaume Nault df90e68861 l2tp: fix lookup for sockets not bound to a device in l2tp_ip
When looking up an l2tp socket, we must consider a null netdevice id as
wild card. There are currently two problems caused by
__l2tp_ip_bind_lookup() not considering 'dif' as wild card when set to 0:

  * A socket bound to a device (i.e. with sk->sk_bound_dev_if != 0)
    never receives any packet. Since __l2tp_ip_bind_lookup() is called
    with dif == 0 in l2tp_ip_recv(), sk->sk_bound_dev_if is always
    different from 'dif' so the socket doesn't match.

  * Two sockets, one bound to a device but not the other, can be bound
    to the same address. If the first socket binding to the address is
    the one that is also bound to a device, the second socket can bind
    to the same address without __l2tp_ip_bind_lookup() noticing the
    overlap.

To fix this issue, we need to consider that any null device index, be
it 'sk->sk_bound_dev_if' or 'dif', matches with any other value.
We also need to pass the input device index to __l2tp_ip_bind_lookup()
on reception so that sockets bound to a device never receive packets
from other devices.

This patch fixes l2tp_ip6 in the same way.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:08 -05:00
Guillaume Nault d5e3a19093 l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
It's not enough to check for sockets bound to same address at the
beginning of l2tp_ip{,6}_bind(): even if no socket is found at that
time, a socket with the same address could be bound before we take
the l2tp lock again.

This patch moves the lookup right before inserting the new socket, so
that no change can ever happen to the list between address lookup and
socket insertion.

Care is taken to avoid side effects on the socket in case of failure.
That is, modifications of the socket are done after the lookup, when
binding is guaranteed to succeed, and before releasing the l2tp lock,
so that concurrent lookups will always see fully initialised sockets.

For l2tp_ip, 'ret' is set to -EINVAL before checking the SOCK_ZAPPED
bit. Error code was mistakenly set to -EADDRINUSE on error by commit
32c231164b ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").
Using -EINVAL restores original behaviour.

For l2tp_ip6, the lookup is now always done with the correct bound
device. Before this patch, when binding to a link-local address, the
lookup was done with the original sk->sk_bound_dev_if, which was later
overwritten with addr->l2tp_scope_id. Lookup is now performed with the
final sk->sk_bound_dev_if value.

Finally, the (addr_len >= sizeof(struct sockaddr_in6)) check has been
dropped: addr is a sockaddr_l2tpip6 not sockaddr_in6 and addr_len has
already been checked at this point (this part of the code seems to have
been copy-pasted from net/ipv6/raw.c).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:08 -05:00
Guillaume Nault a3c18422a4 l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()
Socket must be held while under the protection of the l2tp lock; there
is no guarantee that sk remains valid after the read_unlock_bh() call.

Same issue for l2tp_ip and l2tp_ip6.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:07 -05:00
Guillaume Nault 0382a25af3 l2tp: lock socket before checking flags in connect()
Socket flags aren't updated atomically, so the socket must be locked
while reading the SOCK_ZAPPED flag.

This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch
also brings error handling for __ip6_datagram_connect() failures.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:14:07 -05:00
Hariprasad Shenai bb83d62fa8 cxgb4: Add PCI device ID for new adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:11:47 -05:00
Alexei Starovoitov b634d30a79 cgroup, bpf: remove unnecessary #include
this #include is unnecessary and brings whole set of
other headers into cgroup-defs.h. Remove it.

Fixes: 3007098494 ("cgroup: add support for eBPF programs")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Rami Rosen <roszenrami@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 13:58:25 -05:00
Zhang Shengju 18502acd9a neigh: remove duplicate check for same neigh
Currently loop index 'idx' is used as the index in the neigh list of interest.
It's increased only when the neigh is dumped. It's not the absolute index in
the list. Because there is no info to record which neigh has already be scanned
by previous loop. This will cause the filtered out neighs to be scanned mulitple
times.

This patch make idx as the absolute index in the list, it will increase no matter
whether the neigh is filtered. This will prevent the above problem.

And this is in line with other dump functions.

v2:
 - take David Ahern's advice to do simple change

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 13:46:16 -05:00
Mike Rapoport a107bf8b39 isofs: add KERN_CONT to printing of ER records
The ER records are printed without explicit log level presuming line
continuation until "\n".  After the commit 4bcc595ccd (printk:
reinstate KERN_CONT for printing continuation lines), the ER records are
printed a character per line.

Adding KERN_CONT to appropriate printk statements restores the printout
behavior.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-30 10:41:26 -08:00
Nikita Yushchenko 80cca775cd net: fec: cache statistics while device is down
Execution 'ethtool -S' on fec device that is down causes OOPS on Vybrid
board:

Unhandled fault: external abort on non-linefetch (0x1008) at 0xe0898200
pgd = ddecc000
[e0898200] *pgd=9e406811, *pte=400d1653, *ppte=400d1453
Internal error: : 1008 [#1] SMP ARM
...

Reason of OOPS is that fec_enet_get_ethtool_stats() accesses fec
registers while IPG clock is stopped by PM.

Fix that by caching statistics in fec_enet_private. Cache is initialized
at device probe time, and updated at statistics request time if device
is up, and also just before turning device off on down path.

Additional locking is not needed, since cached statistics is accessed
either before device is registered, or under rtnl_lock().

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 12:44:40 -05:00
Alexei Starovoitov 9bee294f53 samples/bpf: fix include path
Fix the following build error:
HOSTCC  samples/bpf/test_lru_dist.o
../samples/bpf/test_lru_dist.c:25:22: fatal error: bpf_util.h: No such file or directory

This is due to objtree != srctree.
Use srctree, since that's where bpf_util.h is located.

Fixes: e00c7b216f ("bpf: fix multiple issues in selftest suite and samples")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 12:42:28 -05:00
Zhang Shengju 763dfa276c macvtap: replace printk with netdev_err
This patch replaces printk() with netdev_err() for macvtap device.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 12:40:48 -05:00
Haishuang Yan 17b463654f vxlan: fix a potential issue when create a new vxlan fdb entry.
vxlan_fdb_append may return error, so add the proper check,
otherwise it will cause memory leak.

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Changes in v2:
  - Unnecessary to initialize rc to zero.
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 12:02:49 -05:00
Ping Cheng 2425f18081 Input: change KEY_DATA from 0x275 to 0x277
0x275 is used by KEY_FASTREVERSE.

Fixes: 488326947c ("Input: add HDMI CEC specific keycodes")
Signed-off-by: Ping Cheng <ping.cheng@wacom.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-11-30 08:59:26 -08:00
David S. Miller 492e9d8c18 Merge branch 'liquidio-VF'
Raghu Vatsavayi says:

====================
liquidio VF operations

This patchseries adds support for VF device specific operations
like mailbox, queues and register access. This V3 patchset also
has changes based on comments form earlier versions:

1) Removed extra 'void *' casting.
2) Fixed all cross compilations issues reported on S390 and Powerpc
   architectures.

Please apply the patches in following order as these patches depend
on each other.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 11:03:10 -05:00