When parsing the FAC_NATIONAL_DIGIS facilities field, it's possible for
a remote host to provide more digipeaters than expected, resulting in
heap corruption. Check against ROSE_MAX_DIGIS to prevent overflows, and
abort facilities parsing on failure.
Additionally, when parsing the FAC_CCITT_DEST_NSAP and
FAC_CCITT_SRC_NSAP facilities fields, a remote host can provide a length
of less than 10, resulting in an underflow in a memcpy size, causing a
kernel panic due to massive heap corruption. A length of greater than
20 results in a stack overflow of the callsign array. Abort facilities
parsing on these invalid length values.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Length fields provided by a peer for names and attributes may be longer
than the destination array sizes. Validate lengths to prevent stack
buffer overflows.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Invalid nicknames containing only spaces will result in an underflow in
a memcpy size calculation, subsequently destroying the heap and
panicking.
v2 also catches the case where the provided nickname is longer than the
buffer size, which can result in controllable heap corruption.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
We clone the child entry in skb_dst_pop before we call
skb_dst_drop(). Otherwise we might kill the child right
before we return it to the caller.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Crypto requests might return asynchronous. In this case we leave
the rcu protected region, so force a refcount on the skb's
destination entry before we enter the xfrm type input/output
handlers.
This fixes a crash when a route is deleted whilst sending IPsec
data that is transformed by an asynchronous algorithm.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "ipv4: Inline fib_semantic_match into check_leaf"
change forgets to return the route errors. check_leaf should
return the same results as fib_table_lookup.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we set up the flow informations in ip_route_newports(), we take
the address informations from the the rt_key_src and rt_key_dst fields
of the rtable. They appear to be empty. So take the address
informations from rt_src and rt_dst instead. This issue was introduced
by commit 5e2b61f784 ("ipv4: Remove
flowi from struct rtable.")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the scope value out of the fib alias entries and into fib_info,
so that we always use the correct scope when recomputing the nexthop
cached source address.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Any operation that:
1) Brings up an interface
2) Adds an IP address to an interface
3) Deletes an IP address from an interface
can potentially invalidate the nh_saddr value, requiring
it to be recomputed.
Perform the recomputation lazily using a generation ID.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix section mismatch warning by renaming the pci_driver variable to a
recognized (whitelisted) name.
WARNING: drivers/net/pch_gbe/pch_gbe.o(.data+0x1f8): Section mismatch in reference from the variable pch_gbe_pcidev to the variable .devinit.rodata:pch_gbe_pcidev_id
The variable pch_gbe_pcidev references
the variable __devinitconst pch_gbe_pcidev_id
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alessandro Suardi reported that we could not change route metrics :
ip ro change default .... advmss 1400
This regression came with commit 9c150e82ac (Allocate fib metrics
dynamically). fib_metrics is no longer an array, but a pointer to an
array.
Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoiding abuse of ethtool_drvinfo.driver field.
HW specific info can be retrieved using lspci.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit fd245a4adb (net_sched: move TCQ_F_THROTTLED flag)
added a race.
qdisc_watchdog() is run from softirq, so special care should be taken or
we can lose one state transition (THROTTLED/RUNNING)
Prior to fd245a4adb, we were manipulating q->flags (qdisc->flags &=
~TCQ_F_THROTTLED;) and this manipulation could only race with
qdisc_warn_nonwc().
Since we want to avoid atomic ops in qdisc fast path - it was the
meaning of commit 3711210576 (QDISC_STATE_RUNNING dont need atomic
bit ops) - fix is to move THROTTLE bit into 'state' field, this one
being manipulated with SMP and IRQ safe operations.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Request_mem_region should be used with release_mem_region, not
release_resource.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,E;
@@
*x = request_mem_region(...)
... when != release_mem_region(x)
when != x = E
* release_resource(x);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Request_mem_region should be used with release_mem_region, not
release_resource.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,E;
@@
*x = request_mem_region(...)
... when != release_mem_region(x)
when != x = E
* release_resource(x);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This prevents possible race between bond_enslave and bond_handle_frame
as reported by Nicolas by moving rx_handler register/unregister.
slave->bond is added to hold pointer to master bonding sructure. That
way dev->master is no longer used in bond_handler_frame.
Also, this removes "BUG: scheduling while atomic" message
Reported-by: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Tested-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rmmod myri10ge crash at free_netdev() -> netif_napi_del(), because napi
structures are already deallocated. To fix call netif_napi_del() before
kfree() at myri10ge_free_slices().
Cc: stable@kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doorbell is used according to usage of BlueFlame.
For Blue Flame to work in Ethernet mode QP number should have 0
at bits 6,7.
Allocating range of QPs accordingly.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not allow a kernel consumer to allocate a UAR to serve for blue flame if the
number of available UARs gets below MLX4_NUM_RESERVED_UARS (currently 8). This
will allow userspace apps to open a device file and run things like
ibv_devinfo.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add mlx4_bitmap_avail() to give the number of available resources. We want to
use this as a hint to whether to allocate a resources or not. This patch is
introduced to be used with allocation blue flame registers.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using blue flame can improve latency by allowing the HW to more efficiently
access the WQE. This patch presents two functions that are used to allocate or
release HW resources for using blue flame; the caller need to supply a struct
mlx4_bf object when allocating resources. Consumers that make use of this API
should post doorbells to the UAR object pointed by the initialized struct
mlx4_bf;
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mlx4_en module now uses the new steering mechanism.
The RX packets are now steered through the MCG table instead
of Mac table for unicast, and default entry for multicast.
The feature is enabled through INIT_HCA
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
For Ethernet mode only,
When we want to register QP as promiscuous, it must be added to all the
existing steering entries and also to the default one.
The promiscuous QP might also be on of "real" QPs,
which means we need to monitor every entry to avoid duplicates and ensure
we close an entry when all it has is promiscuous QPs.
Same mechanism both for unicast and multicast.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same packet steering mechanism would be used both for IB and Ethernet,
Both multicasts and unicasts.
This commit prepares the general infrastructure for this.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW revision is derived from device ID and rev id.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver queries the FW for WOL support.
Ethtool get/set_wol is implemented accordingly.
Only magic packets are supported at the time.
Signed-off-by: Igor Yarovinsky <igory@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each RX ring will have its own interrupt vector, and TX rings will share one
(we mostly use polling for TX completions).
The vectors are assigned first time device is opened, and its name includes
the interface name and ring number.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core
customers (mlx4_ib, mlx4_en). The consumers will assign their own names to the
interrupt vectors. Those vectors are not opened at mlx4 device initialization,
opened by demand.
Changed the max number of possible EQs according to the new scheme, no longer relies on
on number of cores.
The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq().
Customers that do not use the new API will get completion vectors as before.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of reseting the module parameters each ifup or mtu change,
they are being set once at device initialization
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 86271e460a introduced a
regression that caused mac80211 queues in stopped state.
ath_drain_all_txq is called in driver flush which would reset
the stopped flag and the mac80211 queues were never started
after that. iperf traffic is completely stalled due to this issue.
Restart the mac80211 queues in driver flush only if the txqs were
drained.
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With the recent tx status optimization in mac80211, we bail out as
and and when invalid rate index is found. So the behavior of resetting
rate idx to -1 and count to 0 has changed for the rate indexes that
were not part of the driver's retry series.
This has resulted in ath9k using incorrect rate table index which
caused the system to panic. Ideally ath9k need to loop only for the
indexes that were part of the retry series and so simply use hw->max_rates
as the loop counter.
Pasted the stack trace of the panic issue for reference.
[ 754.093192] BUG: unable to handle kernel paging request at ffff88046a9025b0
[ 754.093256] IP: [<ffffffffa02eac49>] ath_tx_status+0x209/0x2f0 [ath9k]
[ 754.094888] Call Trace:
[ 754.094903] <IRQ>
[ 754.094928] [<ffffffffa051f883>] ieee80211_tx_status+0x203/0x9e0 [mac80211]
[ 754.094975] [<ffffffffa053e305>] ? __ieee80211_wake_queue+0x125/0x140 [mac80211]
[ 754.095017] [<ffffffffa02e66c9>] ath_tx_complete_buf+0x1b9/0x370 [ath9k]
[ 754.095054] [<ffffffffa02e6fcf>] ath_tx_complete_aggr+0x51f/0xb50 [ath9k]
[ 754.095098] [<ffffffffa05382a3>] ? ieee80211_prepare_and_rx_handle+0x173/0xab0 [mac80211]
[ 754.095148] [<ffffffff81350e62>] ? _raw_spin_unlock_irqrestore+0x32/0x40
[ 754.095186] [<ffffffffa02e9735>] ath_tx_tasklet+0x365/0x4b0 [ath9k]
[ 754.095224] [<ffffffff8107a2a2>] ? clockevents_program_event+0x62/0xa0
[ 754.095261] [<ffffffffa02e2628>] ath9k_tasklet+0x168/0x1c0 [ath9k]
[ 754.095298] [<ffffffff8105599b>] tasklet_action+0x6b/0xe0
[ 754.095331] [<ffffffff81056278>] __do_softirq+0x98/0x120
[ 754.095361] [<ffffffff8100cd5c>] call_softirq+0x1c/0x30
[ 754.095393] [<ffffffff8100efb5>] do_softirq+0x65/0xa0
[ 754.095423] [<ffffffff810563fd>] irq_exit+0x8d/0x90
[ 754.095453] [<ffffffff8100ebc1>] do_IRQ+0x61/0xe0
[ 754.095482] [<ffffffff81351413>] ret_from_intr+0x0/0x15
[ 754.095513] <EOI>
[ 754.095531] [<ffffffff81014375>] ? native_sched_clock+0x15/0x70
[ 754.096475] [<ffffffffa02bcfa6>] ? acpi_idle_enter_bm+0x24d/0x285 [processor]
[ 754.096475] [<ffffffffa02bcf9f>] ? acpi_idle_enter_bm+0x246/0x285 [processor]
[ 754.096475] [<ffffffff8127fab2>] cpuidle_idle_call+0x82/0x100
[ 754.096475] [<ffffffff8100a236>] cpu_idle+0xa6/0xf0
[ 754.096475] [<ffffffff81339bc1>] rest_init+0x91/0xa0
[ 754.096475] [<ffffffff814efccd>] start_kernel+0x3fd/0x408
[ 754.096475] [<ffffffff814ef347>] x86_64_start_reservations+0x132/0x136
[ 754.096475] [<ffffffff814ef451>] x86_64_start_kernel+0x106/0x115
[ 754.096475] RIP [<ffffffffa02eac49>] ath_tx_status+0x209/0x2f0 [ath9k]
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On hardware busy the scan request pointer should be cleared, as higher
levels will release. This avoids a crash when that pointer is
erroneously used later.
Signed-off-by: Joseph J. Gunn <armadefuego@yahoo.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Clearly a mistake, since pointers won't suddenly
change their value...
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit 2c8cec5c10 (Cache learned PMTU information in inetpeer) added
an extra inet_putpeer() call in ip_rt_update_pmtu().
This results in various problems, since we can free one inetpeer, while
it is still in use.
Ref: http://www.spinics.net/lists/netdev/msg159121.html
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 9435eb1cf0
("ipv4: Implement __ip_dev_find using new interface address hash.")
we reimplemented __ip_dev_find() so that it doesn't have to
do a full FIB table lookup.
Instead, it consults a hash table of addresses configured to
interfaces.
This works identically to the old code in all except one case,
and that is for loopback subnets.
The old code would match the loopback device for any IP address
that falls within a subnet configured to the loopback device.
Handle this corner case by doing the FIB lookup.
We could implement this via inet_addr_onlink() but:
1) Someone could configure many addresses to loopback and
inet_addr_onlink() is a simple list traversal.
2) We know the old code works.
Reported-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current undo logic, cwnd is moderated after it was restored
to the value prior entering fast-recovery. It was moderated first
in tcp_try_undo_recovery then again in tcp_complete_cwr.
Since the undo indicates recovery was false, these moderations
are not necessary. If the undo is triggered when most of the
outstanding data have been acknowledged, the (restored) cwnd is
falsely pulled down to a small value.
This patch removes these cwnd moderations if cwnd is undone
a) during fast-recovery
b) by receiving DSACKs past fast-recovery
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ipv6_dev_get_saddr() is currently called with an uninitialized
destination address. Although in tests it usually seemed to nevertheless
always fetch the right source address, there seems to be a possible race
condition.
Therefore this commit changes this, first setting the destination
address and only after that fetching the source address.
Reported-by: Jan Beulich <JBeulich@novell.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
With recent changes to the driver(switch to new cpdma layer),
the support for buffer descriptor address translation logic
is broken. This affects platforms where the physical address of
the descriptors as seen by the DMA engine is different from the
physical address.
Original Patch adding translation logic support:
Commit: ad021ae886
Signed-off-by: Sriramakrishnan A G <srk@ti.com>
Tested-By: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This avoids explicit cast to avoid 'discards qualifiers'
compiler warning in a netfilter patch that i've been working on.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
If SR-IOV is enabled by firmware, even if it is not enabled in the PCI
capability, TX pushes using write-combining may be corrupted.
We want to know whether it is enabled before mapping the NIC
registers, and even if PCI extended capabilities are not accessible.
Therefore, we look for the MSI capability, which is removed if SR-IOV
is enabled.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Optimize the calling of fib_add_ifaddr for all
secondary addresses after the promoted one to start from
their place, not from the new place of the promoted
secondary. It will save some CPU cycles because we
are sure the promoted secondary was first for the subnet
and all next secondaries do not change their place.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
The secondary address promotion relies on fib_sync_down_addr
to remove all routes created for the secondary addresses when
the old primary address is deleted. It does not happen for cases
when the primary address is also in another subnet. Fix that
by deleting local and broadcast routes for all secondaries while
they are on device list and by faking that all addresses from
this subnet are to be deleted. It relies on fib_del_ifaddr being
able to ignore the IPs from the concerned subnet while checking
for duplication.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>