Delay releasing the node lock when processing a neighbor discovery
message until after the optional discovery response message has been
sent. This helps ensure that any link protocol messages sent by a
link endpoint created as a result of a neighbor discovery request
are received after the discovery response is received, thereby
giving the receiving node a chance to create a peer link endpoint to
consume those link protocol messages, if one does not already exist.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reworks the appearance of the routine that processes incoming
LINK_CONFIG messages to keep the main logic flow at a consistent level
of indentation, and to add comments outlining the various phases involved
in processing each message. This rework is being done to allow upcoming
enhancements to this routine to be integrated more cleanly.
The diff isn't really readable, so know that it was a case of the
old code being like:
tipc_disc_recv_msg(..)
{
if (in_own_cluster(orig)) {
...
lines and lines of stuff
...
}
}
which is now replaced with the more sane:
tipc_disc_recv_msg(..)
{
if (!in_own_cluster(orig))
return;
...
lines and lines of stuff
...
}
Instances of spin locking within the reindented block were replaced with
the identical tipc_node_[un]lock() abstractions. Note that all these
changes are cosmetic in nature, and do not change the way LINK_CONFIG
messages are processed.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Ensures that the "redundant link exists" field of the LINK_PROTOCOL
messages sent by a link endpoint is set if and only if the sending
node has at least one other working link to the peer node. Previously,
the bit was set only if there were at least 2 working links to the peer
node, meaning the bit was incorrectly left unset in messages sent by a
non-working link endpoint when exactly one alternate working link was
available. The revised code now takes the state of the link sending
the message into account when deciding if an alternate link exists.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
All the other boolean like msg_set_X(m) operations don't
export both a msg_set_X(a) and a msg_clear_X(m), but instead
just have the single msg_set_X(m, val) variant.
Make the redundant_link one consistent by having the set take
a value, and delete the msg_clear_redundant_link() anomoly.
This is a cosmetic change and should not change behaviour.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Function names like "tipc_node_has_redundant_links" are unweildy
and result in long lines even for simple lines. The "has" doesn't
contribute any value add, so dropping that is a slight step in the
right direction. This is a cosmetic change, basic result of:
for i in `grep -l tipc_node_has_ *` ; do sed -i s/tipc_node_has_/tipc_node_/ $i ; done
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Removes support for the timestamp field of TIPC's link protocol messages.
This field was previously used to hold an OS-dependent timestamp value
that was used to assist in debugging early versions of TIPC. The field
has now been deemed unnecessary and has been removed from the latest TIPC
specification. This change has no impact on the operation of TIPC since
the field was set by TIPC, but never referenced.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Relocates network-related variables into the subsystem files where
they are now primarily used (following the recent rework of TIPC's
node table), and converts globals into locals where possible. Changes
the initialization of tipc_num_links from run-time to compile-time,
and eliminates the net_start routine that becomes empty as a result.
Also eliminates the corresponding net_stop routine by moving its
(trivial) content into the one location that called the routine.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Replaces the dynamically allocated array of pointers to the cluster's
node objects with a static hash table. Hash collisions are resolved
using chaining, with a typical hash chain having only a single node,
to avoid degrading performance during processing of incoming packets.
The conversion to a hash table reduces the memory requirements for
TIPC's node table to approximately the same size it had prior to
the previous commit.
In addition to the hash table itself, TIPC now also maintains a
linked list for the node objects, sorted by ascending network address.
This list allows TIPC to continue sending responses to user space
applications that request node and link information in sorted order.
The list also improves performance when name table update messages are
sent by making it easier to identify the nodes that must be notified.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of the need for users to specify the maximum number of
cluster nodes supported by TIPC. TIPC now automatically provides
support for all 4K nodes allowed by its addressing scheme.
Note: This change sets TIPC's memory usage to the amount used by
a maximum size node table with 4K entries. An upcoming patch that
converts the node table from a linear array to a hash table will
compact the node table to a more efficient design, but for clarity
it is nice to have all the Kconfig infrastruture go away separately.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Converts the fields of the global "tipc_net" structure into individual
variables. Since the struct was never referenced as a complete unit,
its existence was pointless. This will facilitate upcoming changes to
TIPC's node table and simpify upcoming relocation of the variables so
they are only visible to the files that actually use them.
This change is essentially cosmetic in nature, and doesn't affect the
operation of TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Removes a race condition that could cause TIPC's internal counter
of the number of links it has to neighboring nodes to have the
incorrect value if two independent threads of control simultaneously
create new link endpoints connecting to two different nodes using two
different bearers. Such under counting would result in TIPC failing to
list the final link(s) in its response to a configuration request to
list all of the node's links. The counter is now updated atomically
to ensure that simultaneous increments do not interfere with each
other.
Thanks go to Peter Butler <pbutler@pt.com> for his assistance in
diagnosing and fixing this problem.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Adds support for the SO_RCVTIMEO socket option to TIPC's socket
receive routines.
Thanks go out to Raj Hegde <rajenhegde@yahoo.ca> for his contribution
to the development and testing this enhancement.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Relocates the code that notifies users of node subscriptions so that
it is adjacent to the rest of the routines that implement TIPC's node
subscription capability. Renames the name table routine that is
invoked by a node subscription to better reflect its purpose and to
be consistent with other, similar name table routines.
These changes are cosmetic in nature, and do not alter the behavior
of TIPC.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Prevents a null pointer dereference from occurring if a node subscription
is triggered at the same time that the subscribing port or publication is
terminating the subscription. The problem arises if the triggering routine
asynchronously activates and deregisters the node subscription while
deregistration is already underway -- the deregistration routine may find
that the pointer it has just verified to be non-NULL is now NULL.
To avoid this race condition the triggering routine now simply marks the
node subscription as defunct (to prevent it from re-activating)
instead of deregistering it. The subscription is now both deregistered
and destroyed only when the subscribing port or publication code terminates
the node subscription.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Introduces a pair of helper routines that convert the network address
for a TIPC node into the network address for its cluster or zone.
This is a cosmetic change designed to avoid future errors caused by
the incorrect use of address bitmasks, and does not alter the existing
operation of TIPC.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Fixes a typo in the calculation of the network address of a node's own
cluster when generating a response to the configuration command that
lists all of the node's links. The correct mask value for a <Z.C.N>
network address uses 1's for the 8-bit zone and 12-bit cluster parts
and 0's for the 12-bit node part.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances TIPC's socket receive routines to support iovec structures
containing more than a single entry. This change leverages existing
sk_buff routines to do most of the work; the only significant change
to TIPC itself is that an sk_buff now records how much data has been
already consumed as an numeric offset, rather than as a pointer to
the first unread data byte.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
To start doing these conversions, we need to add some temporary
flow4_* macros which will eventually go away when all the protocol
code paths are changed to work on AF specific flowi objects.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is just a shorthand which will help in passing around AF
specific flow structures as generic ones.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now we have struct flowi4, flowi6, and flowidn for each address
family. And struct flowi is just a union of them all.
It might have been troublesome to convert flow_cache_uli_match() but
as it turns out this function is completely unused and therefore can
be simply removed.
Signed-off-by: David S. Miller <davem@davemloft.net>
Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*
This will let us to create AF optimal flow instances.
It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.
Signed-off-by: David S. Miller <davem@davemloft.net>
I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.
This is the first step to move in that direction.
Signed-off-by: David S. Miller <davem@davemloft.net>
The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.
Signed-off-by: David S. Miller <davem@davemloft.net>
The PFC configuration is not cleared until the device is reset. This
has not been a problem because setting DCB attributes forced a
hardware reset. Now that we no longer require this reset to occur
PFC remains configured even after being disabled until the
device is reset.
This removes a goto in the PFC hardware set routines for 82598 and
82599 devices that was short circuiting the clear.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Implemented ixgbe_ndo_set_vf_bw function which is being used by iproute2
tool. In addition, updated ixgbe_ndo_get_vf_config function to show the
actual rate limit to the user.
The rate limitation can be configured only when the link is up and the
link speed is 10Gb.
The rate limit value can be 0 or ranged between 11 and actual link
speed measured in Mbps. A value of '0' disables the rate limit for
this specific VF.
iproute2 usage will be 'ip link set ethX vf Y rate Z'.
After the command is made, the rate will be changed instantly.
To view the current rate limit, use 'ip link show ethX'.
The rates will be zeroed only upon driver reload or a link speed change.
This feature is being supported by 82599 and X540 devices.
Signed-off-by: Lior Levy <lior.levy@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
DCB provides a guaranteed bandwidth in the case with 0%
bandwidth then no bandwidth is guaranteed. However the
traffic class should still be able to transmit traffic.
For this to work the traffic class must be given the
minimum credits required to send a frame.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
VF Free Running Timer register name missing an F.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change updates the PHY setup code to support 100Mbps capable PHYs
as well as 10G and 1Gbps.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The VF mailbox polling for acks and messages would reset the timer to zero
on a timeout. Under heavy load a timeout may actually occur without being
the result of an error and when this occurs it is not practical to perform
a full VF driver reset on every message timeout. Instead, just return an
error (which is already done) and the VF driver will have an opportunity
to retry the operation.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>