Граф коммитов

28855 Коммитов

Автор SHA1 Сообщение Дата
Andy Adamson 35fa5f7b35 SUNRPC refactor rpcauth_checkverf error returns
Most of the time an error from the credops crvalidate function means the
server has sent us a garbage verifier. The gss_validate function is the
exception where there is an -EACCES case if the user GSS_context on the client
has expired.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-03 15:25:09 -04:00
Andy Adamson 4de6caa270 SUNRPC new rpc_credops to test credential expiry
This patch provides the RPC layer helper functions to allow NFS to manage
data in the face of expired credentials - such as avoiding buffered WRITEs
and COMMITs when the gss context will expire before the WRITEs are flushed
and COMMITs are sent.

These helper functions enable checking the expiration of an underlying
credential key for a generic rpc credential, e.g. the gss_cred gss context
gc_expiry which for Kerberos is set to the remaining TGT lifetime.

A new rpc_authops key_timeout is only defined for the generic auth.
A new rpc_credops crkey_to_expire is only defined for the generic cred.
A new rpc_credops crkey_timeout is only defined for the gss cred.

Set a credential key expiry watermark, RPC_KEY_EXPIRE_TIMEO set to 240 seconds
as a default and can be set via a module parameter as we need to ensure there
is time for any dirty data to be flushed.

If key_timeout is called on a credential with an underlying credential key that
will expire within watermark seconds, we set the RPC_CRED_KEY_EXPIRE_SOON
flag in the generic_cred acred so that the NFS layer can clean up prior to
key expiration.

Checking a generic credential's underlying credential involves a cred lookup.
To avoid this lookup in the normal case when the underlying credential has
a key that is valid (before the watermark), a notify flag is set in
the generic credential the first time the key_timeout is called. The
generic credential then stops checking the underlying credential key expiry, and
the underlying credential (gss_cred) match routine then checks the key
expiration upon each normal use and sets a flag in the associated generic
credential only when the key expiration is within the watermark.
This in turn signals the generic credential key_timeout to perform the extra
credential lookup thereafter.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-03 15:25:08 -04:00
Andy Adamson f1ff0c27fd SUNRPC: don't map EKEYEXPIRED to EACCES in call_refreshresult
The NFS layer needs to know when a key has expired.
This change also returns -EKEYEXPIRED to the application, and the informative
"Key has expired" error message is displayed. The user then knows that
credential renewal is required.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-03 15:25:08 -04:00
Trond Myklebust 280ebcf97c SUNRPC: rpcauth_create needs to know about rpc_clnt clone status
Ensure that we set rpc_clnt->cl_parent before calling rpc_client_register
so that rpcauth_create can find any existing RPCSEC_GSS caches for this
transport.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-02 13:32:48 -04:00
Trond Myklebust eb6dc19d8e RPCSEC_GSS: Share all credential caches on a per-transport basis
Ensure that all struct rpc_clnt for any given socket/rdma channel
share the same RPCSEC_GSS/krb5,krb5i,krb5p caches.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-02 12:48:40 -04:00
Trond Myklebust 414a629598 RPCSEC_GSS: Share rpc_pipes when an rpc_clnt owns multiple rpcsec auth caches
Ensure that if an rpc_clnt owns more than one RPCSEC_GSS-based authentication
mechanism, then those caches will share the same 'gssd' upcall pipe.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-01 11:12:43 -04:00
Trond Myklebust 298fc3558b SUNRPC: Add a helper to allow sharing of rpc_pipefs directory objects
Add support for looking up existing objects and creating new ones if there
is no match.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-01 11:12:43 -04:00
Trond Myklebust c36dcfe1f7 SUNRPC: Remove the rpc_client->cl_dentry
It is now redundant.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-01 11:12:42 -04:00
Trond Myklebust 5f42b016d7 SUNRPC: Remove the obsolete auth-only interface for pipefs dentry management
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-01 11:12:41 -04:00
Trond Myklebust 1917228435 RPCSEC_GSS: Switch auth_gss to use the new framework for pipefs dentries
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-01 11:12:41 -04:00
Trond Myklebust 6739ffb754 SUNRPC: Add a framework to clean up management of rpc_pipefs directories
The current system requires everyone to set up notifiers, manage directory
locking, etc.
What we really want to do is have the rpc_client create its directory,
and then create all the entries.

This patch will allow the RPCSEC_GSS and NFS code to register all the
objects that they want to have appear in the directory, and then have
the sunrpc code call them back to actually create/destroy their pipefs
dentries when the rpc_client creates/destroys the parent.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:38 -04:00
Trond Myklebust 6b2fddd3e7 RPCSEC_GSS: Fix an Oopsable condition when creating/destroying pipefs objects
If an error condition occurs on rpc_pipefs creation, or the user mounts
rpc_pipefs and then unmounts it, then the dentries in struct gss_auth
need to be reset to NULL so that a second call to gss_pipes_dentries_destroy
doesn't try to free them again.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:37 -04:00
Trond Myklebust e726340ac9 RPCSEC_GSS: Further cleanups
Don't pass the rpc_client as a parameter, when what we really want is
the net namespace.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:37 -04:00
Trond Myklebust c219066103 SUNRPC: Replace clnt->cl_principal
The clnt->cl_principal is being used exclusively to store the service
target name for RPCSEC_GSS/krb5 callbacks. Replace it with something that
is stored only in the RPCSEC_GSS-specific code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:36 -04:00
Trond Myklebust bd4a3eb15b RPCSEC_GSS: Clean up upcall message allocation
Optimise away gss_encode_msg: we don't need to look up the pipe
version a second time.

Save the gss target name in struct gss_auth. It is a property of the
auth cache itself, and doesn't really belong in the rpc_client.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:36 -04:00
Trond Myklebust 41b6b4d0b8 SUNRPC: Cleanup rpc_setup_pipedir
The directory name is _always_ clnt->cl_program->pipe_dir_name.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:35 -04:00
Trond Myklebust 1dada8e1f9 SUNRPC: Remove unused struct rpc_clnt field cl_protname
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:35 -04:00
Trond Myklebust 55909f21a1 SUNRPC: Deprecate rpc_client->cl_protname
It just duplicates the cl_program->name, and is not used in any fast
paths where the extra dereference will cause a hit.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-30 09:19:34 -04:00
Jeff Layton 275448eb10 rpc_pipe: convert back to simple_dir_inode_operations
Now that Al has fixed simple_lookup to account for the case where
sb->s_d_op is set, there's no need to keep our own special lookup op.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-23 18:18:53 -04:00
Linus Torvalds 3be542d464 NFS client bugfixes for 3.11
- Fix a regression against NFSv4 FreeBSD servers when creating a new file
 - Fix another regression in rpc_client_register()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6ZKMAAoJEGcL54qWCgDyQX8P/19LKLNKcL+y2zVGjLbXMTq0
 TpyWdBO0ux7QcqnPEDg+Jpvu62IowYiKTtaSOXtHb5BNjQMBo2RKw3B0eMBoCp/z
 6gHmQRD2hMgqwBxBwHceV+dNwueCUiZW7GqaaNh6/3bpGQefegdONnLEifuPogEu
 oZmEuiVrGDfITEF7D4k5+shXCQN4eNH0LFuIQo4XXdCqmK6PwvOsidZ7YwHVC3Mg
 /Jzda2YsCxHj8kPi1xb9skPPAn6g4kdfYfyr/xSY7IviPixrkg/nEEK1b8xHU81e
 a0dd0Yx5kq6fR8LsBvQCHdj2m7doHM15jf5Np5G7VnnaWEjB2y+QftkxWc9lCNU3
 t2fr9YVD7ZG/GGNSFePHAHmBY0OqDB1Htp4vcwEQfzX6CAR3Hel82WVvut62Z6m4
 G5qHjwdqUFhmRN//SWlDpEqSn+pbeCvPhQS60ayN0TLivRsscm/I4yA75odAnn9b
 4su1IcUpqeJGeV6yDyMUqbx4kYZFyCZg/DNkThXiTKOs47A7ogSS9ev2fTB/V+jd
 rroNHNd/U508ze9D6D4ai9vR78uUp4wKNSSBZMCkBtNh0uSApOTgyGVhertB1EKS
 vgAr4T1tc+9t+0qg1Sb+hbKyBM/KaS5zUrPn+APHPoBXPh5PSVBzeNJkpxHRw/V0
 ZxkEgSQKLZSXYb5ab770
 =XE+7
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - Fix a regression against NFSv4 FreeBSD servers when creating a new
   file
 - Fix another regression in rpc_client_register()

* tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Fix a regression against the FreeBSD server
  SUNRPC: Fix another issue with rpc_client_register()
2013-07-20 10:48:24 -07:00
Linus Torvalds ecb2cf1a6b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A couple interesting SKB fragment handling fixes, plus the usual small
  bits here and there:

   1) Fix 64-bit divide build failure on 32-bit platforms in mlx5, from
      Tim Gardner.

   2) Get rid of a stupid reimplementation on "%*phC" in our sysfs MAC
      address printing helper.

   3) Fix NETIF_F_SG capability advertisement in hyperv driver, if the
      device can't do checksumming offloads then it shouldn't say it can
      do SG either.  From Haiyang Zhang.

   4) bgmac needs to depend on PHYLIB, from Hauke Mehrtens.

   5) Don't leak DMA mappings on mapping failures, from Neil Horman.

   6) We need to reset the transport header of SKBs in ipv4 before we
      attempt to perform early socket demux, just like ipv6 does.  From
      Eric Dumazet.

   7) Add missing locking on vxlan device removal, from Stephen
      Hemminger.

   8) xen-netfront has to make two passes over an SKB to prepare it for
      transfer.  One pass calculates the number of slots needed, the
      second massages the SKB and fills the slots.  Unfortunately, the
      first pass doesn't calculate the number of slots properly so we
      can end up trying to build a MAX_SKB_FRAGS + 1 SKB which doesn't
      work out so well.  Fix from Jan Beulich with help and discussion
      with several others.

   9) Fix a similar problem in tun and macvtap, which have to split up
      scatter-gather elements at PAGE_SIZE boundaries.  Don't do
      zerocopy if it would result in a > MAX_SKB_FRAGS skb.  Fixes from
      Jason Wang.

  10) On receive, once we've decoded the VLAN state completely, clear
      skb->vlan_tci.  Otherwise demuxed tunnels underneath can trigger
      the VLAN code again, corrupting the packet.  Fix from Eric
      Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  vlan: fix a race in egress prio management
  vlan: mask vlan prio bits
  macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
  tuntap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
  pkt_sched: sch_qfq: remove a source of high packet delay/jitter
  xen-netfront: pull on receive skb may need to happen earlier
  vxlan: add necessary locking on device removal
  hyperv: Fix the NETIF_F_SG flag setting in netvsc
  net: Fix sysfs_format_mac() code duplication.
  be2net: Fix to avoid hardware workaround when not needed
  macvtap: do not assume 802.1Q when send vlan packets
  macvtap: fix the missing ret value of TUNSETQUEUE
  ipv4: set transport header earlier
  mlx5 core: Fix __udivdi3 when compiling for 32 bit arches
  bgmac: add dependency to phylib
  net/irda: fixed style issues in irlan_eth
  ethtool: fixed trailing statements in ethtool
  ndisc: bool initializations should use true and false
  atl1e: unmap partially mapped skb on dma error and free skb
2013-07-18 20:08:47 -07:00
Eric Dumazet 3e3aac4975 vlan: fix a race in egress prio management
egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.

We have to make sure that before inserting an new element in hash table,
all its fields are committed to memory or else another cpu could
find corrupt values and crash.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-18 13:07:13 -07:00
Eric Dumazet d4b812dea4 vlan: mask vlan prio bits
In commit 48cc32d38a
("vlan: don't deliver frames for unknown vlans to protocols")
Florian made sure we set pkt_type to PACKET_OTHERHOST
if the vlan id is set and we could find a vlan device for this
particular id.

But we also have a problem if prio bits are set.

Steinar reported an issue on a router receiving IPv6 frames with a
vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device,
because skb->vlan_tci is set.

Forwarded frame is completely corrupted : We can see (8100:4000)
being inserted in the middle of IPv6 source address :

16:48:00.780413 IP6 2001:16d8:8100:4000:ee1c:0:9d9:bc87 >
9f94:4d95:2001:67c:29f4::: ICMP6, unknown icmp6 type (0), length 64
       0x0000:  0000 0029 8000 c7c3 7103 0001 a0ae e651
       0x0010:  0000 0000 ccce 0b00 0000 0000 1011 1213
       0x0020:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223
       0x0030:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233

It seems we are not really ready to properly cope with this right now.

We can probably do better in future kernels :
vlan_get_ingress_priority() should be a netdev property instead of
a per vlan_dev one.

For stable kernels, lets clear vlan_tci to fix the bugs.

Reported-by: Steinar H. Gunderson <sesse@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-18 13:05:23 -07:00
Paolo Valente 87f40dd6ce pkt_sched: sch_qfq: remove a source of high packet delay/jitter
QFQ+ inherits from QFQ a design choice that may cause a high packet
delay/jitter and a severe short-term unfairness. As QFQ, QFQ+ uses a
special quantity, the system virtual time, to track the service
provided by the ideal system it approximates. When a packet is
dequeued, this quantity must be incremented by the size of the packet,
divided by the sum of the weights of the aggregates waiting to be
served. Tracking this sum correctly is a non-trivial task, because, to
preserve tight service guarantees, the decrement of this sum must be
delayed in a special way [1]: this sum can be decremented only after
that its value would decrease also in the ideal system approximated by
QFQ+. For efficiency, QFQ+ keeps track only of the 'instantaneous'
weight sum, increased and decreased immediately as the weight of an
aggregate changes, and as an aggregate is created or destroyed (which,
in its turn, happens as a consequence of some class being
created/destroyed/changed). However, to avoid the problems caused to
service guarantees by these immediate decreases, QFQ+ increments the
system virtual time using the maximum value allowed for the weight
sum, 2^10, in place of the dynamic, instantaneous value. The
instantaneous value of the weight sum is used only to check whether a
request of weight increase or a class creation can be satisfied.

Unfortunately, the problems caused by this choice are worse than the
temporary degradation of the service guarantees that may occur, when a
class is changed or destroyed, if the instantaneous value of the
weight sum was used to update the system virtual time. In fact, the
fraction of the link bandwidth guaranteed by QFQ+ to each aggregate is
equal to the ratio between the weight of the aggregate and the sum of
the weights of the competing aggregates. The packet delay guaranteed
to the aggregate is instead inversely proportional to the guaranteed
bandwidth. By using the maximum possible value, and not the actual
value of the weight sum, QFQ+ provides each aggregate with the worst
possible service guarantees, and not with service guarantees related
to the actual set of competing aggregates. To see the consequences of
this fact, consider the following simple example.

Suppose that only the following aggregates are backlogged, i.e., that
only the classes in the following aggregates have packets to transmit:
one aggregate with weight 10, say A, and ten aggregates with weight 1,
say B1, B2, ..., B10. In particular, suppose that these aggregates are
always backlogged. Given the weight distribution, the smoothest and
fairest service order would be:
A B1 A B2 A B3 A B4 A B5 A B6 A B7 A B8 A B9 A B10 A B1 A B2 ...

QFQ+ would provide exactly this optimal service if it used the actual
value for the weight sum instead of the maximum possible value, i.e.,
11 instead of 2^10. In contrast, since QFQ+ uses the latter value, it
serves aggregates as follows (easy to prove and to reproduce
experimentally):
A B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 A A A A A A A A A A B1 B2 ... B10 A A ...

By replacing 10 with N in the above example, and by increasing N, one
can increase at will the maximum packet delay and the jitter
experienced by the classes in aggregate A.

This patch addresses this issue by just using the above
'instantaneous' value of the weight sum, instead of the maximum
possible value, when updating the system virtual time.  After the
instantaneous weight sum is decreased, QFQ+ may deviate from the ideal
service for a time interval in the order of the time to serve one
maximum-size packet for each backlogged class. The worst-case extent
of the deviation exhibited by QFQ+ during this time interval [1] is
basically the same as of the deviation described above (but, without
this patch, QFQ+ suffers from such a deviation all the time). Finally,
this patch modifies the comment to the function qfq_slot_insert, to
make it coherent with the fact that the weight sum used by QFQ+ can
now be lower than the maximum possible value.

[1] P. Valente, "Extending WF2Q+ to support a dynamic traffic mix",
Proceedings of AAA-IDEA'05, June 2005.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-18 13:02:00 -07:00
Linus Torvalds 3f334c2081 Merge branch 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull phase two of __cpuinit removal from Paul Gortmaker:
 "With the __cpuinit infrastructure removed earlier, this group of
  commits only removes the function/data tagging that was done with the
  various (now no-op) __cpuinit related prefixes.

  Now that the dust has settled with yesterday's v3.11-rc1, there
  hopefully shouldn't be any new users leaking back in tree, but I think
  we can leave the harmless no-op stubs there for a release as a
  courtesy to those who still have out of tree stuff and weren't paying
  attention.

  Although the commits are against the recent tag to allow for minor
  context refreshes for things like yesterday's v3.11-rc1~ slab content,
  the patches have been largely unchanged for weeks, aside from such
  trivial updates.

  For detail junkies, the largely boring and mostly irrelevant history
  of the patches can be viewed at:

    http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git

  If nothing else, I guess it does at least demonstrate the level of
  involvement required to shepherd such a treewide change to completion.

  This is the same repository of patches that has been applied to the
  end of the daily linux-next branches for the past several weeks"

* 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (28 commits)
  block: delete __cpuinit usage from all block files
  drivers: delete __cpuinit usage from all remaining drivers files
  kernel: delete __cpuinit usage from all core kernel files
  rcu: delete __cpuinit usage from all rcu files
  net: delete __cpuinit usage from all net files
  acpi: delete __cpuinit usage from all acpi files
  hwmon: delete __cpuinit usage from all hwmon files
  cpufreq: delete __cpuinit usage from all cpufreq files
  clocksource+irqchip: delete __cpuinit usage from all related files
  x86: delete __cpuinit usage from all x86 files
  score: delete __cpuinit usage from all score files
  xtensa: delete __cpuinit usage from all xtensa files
  openrisc: delete __cpuinit usage from all openrisc files
  m32r: delete __cpuinit usage from all m32r files
  hexagon: delete __cpuinit usage from all hexagon files
  frv: delete __cpuinit usage from all frv files
  cris: delete __cpuinit usage from all cris files
  metag: delete __cpuinit usage from all metag files
  tile: delete __cpuinit usage from all tile files
  sh: delete __cpuinit usage from all sh files
  ...
2013-07-18 10:50:26 -07:00
Linus Torvalds 61f98b0fca Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
 "Just three minor bugfixes"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
  svcrdma: underflow issue in decode_write_list()
  nfsd4: fix minorversion support interface
  lockd: protect nlm_blocked access in nlmsvc_retry_blocked
2013-07-17 13:43:55 -07:00
David S. Miller ae8e9c5a1a net: Fix sysfs_format_mac() code duplication.
It's just a duplicate implementation of "%*phC".  Thanks to Joe
Perches for showing that we had exactly this support in the
lib/vsprintf.c code already.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 17:09:22 -07:00
Eric Dumazet 21d1196a35 ipv4: set transport header earlier
commit 45f00f99d6 ("ipv4: tcp: clean up tcp_v4_early_demux()") added a
performance regression for non GRO traffic, basically disabling
IP early demux.

IPv6 stack resets transport header in ip6_rcv() before calling
IP early demux in ip6_rcv_finish(), while IPv4 does this only in
ip_local_deliver_finish(), _after_ IP early demux.

GRO traffic happened to enable IP early demux because transport header
is also set in inet_gro_receive()

Instead of reverting the faulty commit, we can make IPv4/IPv6 behave the
same : transport_header should be set in ip_rcv() instead of
ip_local_deliver_finish()

ip_local_deliver_finish() can also use skb_network_header_len() which is
faster than ip_hdrlen()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:59:28 -07:00
Dragos Foianu b6a82dd233 net/irda: fixed style issues in irlan_eth
Applied fixes suggested by checkpatch.pl

Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:16:03 -07:00
Dragos Foianu 8a73125c36 ethtool: fixed trailing statements in ethtool
Applied fixes suggested by checkpatch.pl.

Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:14:51 -07:00
Daniel Baluta f2f79cca13 ndisc: bool initializations should use true and false
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-16 12:13:31 -07:00
Dan Carpenter b2781e1021 svcrdma: underflow issue in decode_write_list()
My static checker marks everything from ntohl() as untrusted and it
complains we could have an underflow problem doing:

	return (u32 *)&ary->wc_array[nchunks];

Also on 32 bit systems the upper bound check could overflow.

Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-15 11:46:23 -04:00
Trond Myklebust 1540c5d3cb SUNRPC: Fix another issue with rpc_client_register()
Fix the error pathway if rpcauth_create() fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-15 10:15:54 -04:00
Paul Gortmaker 013dbb325b net: delete __cpuinit usage from all net files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the net/* uses of the __cpuinit macros
from all C files.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-14 19:36:58 -04:00
Linus Torvalds 41d9884c44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs stuff from Al Viro:
 "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including
  making simple_lookup() usable for filesystems with non-NULL s_d_op,
  which allows us to get rid of quite a bit of ugliness"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  sunrpc: now we can just set ->s_d_op
  cgroup: we can use simple_lookup() now
  efivarfs: we can use simple_lookup() now
  make simple_lookup() usable for filesystems that set ->s_d_op
  configfs: don't open-code d_alloc_name()
  __rpc_lookup_create_exclusive: pass string instead of qstr
  rpc_create_*_dir: don't bother with qstr
  llist: llist_add() can use llist_add_batch()
  llist: fix/simplify llist_add() and llist_add_batch()
  fput: turn "list_head delayed_fput_list" into llist_head
  fs/file_table.c:fput(): add comment
  Safer ABI for O_TMPFILE
2013-07-14 11:42:26 -07:00
Al Viro dae3794fd6 sunrpc: now we can just set ->s_d_op
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14 17:55:39 +04:00
Al Viro d3db90b0a4 __rpc_lookup_create_exclusive: pass string instead of qstr
... and use d_hash_and_lookup() instead of open-coding it, for fsck sake...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14 17:09:57 +04:00
Al Viro a95e691f9c rpc_create_*_dir: don't bother with qstr
just pass the name

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14 17:02:28 +04:00
Linus Torvalds be9c6d9169 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Just a bunch of small fixes and tidy ups:

   1) Finish the "busy_poll" renames, from Eliezer Tamir.

   2) Fix RCU stalls in IFB driver, from Ding Tianhong.

   3) Linearize buffers properly in tun/macvtap zerocopy code.

   4) Don't crash on rmmod in vxlan, from Pravin B Shelar.

   5) Spinlock used before init in alx driver, from Maarten Lankhorst.

   6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry
      Kravkov.

   7) Dummy and ifb driver load failure paths can oops, fixes from Tan
      Xiaojun and Ding Tianhong.

   8) Correct MTU calculations in IP tunnels, from Alexander Duyck.

   9) Account all TCP retransmits in SNMP stats properly, from Yuchung
      Cheng.

  10) atl1e and via-rhine do not handle DMA mapping failures properly,
      from Neil Horman.

  11) Various equal-cost multipath route fixes in ipv6 from Hannes
      Frederic Sowa"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
  ipv6: only static routes qualify for equal cost multipathing
  via-rhine: fix dma mapping errors
  atl1e: fix dma mapping warnings
  tcp: account all retransmit failures
  usb/net/r815x: fix cast to restricted __le32
  usb/net/r8152: fix integer overflow in expression
  net: access page->private by using page_private
  net: strict_strtoul is obsolete, use kstrtoul instead
  drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe
  drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe
  drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe
  net/usb: add relative mii functions for r815x
  net/tipc: use %*phC to dump small buffers in hex form
  qlcnic: Adding Maintainers.
  gre: Fix MTU sizing check for gretap tunnels
  pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts
  pkt_sched: sch_qfq: improve efficiency of make_eligible
  gso: Update tunnel segmentation to support Tx checksum offload
  inet: fix spacing in assignment
  ifb: fix oops when loading the ifb failed
  ...
2013-07-13 17:42:22 -07:00
Hannes Frederic Sowa 307f2fb95e ipv6: only static routes qualify for equal cost multipathing
Static routes in this case are non-expiring routes which did not get
configured by autoconf or by icmpv6 redirects.

To make sure we actually get an ecmp route while searching for the first
one in this fib6_node's leafs, also make sure it matches the ecmp route
assumptions.

v2:
a) Removed RTF_EXPIRE check in dst.from chain. The check of RTF_ADDRCONF
   already ensures that this route, even if added again without
   RTF_EXPIRES (in case of a RA announcement with infinite timeout),
   does not cause the rt6i_nsiblings logic to go wrong if a later RA
   updates the expiration time later.

v3:
a) Allow RTF_EXPIRES routes to enter the ecmp route set. We have to do so,
   because an pmtu event could update the RTF_EXPIRES flag and we would
   not count this route, if another route joins this set. We now filter
   only for RTF_GATEWAY|RTF_ADDRCONF|RTF_DYNAMIC, which are flags that
   don't get changed after rt6_info construction.

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-12 16:29:54 -07:00
Yuchung Cheng 24ab6bec80 tcp: account all retransmit failures
Change snmp RETRANSFAILS stat to include timeout retransmit failures
in addition to other loss recoveries.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-12 16:15:56 -07:00
Sunghan Suh 40dadff265 net: access page->private by using page_private
Signed-off-by: Sunghan Suh <sunghan.suh@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-12 16:10:34 -07:00
“Cosmin 92338dc2fb net: strict_strtoul is obsolete, use kstrtoul instead
patch found using checkpatch.pl

Signed-off-by: Cosmin Stanescu <cosmin90stanescu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-12 16:09:14 -07:00
Andy Shevchenko d77e41e127 net/tipc: use %*phC to dump small buffers in hex form
Instead of passing each byte by stack let's use nice specifier for that.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 17:03:36 -07:00
Alexander Duyck 8c91e162e0 gre: Fix MTU sizing check for gretap tunnels
This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
packets are sent from the interface.

In my case I was able to reproduce the issue by simply sending a ping of
1421 bytes with the gretap interface created on a device with a standard
1500 mtu.

This fix is based on the fact that the tunnel mtu is already adjusted by
dev->hard_header_len so it would make sense that any packets being compared
against that mtu should also be adjusted by hard_header_len and the tunnel
header instead of just the tunnel header.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reported-by: Cong Wang <amwang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 16:12:03 -07:00
Paolo Valente 88d4f419a4 pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts
This patch removes the forward declaration of qfq_update_agg_ts, by moving
the definition of the function above its first call. This patch also
removes a useless forward declaration of qfq_schedule_agg.

Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 13:01:07 -07:00
Paolo Valente 87f1369d6e pkt_sched: sch_qfq: improve efficiency of make_eligible
In make_eligible, a mask is used to decide which groups must become eligible:
the i-th group becomes eligible only if the i-th bit of the mask (from the
right) is set. The mask is computed by left-shifting a 1 by a given number of
places, and decrementing the result.  The shift is performed on a ULL to avoid
problems in case the number of places to shift is higher than 31.  On a 32-bit
machine, this is more costly than working on an UL. This patch replaces such a
costly operation with two cheaper branches.

The trick is based on the following fact: in case of a shift of at least 32
places, the resulting mask has at least the 32 less significant bits set,
whereas the total number of groups is lower than 32.  As a consequence, in this
case it is enough to just set the 32 less significant bits of the mask with a
cheaper ~0UL. In the other case, the shift can be safely performed on a UL.

Reported-by: David S. Miller <davem@davemloft.net>
Reported-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 13:01:07 -07:00
Alexander Duyck cdbaa0bb26 gso: Update tunnel segmentation to support Tx checksum offload
This change makes it so that the GRE and VXLAN tunnels can make use of Tx
checksum offload support provided by some drivers via the hw_enc_features.
Without this fix enabling GSO means sacrificing Tx checksum offload and
this actually leads to a performance regression as shown below:

            Utilization
            Send
Throughput  local         GSO
10^6bits/s  % S           state
  6276.51   8.39          enabled
  7123.52   8.42          disabled

To resolve this it was necessary to address two items.  First
netif_skb_features needed to be updated so that it would correctly handle
the Trans Ether Bridging protocol without impacting the need to check for
Q-in-Q tagging.  To do this it was necessary to update harmonize_features
so that it used skb_network_protocol instead of just using the outer
protocol.

Second it was necessary to update the GRE and UDP tunnel segmentation
offloads so that they would reset the encapsulation bit and inner header
offsets after the offload was complete.

As a result of this change I have seen the following results on a interface
with Tx checksum enabled for encapsulated frames:

            Utilization
            Send
Throughput  local         GSO
10^6bits/s  % S           state
  7123.52   8.42          disabled
  8321.75   5.43          enabled

v2: Instead of replacing refrence to skb->protocol with
    skb_network_protocol just replace the protocol reference in
    harmonize_features to allow for double VLAN tag checks.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 12:18:49 -07:00
Linus Torvalds 1466b77a7b NFS client updates for Linux 3.11 (part 2)
Highlights include:
 - Fix an_rpc pipefs regression that causes a deadlock on mount
 - Readdir optimisations by Scott Mayhew and Jeff Layton
 - clean up the rpc_pipefs dentry operation setup
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR3vVIAAoJEGcL54qWCgDyBWEP/0blqSlJId4zZj4xDviRFqJ4
 93C7b/Vn7LrAcNCgDQsPkkzTwAX5yTB1H5eNtMuyggAdGj89d4n0jXgBniIMHmqI
 Pjrr/XMQ65NddehrO491N01iJSfP9wE3CizJodnAv4VxMRO3xqiJG85lcnoLOFea
 V1FnEFUu9oi8e93cQt2fe6KdmTu/SuRqlqR7WPGyTFgS26x1l8nkp2OQgulit5Up
 lWuaxg4xbKOdj1jfUDXZhWUnDtkFjxyGxnKR63aA2X1DEGCUTJ6gB3tAl9pvnUb2
 RTQF3GVj+Bm/E3gE6ULJvqOjhsgWYjLAZn6hDA3yNAIiFyV7aA6gwK4oKy/B47a6
 tFEN2O1EupWzCqGyHhTArk+oEBLfUv/EgFyo7+Y0YIFV4sQTu5RbaZ0nQ2geY6LA
 50q2GH57tkXTs859gtBPQgKzgRF1ulkF1FDY9EYQHyGiUbNxBfx+6/2OI04ubQt3
 1gKUmm9w1WVzYGmHcHbxsXPT53NtAnHXW4ExcMgpaZ1YOPuIILm78ZuAw78XB/dd
 mvXRtbhVt/gs7qZAQQPp1iHIv+vnJ0KgjO62gbuTIRftw5jwWrpWcfYMUUZrMnyM
 kn326z3f4gn/vSDZI7J4tOfG1Uc7eNy+cJxStjtiNWTs3UzuWJKzJH0rZnoNZdei
 xAkLhjIUEybAqIpXJuGH
 =NqQf
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull second set of NFS client updates from Trond Myklebust:
 "This mainly contains some small readdir optimisations that had
  dependencies on Al Viro's readdir rewrite.  There is also a fix for a
  nasty deadlock which surfaced earlier in this merge window.

  Highlights include:
   - Fix an_rpc pipefs regression that causes a deadlock on mount
   - Readdir optimisations by Scott Mayhew and Jeff Layton
   - clean up the rpc_pipefs dentry operation setup"

* tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Fix a deadlock in rpc_client_register()
  rpc_pipe: rpc_dir_inode_operations can be static
  NFS: Allow nfs_updatepage to extend a write under additional circumstances
  NFS: Make nfs_readdir revalidate less often
  NFS: Make nfs_attribute_cache_expired() non-static
  rpc_pipe: set dentry operations at d_alloc time
  nfs: set verifier on existing dentries in nfs_prime_dcache
2013-07-11 12:11:35 -07:00
Camelia Groza 3b8ccd4473 inet: fix spacing in assignment
Found using checkpatch.pl

Signed-off-by: Camelia Groza <camelia.groza@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 12:02:39 -07:00