Updates the references to spec documents throughout the code, taking into
account that
* the DCCP, CCID 2, and CCID 3 drafts all became RFCs in March this year
* RFC 1063 was obsoleted by RFC 1191
* draft-ietf-tcpimpl-pmtud-0x.txt was published as an Informational
RFC, RFC 2923 on 2000-09-22.
All references verified.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible
to a fragmentation attack causing false negatives on extension header matches.
When extension headers occur in the non-first fragment after the fragment
header (possibly with an incorrect nexthdr value in the fragment header)
a rule looking for this extension header will never match.
Drop fragments that are at offset 0 and don't contain the final protocol
header regardless of the ruleset, since this should not happen normally.
Since all extension headers are before the protocol header this makes sure
an extension header is either not present or in the first fragment, where
we can properly parse it.
With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible
to a fragmentation attack causing false negatives on protocol matches.
When the protocol header doesn't follow the fragment header immediately,
the fragment header contains the protocol number of the next extension
header. When the extension header and the protocol header are sent in
a second fragment a rule like "ip6tables .. -p udp -j DROP" will never
match.
Drop fragments that are at offset 0 and don't contain the final protocol
header regardless of the ruleset, since this should not happen normally.
With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm_state_num needs to be increased for XFRM_STATE_ACQ states created
by xfrm_state_find() to prevent the counter from going negative when
the state is destroyed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
memcpy 4 bytes to address of auto unsigned long variable followed
by comparison with u32 is a bloody bad idea.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The networking emulator can queue SKBs for a very long
time, so if you're using netem on the sender side for
large bandwidth/delay product testing, the SKB socket
send queue sizes become artificially larger.
Correct this by calling skb_orphan() in netem_enqueue().
Signed-off-by: David S. Miller <davem@davemloft.net>
Based upon a patch from Jesper Juhl. Try to match the
TCP IPv6 code this was copied from as much as possible,
so that it's easy to see where to add the ipv6 pktoptions
support code.
Signed-off-by: David S. Miller <davem@davemloft.net>
I think I got the cause for the Oops observed in
http://www.mail-archive.com/dccp@vger.kernel.org/msg00578.html
The problem is always with applications listening on PF_INET6 sockets. Apart
from the mentioned oops, I observed another one one, triggered at irregular
intervals via timer interrupt:
run_timer_softirq -> dccp_keepalive_timer
-> inet_csk_reqsk_queue_prune
-> reqsk_free
-> dccp_v6_reqsk_destructor
The latter function is the problem and is also the last function to be called
in said kernel panic.
In any case, there is a real problem with allocating the right request_sock
which is what this patch tackles.
It fixes the following problem:
- application listens on PF_INET6
- DCCPv4 packet comes in, is handed over to dccp_v4_do_rcv, from there
to dccp_v4_conn_request
Now: socket is PF_INET6, packet is IPv4. The following code then furnishes the
connection with IPv6 - request_sock operations:
req = reqsk_alloc(sk->sk_prot->rsk_prot);
The first problem is that all further incoming packets will get a Reset since
the connection can not be looked up.
The second problem is worse:
--> reqsk_alloc is called instead of inet6_reqsk_alloc
--> consequently inet6_rsk_offset is never set (dangling pointer)
--> the request_sock_ops are nevertheless still dccp6_request_ops
--> destructor is called via reqsk_free
--> dccp_v6_reqsk_destructor tries to free random memory location (inet6_rsk_offset not set)
--> panic
I have tested this for a while, DCCP sockets are now handled correctly in all
three scenarios (v4/v6 only/v4-mapped).
Commiter note: I've added the dccp_request_sock_ops forward declaration to keep
the tree building and to reduce the size of the patch for 2.6.19,
later I'll move the functions to the top of the affected source
code to match what we have in the TCP counterpart, where this
problem hasn't existed in the first place, dumb me not to have
done the same thing on DCCP land 8)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (36 commits)
[Bluetooth] Fix HID disconnect NULL pointer dereference
[Bluetooth] Add missing entry for Nokia DTL-4 PCMCIA card
[Bluetooth] Add support for newer ANYCOM USB dongles
[NET]: Can use __get_cpu_var() instead of per_cpu() in loopback driver.
[IPV4] inet_peer: Group together avl_left, avl_right, v4daddr to speedup lookups on some CPUS
[TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err()
[NETFILTER]: Missing check for CAP_NET_ADMIN in iptables compat layer
[NETPOLL]: initialize skb for UDP
[IPV6]: Fix route.c warnings when multiple tables are disabled.
[TG3]: Bump driver version and release date.
[TG3]: Add lower bound checks for tx ring size.
[TG3]: Fix set ring params tx ring size implementation
[NET]: reduce per cpu ram used for loopback stats
[IPv6] route: Fix prohibit and blackhole routing decision
[DECNET]: Fix input routing bug
[TCP]: Bound TSO defer time
[IPv4] fib: Remove unused fib_config members
[IPV6]: Always copy rt->u.dst.error when copying a rt6_info.
[IPV6]: Make IPV6_SUBTREES depend on IPV6_MULTIPLE_TABLES.
[IPV6]: Clean up BACKTRACK().
...
This patch is suitable for just about any 2.6 kernel. It should go in
2.6.19 and 2.6.18.2 and possible even the .17 and .16 stable series.
This is a long standing bug that seems to have only recently become
apparent, presumably due to increasing use of NFS over TCP - many
distros seem to be making it the default.
The SK_CONN bit gets set when a listening socket may be ready
for an accept, just as SK_DATA is set when data may be available.
It is entirely possible for svc_tcp_accept to be called with neither
of these set. It doesn't happen often but there is a small race in
svc_sock_enqueue as SK_CONN and SK_DATA are tested outside the
spin_lock. They could be cleared immediately after the test and
before the lock is gained.
This normally shouldn't be a problem. The sockets are non-blocking so
trying to read() or accept() when ther is nothing to do is not a problem.
However: svc_tcp_recvfrom makes the decision "Should I accept() or
should I read()" based on whether SK_CONN is set or not. This usually
works but is not safe. The decision should be based on whether it is
a TCP_LISTEN socket or a TCP_CONNECTED socket.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: <stable@kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Yes, this actually passed tests the way it was.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When submitting a request to a fast portmapper (such as the local rpcbind
daemon), the request can complete before the parent task is even queued up on
xprt->binding. Fix this by queuing before submitting the rpcbind request.
Test plan:
Connectathon locking test with UDP.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The latest HID disconnect sequence change introduced a NULL pointer
dereference. For the quirk to handle buggy remote HID implementations,
it is enough to wait for a potential control channel disconnect from
the remote side and it is also enough to wait only 500 msecs.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
I believe this NET_INC_STATS() call can be replaced by
NET_INC_STATS_BH(), a little bit cheaper.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 32bit compatibility layer has no CAP_NET_ADMIN check in
compat_do_ipt_get_ctl, which for example allows to list the current
iptables rules even without having that capability (the non-compat
version requires it). Other capabilities might be required to exploit
the bug (eg. CAP_NET_RAW to get the nfnetlink socket?), so a plain user
can't exploit it, but a setup actually using the posix capability system
might very well hit such a constellation of granted capabilities.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to fully initialize skb to keep lower layers and queueing happy.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
WE-21 changed the ABI for the SIOC[SG]IW{ESSID,NICKN} ioctls by dropping
NULL termination. This patch adds compatibility code so that WE-21 can
work properly with WE-20 (and older) tools.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Lookups resolving to ip6_blk_hole_entry must result in silently
discarding the packets whereas an ip6_pkt_prohibit_entry is
supposed to cause an ICMPV6_ADM_PROHIBITED message to be sent.
Thanks to Kim Nordlund <kim.nordlund@nokia.com> for noticing
this bug.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a silly bug that has been in the input routing code
for some time. It results in trying to send to a node directly when
the origin of the packet is via the default router.
Its been tested by Alan Kemmerer <alan.kemmerer@mittalsteel.com> who
reported the bug and its a fairly obvious fix for a typo.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch limits the amount of time you will defer sending a TSO segment
to less than two clock ticks, or the time between two acks, whichever is
longer.
On slow links, deferring causes significant bursts. See attached plots,
which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue
for (a) non-TSO, (b) currnet TSO, and (c) patched TSO. This burstiness
causes significant jitter, tends to overflow queues early (bad for short
queues), and makes delay-based congestion control more difficult.
Deferring by a couple clock ticks I believe will have a relatively small
impact on performance.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
As IPV6_SUBTREES can't work without IPV6_MULTIPLE_TABLES have IPV6_SUBTREES
depend on it.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fn check is unnecessary as fn can never be NULL in BACKTRACK().
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
As ip6_route_output() never returns NULL, error checking must be done by
looking at dst->error in stead of comparing dst against NULL.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch causes TIPC to return an error message when it receives
an unrecognized configuration command. (Previously, the sender
received no feedback.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows a TIPC application to cancel an existing
topology service subscription by re-requesting the subscription
with the TIPC_SUB_CANCEL filter bit set. (All other bits of
the cancel request must match the original subscription request.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a minor bug that prevents "tipc-config -l" from
displaying the multicast link if a TIPC node has never successfully
established at least one unicast link.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects an issue wherein a previouly failed node could
not reestablish a links to a non-failing node in the TIPC network
until the latter node detected the link failure itself (which might
be configured to take up to 30 seconds). The non-failing node now
responds to link setup requests from a previously failed node in at
most 1 second, allowing it to detect the link failure more quickly.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch tivially re-orders the entries in TIPC's list of local
publications so that applications will receive publication events
in the order they were published.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enhances TIPC's Ethernet support to include VLAN interfaces.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows the compiler to optimize out any code that tries to
send debugging output to the null print buffer (TIPC_NULL), a capability
that was unintentionally broken during the recent print buffer rework.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a simple test so TIPC doesn't try waking up processes
waiting on a socket if there are none waiting.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TIPC now rejects and logs link setup requests from node <Z.C.N> if the
receiving node already has a functional link to that node on the associated
interface, or if the requestor is using the same <Z.C.N> as the receiver.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stream socket send code was not initializing some required fields
of the temporary msghdr structure it was utilizing; this is now fixed.
A check has also been added to detect if a user illegally specifies
a destination address when sending on an established stream connection.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change modifies TIPC's print buffer code as follows:
1) Now supports small print buffers (min. size reduced from 512 bytes to 64)
2) Now uses TIPC_NULL print buffer structure to indicate null device
instead of NULL pointer (this simplified error handling)
3) Fixed misuse of console buffer structure by tipc_dump()
4) Added and corrected comments in various places
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible for the ->fopen callback from lockd into nfsd to find that an
answer cannot be given straight away (an upcall is needed) and so the request
has to be 'dropped', to be retried later. That error status is not currently
propagated back.
So:
Change nlm_fopen to return nlm error codes (rather than a private
protocol) and define a new nlm_drop_reply code.
Cause nlm_drop_reply to cause the rpc request to get rpc_drop_reply
when this error comes back.
Cause svc_process to drop a request which returns a status of
rpc_drop_reply.
[akpm@osdl.org: fix warning storm]
Cc: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make net_random() more widely available by calling it random32
akpm: hopefully this will permit the removal of carta_random32. That needs
confirmation from Stephane - this code looks somewhat more computationally
expensive, and has a different (ie: callee-stateful) interface.
[akpm@osdl.org: lots of build fixes, cleanups]
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a slab corruption in ieee80211softmac_auth(). The size of a buffer
was miscomputed.
see http://bugzilla.kernel.org/show_bug.cgi?id=7245
Acked-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Laurent Riffard <laurent.riffard@free.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This fixes some race conditions in the WirelessExtension
handling and association handling code.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The bt_proto array needs to be protected by some kind of locking to
prevent a race condition between bt_sock_create and bt_sock_register.
And in addition all calls to sk_alloc need to be made GFP_ATOMIC now.
Signed-off-by: Masatake YAMATO <jet@gyve.org>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
If the DLC device is no longer attached to the TTY device, then it
makes no sense to go through with changing the termios settings.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When the connection lookup for the device structure fails, the reference
count for the HCI device needs to be decremented.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth HID specification demands that the interrupt channel
shall be disconnected first. This is needed to pass the qualification
tests.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Most Bluetooth chips don't support concurrent connect requests, because
this would involve a multiple baseband page with only one radio. In the
case an upper layer like L2CAP requests a concurrent connect these chips
return the error "Command Disallowed" for the second request. If this
happens it the responsibility of the Bluetooth core to queue the request
and try again after the previous connect attempt has been completed.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth subsystem currently uses a platform device for devices
with no parent. It is a better idea to use the new virtual devices
tree for these.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Some return values of the driver core register and create functions
are not handled and so might cause unexpected problems.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There exists no attempt do deal with the fact that a structure with
a uint32_t followed by a pointer is going to be different for 32-bit
and 64-bit userspace. Any 32-bit process trying to use it will be
failing with -EFAULT if it's lucky; suffering from having data dumped
at a random address if it's not.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This is missing the MODULE_LICENSE statements and taints the kernel
upon loading. License is obvious from the beginning of the file.
Signed-off-by: Jan Dittmer <jdi@l4x.org>
Signed-off-by: Joerg Roedel <joro-lkml@zlug.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes rt6_lookup() to provide the source address in the flow
and sets RT6_LOOKUP_F_HAS_SADDR whenever it is present in
the flow.
Avoids unnecessary prefix comparisons by checking for a prefix
length first.
Fixes the rule logic to not match packets if a source selector
has been specified but no source address is available.
Thanks to Kim Nordlund <kim.nordlund@nokia.com> for working
on this patch with me.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Missing counter bump when hashing in a new ACQ
xfrm_state.
Now that we have two spots to do the hash grow
check, break it out into a helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
The CIPSO passthrough mapping had a problem when sending categories which
would cause no or incorrect categories to be sent on the wire with a packet.
This patch fixes the problem which was a simple off-by-one bug.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Fix several places in the CIPSO code where it was dereferencing fields which
did not have valid pointers by moving those pointer dereferences into code
blocks where the pointers are valid.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Flush the forwarding table when carrier is lost. This helps for
availability because we don't want to forward to a downed device and
new packets may come in on other links.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove (compilation-breaking) debugging messages introduced at early
development stage.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
CONNSECMARK needs conntrack, add missing dependency to fix linking error
with CONNSECMARK=y and CONNTRACK=m.
Reported by Toralf Förster <toralf.foerster@gmx.de>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even though the tos field is only a single byte large, the values need to
be converted to net-endian for the checkum update so they are in the
corrent byte position. Also fix incorrect endian annotations.
Reported by Stephane Chazelas <Stephane_Chazelas@yahoo.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use rb_first() to get first entry in rb tree.
Signed-off-by: Akinbou Mita <akinobu.mita@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb is the netlink query, nskb is the reply message.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
They are not necessarily initialized to zero by the compiler,
for example when using run-time initializers of automatic
on-stack variables.
Noticed by Eric Dumazet and Patrick McHardy.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the changes to net/ipv6/addrconf.c to remove sit
specific code if the sit driver is not selected.
Signed-off-by: Joerg Roedel <joro-lkml@zlug.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the driver of the IPv6-in-IPv4 tunnel driver (sit)
from the IPv6 module. It adds an option to Kconfig which makes it
possible to compile it as a seperate module.
Signed-off-by: Joerg Roedel <joro-lkml@zlug.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If more than one file descriptor was sent with an SCM_RIGHTS message,
and on the receiving end, after installing a nonzero (but not all)
file descritpors the process runs out of fds, then the already
installed fds will be lost (userspace will have no way of knowing
about them).
The following patch makes sure, that at least the already installed
fds are sent to userspace. It doesn't solve the issue of losing file
descriptors in case of an EFAULT on the userspace buffer.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Show the true receive buffer usage.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb. The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This treats the security errors encountered in the case of
socket policy matching, the same as how these are treated in
the case of main/sub policies, which is to return a full lookup
failure.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently when an IPSec policy rule doesn't specify a security
context, it is assumed to be "unlabeled" by SELinux, and so
the IPSec policy rule fails to match to a flow that it would
otherwise match to, unless one has explicitly added an SELinux
policy rule allowing the flow to "polmatch" to the "unlabeled"
IPSec policy rules. In the absence of such an explicitly added
SELinux policy rule, the IPSec policy rule fails to match and
so the packet(s) flow in clear text without the otherwise applicable
xfrm(s) applied.
The above SELinux behavior violates the SELinux security notion of
"deny by default" which should actually translate to "encrypt by
default" in the above case.
This was first reported by Evgeniy Polyakov and the way James Morris
was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.
With this patch applied, SELinux "polmatching" of flows Vs. IPSec
policy rules will only come into play when there's a explicit context
specified for the IPSec policy rule (which also means there's corresponding
SELinux policy allowing appropriate domains/flows to polmatch to this context).
Secondly, when a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return errors other than access denied,
such as -EINVAL. We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.
The solution for this is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.
Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).
This patch: Fix the selinux side of things.
This makes sure SELinux polmatching of flow contexts to IPSec policy
rules comes into play only when an explicit context is associated
with the IPSec policy rule.
Also, this no longer defaults the context of a socket policy to
the context of the socket since the "no explicit context" case
is now handled properly.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
When a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return an access denied permission
(or other error). We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.
The way I was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.
The first SYNACK would be blocked, because of an uncached lookup via
flow_cache_lookup(), which would fail to resolve an xfrm policy because
the SELinux policy is checked at that point via the resolver.
However, retransmitted SYNACKs would then find a cached flow entry when
calling into flow_cache_lookup() with a null xfrm policy, which is
interpreted by xfrm_lookup() as the packet not having any associated
policy and similarly to the first case, allowing it to pass without
transformation.
The solution presented here is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.
Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).
Signed-off-by: James Morris <jmorris@namei.org>
Testing revealed a problem with the NetLabel cache where a cached entry could
be freed while in use by the LSM layer causing an oops and other problems.
This patch fixes that problem by introducing a reference counter to the cache
entry so that it is only freed when it is no longer in use.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
This annotation makes it possible to assign a subclass on lock init. This
annotation is meant to reduce the _nested() annotations by assigning a
default subclass.
One could do without this annotation and rely on lockdep_set_class()
exclusively, but that would require a manual stack of struct lock_class_key
objects.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is some confusion about the meaning of 'bufsz' for a sunrpc server.
In some cases it is the largest message that can be sent or received. In
other cases it is the largest 'payload' that can be included in a NFS
message.
In either case, it is not possible for both the request and the reply to be
this large. One of the request or reply may only be one page long, which
fits nicely with NFS.
So we remove 'bufsz' and replace it with two numbers: 'max_payload' and
'max_mesg'. Max_payload is the size that the server requests. It is used
by the server to check the max size allowed on a particular connection:
depending on the protocol a lower limit might be used.
max_mesg is the largest single message that can be sent or received. It is
calculated as the max_payload, rounded up to a multiple of PAGE_SIZE, and
with PAGE_SIZE added to overhead. Only one of the request and reply may be
this size. The other must be at most one page.
Cc: Greg Banks <gnb@sgi.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davej/configh:
Remove all inclusions of <linux/config.h>
Manually resolved trivial path conflicts due to removed files in
the sound/oss/ subdirectory.
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[XFRM]: BEET mode
[TCP]: Kill warning in tcp_clean_rtx_queue().
[NET_SCHED]: Remove old estimator implementation
[ATM]: [zatm] always *pcr in alloc_shaper()
[ATM]: [ambassador] Change the return type to reflect reality
[ATM]: kmalloc to kzalloc patches for drivers/atm
[TIPC]: fix printk warning
[XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect.
[XFRM] STATE: Use destination address for src hash.
[NEIGH]: always use hash_mask under tbl lock
[UDP]: Fix MSG_PROBE crash
[UDP6]: Fix flowi clobbering
[NET_SCHED]: Revert "HTB: fix incorrect use of RB_EMPTY_NODE"
[NETFILTER]: ebt_mark: add or/and/xor action support to mark target
[NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
[NETFILTER]: Honour source routing for LVS-NAT
[NETFILTER]: add type parameter to ip_route_me_harder
[NETFILTER]: Kconfig: fix xt_physdev dependencies
The rpc reply has multiple levels of error returns. The code here contributes
to the confusion by using "accept_statp" for a pointer to what the rfc (and
wireshark, etc.) refer to as the "reply_stat". (The confusion is compounded
by the fact that the rfc also has an "accept_stat" which follows the
reply_stat in the succesful case.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the request is denied after gss_accept was called, we shouldn't try to wrap
the reply. We were checking the accept_stat but not the reply_stat.
To check the reply_stat in _release, we need a pointer to before (rather than
after) the verifier, so modify body_start appropriately.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Factor out some common code from the integrity and privacy cases.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The NFSACL patches introduced support for multiple RPC services listening on
the same transport. However, only the first of these services was registered
with portmapper. This was perfectly fine for nfsacl, as you traditionally do
not want these to show up in a portmapper listing.
The patch below changes the default behavior to always register all services
listening on a given transport, but retains the old behavior for nfsacl
services.
Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>