Conflicts:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
Minor conflict between the BCM_CNIC define removal in net-next
and a bug fix added to net. Based upon a conflict resolution
patch posted by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
As suggested by Eric, we could introduce a helper function
for ipv6 too, to avoid checking if rt is NULL before
dst_release().
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)
can be replaced by
#if IS_ENABLED(CONFIG_FOO)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WARNING: net/ipv6/netfilter/nf_defrag_ipv6.o(.text+0xe0): Section mismatch in
reference from the function nf_ct_net_init() to the function
.init.text:nf_ct_frag6_sysctl_register()
The function nf_ct_net_init() references the function
__init nf_ct_frag6_sysctl_register().
In case nf_conntrack_ipv6 is compiled as a module, nf_ct_net_init could be
called after the init code and data are unloaded. Therefore remove the
"__net_init" annotation from nf_ct_frag6_sysctl_register().
Signed-off-by: Hein Tibosch <hein_tibosch@yahoo.es>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ICMP tuples have id in src and type/code in dst.
So comparing src.u.all with dst.u.all will always fail here
and ip_xfrm_me_harder() is called for every ICMP packet,
even if there was no NAT.
Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
fix copy-paste error introduced in linux-next commit
"ipv6: add a new namespace for nf_conntrack_reasm"
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Combine more modules since the actual code is so small anyway that the
kmod metadata and the module in its loaded state totally outweighs the
combined actual code size.
IP_NF_TARGET_REDIRECT becomes a compat option; IP6_NF_TARGET_REDIRECT
is completely eliminated since it has not see a release yet.
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Combine more modules since the actual code is so small anyway that the
kmod metadata and the module in its loaded state totally outweighs the
combined actual code size.
IP_NF_TARGET_NETMAP becomes a compat option; IP6_NF_TARGET_NETMAP
is completely eliminated since it has not see a release yet.
Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* NF_NAT_IPV6 requires IP6_NF_IPTABLES
* IP6_NF_TARGET_MASQUERADE, IP6_NF_TARGET_NETMAP, IP6_NF_TARGET_REDIRECT
and IP6_NF_TARGET_NPT require NF_NAT_IPV6.
This change just mirrors what IPv4 does in Kconfig, for consistency.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two years ago, Shan Wei tried to fix this:
http://patchwork.ozlabs.org/patch/43905/
The problem is that RFC2460 requires an ICMP Time
Exceeded -- Fragment Reassembly Time Exceeded message should be
sent to the source of that fragment, if the defragmentation
times out.
"
If insufficient fragments are received to complete reassembly of a
packet within 60 seconds of the reception of the first-arriving
fragment of that packet, reassembly of that packet must be
abandoned and all the fragments that have been received for that
packet must be discarded. If the first fragment (i.e., the one
with a Fragment Offset of zero) has been received, an ICMP Time
Exceeded -- Fragment Reassembly Time Exceeded message should be
sent to the source of that fragment.
"
As Herbert suggested, we could actually use the standard IPv6
reassembly code which follows RFC2460.
With this patch applied, I can see ICMP Time Exceeded sent
from the receiver when the sender sent out 3/4 fragmented
IPv6 UDP packet.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As pointed by Michal, it is necessary to add a new
namespace for nf_conntrack_reasm code, this prepares
for the second patch.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes this build error:
net/ipv6/netfilter/nf_nat_l3proto_ipv6.c: In function 'nf_nat_ipv6_csum_recalc':
net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:144:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
ICMPv6 error messages are tracked by extracting the conntrack tuple of
the inner packet and looking up the corresponding conntrack entry. Tuple
extraction uses the ->get_l4proto() callback, which in case of fragments
returns NEXTHDR_FRAGMENT instead of the upper protocol, even for the
first fragment when the entire next header is present, resulting in a
failure to find the correct connection tracking entry.
This patch changes ipv6_get_l4proto() to use ipv6_skip_exthdr() instead
of nf_ct_ipv6_skip_exthdr() in order to skip fragment headers when the
fragment offset is zero.
Signed-off-by: Patrick McHardy <kaber@trash.net>
The IPv6 conntrack fragmentation currently has a couple of shortcomings.
Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the
defragmented packet is then passed to conntrack, the resulting conntrack
information is attached to each original fragment and the fragments then
continue their way through the stack.
Helper invocation occurs in the POSTROUTING hook, at which point only
the original fragments are available. The result of this is that
fragmented packets are never passed to helpers.
This patch improves the situation in the following way:
- If a reassembled packet belongs to a connection that has a helper
assigned, the reassembled packet is passed through the stack instead
of the original fragments.
- During defragmentation, the largest received fragment size is stored.
On output, the packet is refragmented if required. If the largest
received fragment size exceeds the outgoing MTU, a "packet too big"
message is generated, thus behaving as if the original fragments
were passed through the stack from an outside point of view.
- The ipv6_helper() hook function can't receive fragments anymore for
connections using a helper, so it is switched to use ipv6_skip_exthdr()
instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
reassembled packets are passed to connection tracking helpers.
The result of this is that we can properly track fragmented packets, but
still generate ICMPv6 Packet too big messages if we would have before.
This patch is also required as a precondition for IPv6 NAT, where NAT
helpers might enlarge packets up to a point that they require
fragmentation. In that case we can't generate Packet too big messages
since the proper MTU can't be calculated in all cases (f.i. when
changing textual representation of a variable amount of addresses),
so the packet is transparently fragmented iff the original packet or
fragments would have fit the outgoing MTU.
IPVS parts by Jesper Dangaard Brouer <brouer@redhat.com>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
This quiets the coccinelle warnings:
net/bridge/netfilter/ebtable_filter.c:107:1-3: WARNING: PTR_RET can be used
net/bridge/netfilter/ebtable_nat.c:107:1-3: WARNING: PTR_RET can be used
net/ipv6/netfilter/ip6table_filter.c:65:1-3: WARNING: PTR_RET can be used
net/ipv6/netfilter/ip6table_mangle.c💯1-3: WARNING: PTR_RET can be used
net/ipv6/netfilter/ip6table_raw.c:44:1-3: WARNING: PTR_RET can be used
net/ipv6/netfilter/ip6table_security.c:62:1-3: WARNING: PTR_RET can be used
net/ipv4/netfilter/iptable_filter.c:72:1-3: WARNING: PTR_RET can be used
net/ipv4/netfilter/iptable_mangle.c:107:1-3: WARNING: PTR_RET can be used
net/ipv4/netfilter/iptable_raw.c:51:1-3: WARNING: PTR_RET can be used
net/ipv4/netfilter/iptable_security.c:70:1-3: WARNING: PTR_RET can be used
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch generalizes nf_ct_l4proto_net by splitting it into chunks and
moving the corresponding protocol part to where it really belongs to.
To clarify, note that we follow two different approaches to support per-net
depending if it's built-in or run-time loadable protocol tracker.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Split sysctl function into smaller chucks to cleanup code and prepare
patches to reduce ifdef pollution.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
l4proto->init contain quite redundant code. We can simplify this
by adding a new parameter l3proto.
This patch prepares that code simplification.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
There are good reasons to supports helpers in user-space instead:
* Rapid connection tracking helper development, as developing code
in user-space is usually faster.
* Reliability: A buggy helper does not crash the kernel. Moreover,
we can monitor the helper process and restart it in case of problems.
* Security: Avoid complex string matching and mangling in kernel-space
running in privileged mode. Going further, we can even think about
running user-space helpers as a non-root process.
* Extensibility: It allows the development of very specific helpers (most
likely non-standard proprietary protocols) that are very likely not to be
accepted for mainline inclusion in the form of kernel-space connection
tracking helpers.
This patch adds the infrastructure to allow the implementation of
user-space conntrack helpers by means of the new nfnetlink subsystem
`nfnetlink_cthelper' and the existing queueing infrastructure
(nfnetlink_queue).
I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
two pieces. This change is required not to break NAT sequence
adjustment and conntrack confirmation for traffic that is enqueued
to our user-space conntrack helpers.
Basic operation, in a few steps:
1) Register user-space helper by means of `nfct':
nfct helper add ftp inet tcp
[ It must be a valid existing helper supported by conntrack-tools ]
2) Add rules to enable the FTP user-space helper which is
used to track traffic going to TCP port 21.
For locally generated packets:
iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
For non-locally generated packets:
iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
3) Run the test conntrackd in helper mode (see example files under
doc/helper/conntrackd.conf
conntrackd
4) Generate FTP traffic going, if everything is OK, then conntrackd
should create expectations (you can check that with `conntrack':
conntrack -E expect
[NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
[DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
This confirms that our test helper is receiving packets including the
conntrack information, and adding expectations in kernel-space.
The user-space helper can also store its private tracking information
in the conntrack structure in the kernel via the CTA_HELP_INFO. The
kernel will consider this a binary blob whose layout is unknown. This
information will be included in the information that is transfered
to user-space via glue code that integrates nfnetlink_queue and
ctnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for cttimeout.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since the sysctl data for l[3|4]proto now resides in pernet nf_proto_net.
We can now remove this unused fields from struct nf_contrack_l[3,4]proto.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for IPv6 protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for ICMPv6 protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 3 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_l3proto_[un]register_sysctl.
* nf_conntrack_l3proto_[un]register.
We add a new nf_ct_l3proto_net is used to get the pernet data of l3proto.
This adds rhe new struct nf_ip_net that is used to store the sysctl header
and l3proto_ipv4,l4proto_tcp(6),l4proto_udp(6),l4proto_icmp(v6) because the
protos such tcp and tcp6 use the same data,so making nf_ip_net as a field
of netns_ct is the easiest way to manager it.
This patch also adds init_net to struct nf_conntrack_l3proto to initial
the layer 3 protocol pernet data.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 4 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_[un]register_sysctl
* nf_conntrack_l4proto_[un]register
to include the namespace parameter. We still use init_net in this patch
to prepare the ground for follow-up patches for each layer 4 protocol
tracker.
We add a new net_id field to struct nf_conntrack_l4proto that is used
to store the pernet_operations id for each layer 4 protocol tracker.
Note that AF_INET6's protocols do not need to do sysctl compat. Thus,
we only register compat sysctl when l4proto.l3proto != AF_INET6.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Standardize the net core ratelimited logging functions.
Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the flags parameter to ipv6_find_hdr. This flags
allows us to:
* know if this is a fragment.
* stop at the AH header, so the information contained in that header
can be used for some specific packet handling.
This patch also adds the offset parameter for inspection of one
inner IPv6 header that is contained in error messages.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch removes ip_queue support which was marked as obsolete
years ago. The nfnetlink_queue modules provides more advanced
user-space packet queueing mechanism.
This patch also removes capability code included in SELinux that
refers to ip_queue. Otherwise, we break compilation.
Several warning has been sent regarding this to the mailing list
in the past month without anyone rising the hand to stop this
with some strong argument.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This results in code with less boiler plate that is a bit easier
to read.
Additionally stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes it clearer which sysctls are relative to your current network
namespace.
This makes it a little less error prone by not exposing sysctls for the
initial network namespace in other namespaces.
This is the same way we handle all of our other network interfaces to
userspace and I can't honestly remember why we didn't do this for
sysctls right from the start.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We may hit this in xt_LOG:
net/built-in.o:xt_LOG.c:function dump_ipv6_packet:
error: undefined reference to 'ip6t_ext_hdr'
happens with these config options:
CONFIG_NETFILTER_XT_TARGET_LOG=y
CONFIG_IP6_NF_IPTABLES=m
ip6t_ext_hdr is fairly small and it is called in the packet path.
Make it static inline.
Reported-by: Simon Kirby <sim@netnation.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.
Signed-off-by: David S. Miller <davem@davemloft.net>
It used to be an int, and it got changed to a bool parameter at least
7 years ago. It happens that NF_ACCEPT and NF_DROP are 0 and 1, so
this works, but it's unclear, and the check that it's in range is not
required.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the infrastructure to add fine timeout tuning
over nfnetlink. Now you can use the NFNL_SUBSYS_CTNETLINK_TIMEOUT
subsystem to create/delete/dump timeout objects that contain some
specific timeout policy for one flow.
The follow up patches will allow you attach timeout policy object
to conntrack via the CT target and the conntrack extension
infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch defines a new interface for l4 protocol trackers:
unsigned int *(*get_timeouts)(struct net *net);
that is used to return the array of unsigned int that contains
the timeouts that will be applied for this flow. This is passed
to the l4proto->new(...) and l4proto->packet(...) functions to
specify the timeout policy.
This interface allows per-net global timeout configuration
(although only DCCP supports this by now) and it will allow
custom custom timeout configuration by means of follow-up
patches.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ipt_LOG and ip6_LOG have a lot of common code, merge them
to reduce duplicate code.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()
Manually fix up a semantic mis-merge wrt security_netlink_recv():
- the interface was removed in commit fd77846152 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")
- a new user of it appeared in commit a38f7907b9 ("crypto: Add
userspace configuration API")
causing no automatic merge conflict, but Eric Paris pointed out the
issue.
Once upon a time netlink was not sync and we had to get the effective
capabilities from the skb that was being received. Today we instead get
the capabilities from the current task. This has rendered the entire
purpose of the hook moot as it is now functionally equivalent to the
capable() call.
Signed-off-by: Eric Paris <eparis@redhat.com>
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
(Thanks to Joe Perches for suggesting coccinelle for 0/1 -> true/false).
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is not merged with the ipv4 match into xt_rpfilter.c
to avoid ipv6 module dependency issues.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
While parsing through IPv6 extension headers, fragment headers are
skipped making them invisible to the caller. This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so whether it is
a first or last fragment.
Signed-off-by: Jesse Gross <jesse@nicira.com>