WSL2-Linux-Kernel/net/ipv4
Neil Horman 16fcec35e7 [NETFILTER]: Fix/improve deadlock condition on module removal netfilter
So I've had a deadlock reported to me.  I've found that the sequence of
events goes like this:

1) process A (modprobe) runs to remove ip_tables.ko

2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
increasing the ip_tables socket_ops use count

3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
in the kernel, which in turn executes the ip_tables module cleanup routine,
which calls nf_unregister_sockopt

4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
calling process into uninterruptible sleep, expecting the process using the
socket option code to wake it up when it exits the kernel

5) the user of the socket option code (process B), in do_ipt_get_ctl, calls
ipt_find_table_lock, which in this case calls request_module to load
ip_tables_nat.ko

6) request_module forks a copy of modprobe (process C) to load the module and
blocks until modprobe exits.

7) Process C, forked by request_module, processes the dependencies of
ip_tables_nat.ko, of which ip_tables.ko is one.

8) Process C attempts to lock the requested module and all its dependencies; it
blocks when it attempts to lock ip_tables.ko (which was previously locked in
step 3)
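
The sequence above forms a wait-for cycle: process A sleeps until process B
leaves the socket option code, process B blocks until process C exits, and
process C blocks on the file lock held by process A.  A minimal user-space
sketch in C of that cycle, assuming each process waits on at most one other.
The names has_wait_cycle, check_wait_scenarios, PROC_A..PROC_C and RUNNABLE are
invented for illustration and are not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three processes in the report. */
enum { PROC_A, PROC_B, PROC_C, NPROC };
#define RUNNABLE (-1)

/*
 * waits_for[p] is the process that p is blocked on, or RUNNABLE.
 * Follows the chain from 'start' and reports whether it leads back to
 * 'start'.  A cycle that does not include 'start' is not reported.
 */
bool has_wait_cycle(const int waits_for[], int n, int start)
{
	int cur = start;

	for (int hops = 0; hops < n; hops++) {
		cur = waits_for[cur];
		if (cur == RUNNABLE)
			return false;
		if (cur == start)
			return true;
	}
	return false;
}

/* Checks the reported scenario and a variant in which A never sleeps. */
int check_wait_scenarios(void)
{
	/* A waits on B (use count), B on C (modprobe exit), C on A (file lock). */
	const int reported[NPROC] = { PROC_B, PROC_C, PROC_A };
	/* Same edges, except that A fails fast instead of sleeping. */
	const int no_sleep[NPROC] = { RUNNABLE, PROC_C, PROC_A };
	bool cycle_reported = has_wait_cycle(reported, NPROC, PROC_A);
	bool cycle_no_sleep = has_wait_cycle(no_sleep, NPROC, PROC_A);

	assert(cycle_reported);
	assert(!cycle_no_sleep);
	return 0;
}
```

Removing any one edge breaks the cycle; the second array models the case where
process A does not go to sleep in step 4.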

There's not really any great permanent solution to this that I can see, but I've
developed a two-part solution that corrects the problem:

Part 1) Modifies the nf_sockopt registration code so that, instead of using a
use counter internal to the nf_sockopt_ops structure, we use a pointer to the
registering module's owner to do module reference counting when nf_sockopt
calls a module's set/get routine.  This prevents the deadlock by preventing
step 4 from happening.
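
The idea in Part 1 can be modelled in user space.  The sketch below is not the
patch itself: toy_module, toy_sockopt_ops and every toy_* function are
hypothetical stand-ins, loosely modelled on the kernel's try_module_get() and
module_put() helpers, and it assumes removal fails with "busy" while a
reference is held rather than sleeping:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BUSY (-1)	/* module still has users */
#define TOY_GONE (-2)	/* module is being removed */

/* Hypothetical stand-in for the kernel's struct module. */
struct toy_module {
	int refcnt;	/* callers currently inside the module's code */
	bool going;	/* set once removal has started */
};

/* Hypothetical stand-in for nf_sockopt_ops with an owner pointer. */
struct toy_sockopt_ops {
	struct toy_module *owner;
	int (*set)(int optval);
};

/* Models try_module_get(): no new references once removal has started. */
bool toy_module_get(struct toy_module *m)
{
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

void toy_module_put(struct toy_module *m)
{
	m->refcnt--;
}

/* Models removal that refuses to proceed while references are held. */
int toy_module_remove(struct toy_module *m)
{
	if (m->refcnt > 0)
		return TOY_BUSY;
	m->going = true;
	return 0;
}

/* Pins the owning module for the duration of the handler call. */
int toy_nf_setsockopt(struct toy_sockopt_ops *ops, int optval)
{
	int ret;

	if (!toy_module_get(ops->owner))
		return TOY_GONE;
	ret = ops->set(optval);
	toy_module_put(ops->owner);
	return ret;
}

struct toy_module toy_ip_tables = { 0, false };
int removal_seen_by_handler;

/* Example handler: records what a remover would see while it runs. */
int toy_set(int optval)
{
	removal_seen_by_handler = toy_module_remove(&toy_ip_tables);
	return optval;
}

int toy_demo(void)
{
	struct toy_sockopt_ops ops = { &toy_ip_tables, toy_set };
	int during = toy_nf_setsockopt(&ops, 7);
	int removed = toy_module_remove(&toy_ip_tables);
	int after = toy_nf_setsockopt(&ops, 7);

	assert(during == 7 && removal_seen_by_handler == TOY_BUSY);
	assert(removed == 0 && after == TOY_GONE);
	return 0;
}
```

In this model a removal attempt made while the handler runs returns at once
with TOY_BUSY, so nothing sleeps waiting for the socket option user to finish.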

Part 2) Enhances the modprobe utility so that by default it performs
non-blocking remove operations (the same way rmmod does), and adds an option to
explicitly request blocking operation.  So if you select blocking operation in
modprobe you can still cause the above deadlock, but only if you explicitly try
(and since root can do any old stupid thing it would like....  :)  ).
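
As a rough illustration of Part 2, the sketch below shows how a removal tool
might choose between non-blocking and blocking removal through the flags
argument of delete_module(2).  It is not the modprobe change itself: the helper
names removal_flags and remove_module_by_name and the "wait" parameter are
assumptions made for this example, and it assumes a kernel of that era, where
omitting O_NONBLOCK made the call wait for the use count to drop:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a "wait" option to delete_module(2) flags.
 * The default (wait == false) asks the kernel to fail immediately if the
 * module is in use; only an explicit request clears O_NONBLOCK.
 */
unsigned int removal_flags(bool wait)
{
	return wait ? 0u : (unsigned int)O_NONBLOCK;
}

/*
 * Hypothetical wrapper over the delete_module system call.  Needs
 * CAP_SYS_MODULE; returns 0 on success or -1 with errno set.
 */
long remove_module_by_name(const char *name, bool wait)
{
	assert(name != NULL);
	return syscall(SYS_delete_module, name, removal_flags(wait));
}
```

A caller would use remove_module_by_name("ip_tables", false) for the default
behaviour and pass true only when blocking removal was explicitly requested.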

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-11 11:28:26 +02:00
..
ipvs [NETFILTER]: Fix/improve deadlock condition on module removal netfilter 2007-09-11 11:28:26 +02:00
netfilter [NETFILTER]: Fix/improve deadlock condition on module removal netfilter 2007-09-11 11:28:26 +02:00
Kconfig [IPV4]: The scheduled removal of multipath cached routing support. 2007-07-10 22:05:57 -07:00
Makefile [IPV4]: The scheduled removal of multipath cached routing support. 2007-07-10 22:05:57 -07:00
af_inet.c [TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). 2007-08-02 19:42:28 -07:00
ah4.c [IPSEC] AH4: Update IPv4 options handling to conform to RFC 4302. 2007-08-26 18:35:33 -07:00
arp.c [IPV4]: Cleanup call to __neigh_lookup() 2007-07-14 20:51:44 -07:00
cipso_ipv4.c [CIPSO]: Fix several unaligned kernel accesses in the CIPSO engine. 2007-06-08 13:33:10 -07:00
datagram.c [IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP 2007-06-03 18:08:50 -07:00
devinet.c [IPV4] devinet: show all addresses assigned to interface 2007-09-11 10:41:04 +02:00
esp4.c [XFRM]: Add module alias for transformation type. 2007-07-10 22:15:43 -07:00
fib_frontend.c [NET] IPV4: Fix whitespace errors. 2007-07-19 10:43:47 +09:00
fib_hash.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
fib_lookup.h [RTNETLINK]: Fix sending netlink message when replace route. 2007-05-24 16:36:53 -07:00
fib_rules.c [NETLINK]: Mark netlink policies const 2007-06-07 13:40:10 -07:00
fib_semantics.c [IPV4]: The scheduled removal of multipath cached routing support. 2007-07-10 22:05:57 -07:00
fib_trie.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
icmp.c [ICMP]: Fix icmp_errors_use_inbound_ifaddr sysctl 2007-06-03 18:08:51 -07:00
igmp.c [IPV4]: Convert IPv4 devconf to an array 2007-06-07 13:39:13 -07:00
inet_connection_sock.c [TCP]: Use default 32768-61000 outgoing port range in all cases. 2007-06-03 18:08:43 -07:00
inet_diag.c [NETLINK]: Switch cb_lock spinlock to mutex and allow to override it 2007-04-25 22:29:03 -07:00
inet_hashtables.c [NET] IPV4: Fix whitespace errors. 2007-02-10 23:19:39 -08:00
inet_timewait_sock.c [INET_SOCK]: make net/ipv4/inet_timewait_sock.c:__inet_twsk_kill() static 2007-07-14 19:00:59 -07:00
inetpeer.c [IPV4]: Fix inetpeer gcc-4.2 warnings 2007-07-20 19:39:17 -07:00
ip_forward.c [NET] IPV4: Fix whitespace errors. 2007-07-19 10:43:47 +09:00
ip_fragment.c [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph 2007-04-25 22:25:10 -07:00
ip_gre.c [NET]: Avoid copying writable clones in tunnel drivers 2007-07-10 22:19:05 -07:00
ip_input.c [IPV4] SNMP: Support InMcastPkts and InBcastPkts 2007-04-30 00:58:29 -07:00
ip_options.c [IPV4] ip_options.c: kmalloc + memset conversion to kzalloc 2007-07-31 14:06:45 -07:00
ip_output.c [IPV4]: Clean up duplicate includes in net/ipv4/ 2007-08-13 22:52:02 -07:00
ip_sockglue.c [NET]: Fix IP_ADD/DROP_MEMBERSHIP to handle only connectionless 2007-08-26 18:35:35 -07:00
ipcomp.c [XFRM]: Add module alias for transformation type. 2007-07-10 22:15:43 -07:00
ipconfig.c [IPCONFIG]: ip_auto_config fix 2007-08-13 22:51:59 -07:00
ipip.c [NET]: Avoid copying writable clones in tunnel drivers 2007-07-10 22:19:05 -07:00
ipmr.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
netfilter.c [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph 2007-04-25 22:25:10 -07:00
proc.c [IPV4]: Convert IPv4 devconf to an array 2007-06-07 13:39:13 -07:00
protocol.c [IPV4]: align inet_protos[] on SMP 2007-04-25 22:28:20 -07:00
raw.c [IPV4] raw.c: kmalloc + memset conversion to kzalloc 2007-08-02 19:42:26 -07:00
route.c [IPV4] route.c: mostly kmalloc + memset conversion to k[cz]alloc 2007-08-02 19:42:27 -07:00
syncookies.c [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th 2007-04-25 22:25:26 -07:00
sysctl_net_ipv4.c [IPV4]: Convert IPv4 devconf to an array 2007-06-07 13:39:13 -07:00
tcp.c [TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). 2007-08-02 19:42:28 -07:00
tcp_bic.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_cong.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_cubic.c [TCP]: cubic - eliminate use of receive time stamp 2007-07-31 02:27:58 -07:00
tcp_diag.c
tcp_highspeed.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_htcp.c [TCP]: H-TCP maxRTT estimation at startup 2007-08-07 18:29:05 -07:00
tcp_hybla.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_illinois.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_input.c [TCP]: 'dst' can be NULL in tcp_rto_min() 2007-08-31 14:39:44 -07:00
tcp_ipv4.c [TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). 2007-08-02 19:42:28 -07:00
tcp_lp.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_minisocks.c [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th 2007-04-25 22:25:26 -07:00
tcp_output.c [NET] IPV4: Fix whitespace errors. 2007-07-19 10:43:47 +09:00
tcp_probe.c jprobes: remove JPROBE_ENTRY() 2007-07-19 10:04:44 -07:00
tcp_scalable.c [TCP]: remove unused argument to cong_avoid op 2007-07-18 01:46:58 -07:00
tcp_timer.c [TCP]: Use LIMIT_NETDEBUG in tcp_retransmit_timer(). 2007-06-07 13:40:08 -07:00
tcp_vegas.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_vegas.h [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_veno.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_westwood.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tcp_yeah.c [TCP]: congestion control API pass RTT in microseconds 2007-07-31 02:27:57 -07:00
tunnel4.c [IPSEC]: Changing API of xfrm4_tunnel_register. 2007-02-13 12:54:47 -08:00
udp.c [UDP]: Fix length check. 2007-07-10 23:06:43 -07:00
udp_impl.h [UDP]: Revert 2-pass hashing changes. 2007-06-07 13:40:50 -07:00
udplite.c [UDP]: Revert 2-pass hashing changes. 2007-06-07 13:40:50 -07:00
xfrm4_input.c [UDP]: Cleanup UDP encapsulation code 2007-07-10 22:16:53 -07:00
xfrm4_mode_beet.c [XFRM]: beet: fix worst case header_len calculation 2007-04-25 22:28:39 -07:00
xfrm4_mode_transport.c [SK_BUFF]: unions of just one member don't get anything done, kill them 2007-04-25 22:26:20 -07:00
xfrm4_mode_tunnel.c [IPSEC]: Fix panic when using inter address familiy IPsec on loopback. 2007-05-31 01:23:28 -07:00
xfrm4_output.c [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph 2007-04-25 22:25:10 -07:00
xfrm4_policy.c [NET]: cleanup extra semicolons 2007-04-25 22:29:24 -07:00
xfrm4_state.c [IPSEC]: exporting xfrm_state_afinfo 2007-02-08 12:39:00 -08:00
xfrm4_tunnel.c [XFRM]: Add module alias for transformation type. 2007-07-10 22:15:43 -07:00