The napi structs allocated at drivier initializatio need to be
free'ed when releasing the drivers resources.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ibmvnic driver has multiple handlers for resetting the driver
depending on the reason the reset is needed (failover, lpm,
fatal erors,...). All of the reset handlers do essentially the same
thing, this patch moves this work to a common reset handler.
By doing this we also allow the driver to better handle situations
where we can get a reset while handling a reset.
The updated reset handling works by adding a reset work item to the
list of resets and then scheduling work to perform the reset. This
step is necessary because we can receive a reset in interrupt context
and we want to handle the reset out of interrupt context.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the is_closed flag in the ibmvnic adapter strcut with a
more comprehensive state field that tracks the current state of
the driver.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move all of the calls to initialize resources for the driver to
a separate routine.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the final cifsFileInfo_put() is called from cifsiod and an oplock
break work is queued, lockdep complains loudly:
=============================================
[ INFO: possible recursive locking detected ]
4.11.0+ #21 Not tainted
---------------------------------------------
kworker/0:2/78 is trying to acquire lock:
("cifsiod"){++++.+}, at: flush_work+0x215/0x350
but task is already holding lock:
("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock("cifsiod");
lock("cifsiod");
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by kworker/0:2/78:
#0: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
#1: ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0
stack backtrace:
CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ #21
Workqueue: cifsiod cifs_writev_complete
Call Trace:
dump_stack+0x85/0xc2
__lock_acquire+0x17dd/0x2260
? match_held_lock+0x20/0x2b0
? trace_hardirqs_off_caller+0x86/0x130
? mark_lock+0xa6/0x920
lock_acquire+0xcc/0x260
? lock_acquire+0xcc/0x260
? flush_work+0x215/0x350
flush_work+0x236/0x350
? flush_work+0x215/0x350
? destroy_worker+0x170/0x170
__cancel_work_timer+0x17d/0x210
? ___preempt_schedule+0x16/0x18
cancel_work_sync+0x10/0x20
cifsFileInfo_put+0x338/0x7f0
cifs_writedata_release+0x2a/0x40
? cifs_writedata_release+0x2a/0x40
cifs_writev_complete+0x29d/0x850
? preempt_count_sub+0x18/0xd0
process_one_work+0x304/0x8e0
worker_thread+0x9b/0x6a0
kthread+0x1b2/0x200
? process_one_work+0x8e0/0x8e0
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x31/0x40
This is a real warning. Since the oplock is queued on the same
workqueue this can deadlock if there is only one worker thread active
for the workqueue (which will be the case during memory pressure when
the rescuer thread is handling it).
Furthermore, there is at least one other kind of hang possible due to
the oplock break handling if there is only worker. (This can be
reproduced without introducing memory pressure by having passing 1 for
the max_active parameter of cifsiod.) cifs_oplock_break() can wait
indefintely in the filemap_fdatawait() while the cifs_writev_complete()
work is blocked:
sysrq: SysRq : Show Blocked State
task PC stack pid father
kworker/0:1 D 0 16 2 0x00000000
Workqueue: cifsiod cifs_oplock_break
Call Trace:
__schedule+0x562/0xf40
? mark_held_locks+0x4a/0xb0
schedule+0x57/0xe0
io_schedule+0x21/0x50
wait_on_page_bit+0x143/0x190
? add_to_page_cache_lru+0x150/0x150
__filemap_fdatawait_range+0x134/0x190
? do_writepages+0x51/0x70
filemap_fdatawait_range+0x14/0x30
filemap_fdatawait+0x3b/0x40
cifs_oplock_break+0x651/0x710
? preempt_count_sub+0x18/0xd0
process_one_work+0x304/0x8e0
worker_thread+0x9b/0x6a0
kthread+0x1b2/0x200
? process_one_work+0x8e0/0x8e0
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x31/0x40
dd D 0 683 171 0x00000000
Call Trace:
__schedule+0x562/0xf40
? mark_held_locks+0x29/0xb0
schedule+0x57/0xe0
io_schedule+0x21/0x50
wait_on_page_bit+0x143/0x190
? add_to_page_cache_lru+0x150/0x150
__filemap_fdatawait_range+0x134/0x190
? do_writepages+0x51/0x70
filemap_fdatawait_range+0x14/0x30
filemap_fdatawait+0x3b/0x40
filemap_write_and_wait+0x4e/0x70
cifs_flush+0x6a/0xb0
filp_close+0x52/0xa0
__close_fd+0xdc/0x150
SyS_close+0x33/0x60
entry_SYSCALL_64_fastpath+0x1f/0xbe
Showing all locks held in the system:
2 locks held by kworker/0:1/16:
#0: ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0
#1: ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0
Showing busy workqueues and worker pools:
workqueue cifsiod: flags=0xc
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
in-flight: 16:cifs_oplock_break
delayed: cifs_writev_complete, cifs_echo_request
pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3
Fix these problems by creating a a new workqueue (with a rescuer) for
the oplock break work.
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
As with 618763958b, an open directory may have a NULL private_data
pointer prior to readdir. CIFS_ENUMERATE_SNAPSHOTS must check for this
before dereference.
Fixes: 834170c859 ("Enable previous version support")
Signed-off-by: David Disseldorp <ddiss@suse.de>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
The server may respond with success, and an output buffer less than
sizeof(struct smb_snapshot_array) in length. Do not leak the output
buffer in this case.
Fixes: 834170c859 ("Enable previous version support")
Signed-off-by: David Disseldorp <ddiss@suse.de>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
This reverts commit bbd6411513.
I've been sitting on this revert for too long and it unfortunately
missed 4.11. It's also the reason why I haven't merged ring-based
dirty tracking for 4.12.
Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and
kvm_vcpu_write_guest_offset_cached means that the MSR value can
now be used to access SMRAM, simply by making it point to an SMRAM
physical address. This is problematic because it lets the guest
OS overwrite memory that it shouldn't be able to touch.
Cc: stable@vger.kernel.org
Fixes: bbd6411513
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The top level tools/Makefile includes kvm_stat as a target in help, but
the actual target is missing.
Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS/OVS fixes for net
The following patchset contains a rather large batch of Netfilter, IPVS
and OVS fixes for your net tree. This includes fixes for ctnetlink, the
userspace conntrack helper infrastructure, conntrack OVS support,
ebtables DNAT target, several leaks in error path among other. More
specifically, they are:
1) Fix reference count leak in the CT target error path, from Gao Feng.
2) Remove conntrack entry clashing with a matching expectation, patch
from Jarno Rajahalme.
3) Fix bogus EEXIST when registering two different userspace helpers,
from Liping Zhang.
4) Don't leak dummy elements in the new bitmap set type in nf_tables,
from Liping Zhang.
5) Get rid of module autoload from conntrack update path in ctnetlink,
we don't need autoload at this late stage and it is happening with
rcu read lock held which is not good. From Liping Zhang.
6) Fix deadlock due to double-acquire of the expect_lock from conntrack
update path, this fixes a bug that was introduced when the central
spinlock got removed. Again from Liping Zhang.
7) Safe ct->status update from ctnetlink path, from Liping. The expect_lock
protection that was selected when the central spinlock was removed was
not really protecting anything at all.
8) Protect sequence adjustment under ct->lock.
9) Missing socket match with IPv6, from Peter Tirsek.
10) Adjust skb->pkt_type of DNAT'ed frames from ebtables, from
Linus Luessing.
11) Don't give up on evaluating the expression on new entries added via
dynset expression in nf_tables, from Liping Zhang.
12) Use skb_checksum() when mangling icmpv6 in IPv6 NAT as this deals
with non-linear skbuffs.
13) Don't allow IPv6 service in IPVS if no IPv6 support is available,
from Paolo Abeni.
14) Missing mutex release in error path of xt_find_table_lock(), from
Dan Carpenter.
15) Update maintainers files, Netfilter section. Add Florian to the
file, refer to nftables.org and change project status from Supported
to Maintained.
16) Bail out on mismatching extensions in element updates in nf_tables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of the private end_io handlers and data, and just use the
regular block IO path for these requests. This removes a lot of
redundant code.
Signed-off-by: Jens Axboe <axboe@fb.com>
We don't decode the internal tag to the proper group or tag
indx. This works fine because we have hard wired it as 0 for now,
but could break if we get rid of that.
Signed-off-by: Jens Axboe <axboe@fb.com>
If gcc (e.g. 4.1.2) decides not to inline total_extension_size(), the
build will fail with:
net/built-in.o: In function `nf_conntrack_init_start':
(.text+0x9baf6): undefined reference to `__compiletime_assert_1893'
or
ERROR: "__compiletime_assert_1893" [net/netfilter/nf_conntrack.ko] undefined!
Fix this by forcing inlining of total_extension_size().
Fixes: b3a5db109e ("netfilter: conntrack: use u8 for extension sizes again")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 32-bit:
lib/test_bpf.c:4772: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4772: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4773: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4773: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4787: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4787: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4801: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4801: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4802: warning: integer constant is too large for ‘unsigned long’ type
lib/test_bpf.c:4802: warning: integer constant is too large for ‘unsigned long’ type
On 32-bit systems, "long" is only 32-bit.
Replace the "UL" suffix by "ULL" to fix this.
Fixes: 85f68fe898 ("bpf, arm64: implement jiting of BPF_XADD")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for Telit ME910 PID 0x1100.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now tg3 NIC's stats will be cleared after ifdown/ifup. bond_get_stats traverse
its salves to get statistics,cumulative the increment.If a tg3 NIC is added to
bonding as a slave,ifdown/ifup will cause bonding's stats become tremendous value
(ex.1638.3 PiB) because of negative increment.
Fixes: 92feeabf3f ("tg3: Save stats across chip resets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-D__x86_64__ workaround was used to make /usr/include/features.h
to follow expected path through the system include headers.
This is not portable.
Instead define dummy stubs.h which is used by 'clang -target bpf'
Fixes: 6882804c91 ("selftests/bpf: add a test for overlapping packet range checks")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With clang/llvm 4.0+, the test case is able to generate
the following pattern:
....
440: (b7) r1 = 15
441: (05) goto pc+73
515: (79) r6 = *(u64 *)(r10 -152)
516: (bf) r7 = r10
517: (07) r7 += -112
518: (bf) r2 = r7
519: (0f) r2 += r1
520: (71) r1 = *(u8 *)(r8 +0)
521: (73) *(u8 *)(r2 +45) = r1
....
commit 332270fdc8 ("bpf: enhance verifier to understand stack
pointer arithmetic") improved verifier to handle such a pattern.
This patch adds a C test case to actually generate such a pattern.
A dummy tracepoint interface is used to load the program
into the kernel.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Small follow-up to d74a32acd5 ("xdp: use netlink extended ACK reporting")
in order to let drivers all use the same NL_SET_ERR_MSG_MOD() helper macro
for reporting. This also ensures that we consistently add the driver's
prefix for dumping the report in user space to indicate that the error
message is driver specific and not coming from core code. Furthermore,
NL_SET_ERR_MSG_MOD() now reuses NL_SET_ERR_MSG() and thus makes all macros
check the pointer as suggested.
References: https://www.spinics.net/lists/netdev/msg433267.html
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey reported a warning triggered by the rcu code:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 5911 at lib/debugobjects.c:289
debug_print_object+0x175/0x210
ODEBUG: activate active (active state 1) object type: rcu_head hint:
(null)
Modules linked in:
CPU: 1 PID: 5911 Comm: a.out Not tainted 4.11.0-rc8+ #271
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x192/0x22d lib/dump_stack.c:52
__warn+0x19f/0x1e0 kernel/panic.c:549
warn_slowpath_fmt+0xe0/0x120 kernel/panic.c:564
debug_print_object+0x175/0x210 lib/debugobjects.c:286
debug_object_activate+0x574/0x7e0 lib/debugobjects.c:442
debug_rcu_head_queue kernel/rcu/rcu.h:75
__call_rcu.constprop.76+0xff/0x9c0 kernel/rcu/tree.c:3229
call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288
rt6_rcu_free net/ipv6/ip6_fib.c:158
rt6_release+0x1ea/0x290 net/ipv6/ip6_fib.c:188
fib6_del_route net/ipv6/ip6_fib.c:1461
fib6_del+0xa42/0xdc0 net/ipv6/ip6_fib.c:1500
__ip6_del_rt+0x100/0x160 net/ipv6/route.c:2174
ip6_del_rt+0x140/0x1b0 net/ipv6/route.c:2187
__ipv6_ifa_notify+0x269/0x780 net/ipv6/addrconf.c:5520
addrconf_ifdown+0xe60/0x1a20 net/ipv6/addrconf.c:3672
...
Andrey's reproducer program runs in a very tight loop, calling
'unshare -n' and then spawning 2 sets of 14 threads running random ioctl
calls. The relevant networking sequence:
1. New network namespace created via unshare -n
- ip6tnl0 device is created in down state
2. address added to ip6tnl0
- equivalent to ip -6 addr add dev ip6tnl0 fd00::bb/1
- DAD is started on the address and when it completes the host
route is inserted into the FIB
3. ip6tnl0 is brought up
- the new fixup_permanent_addr function restarts DAD on the address
4. exit namespace
- teardown / cleanup sequence starts
- once in a blue moon, lo teardown appears to happen BEFORE teardown
of ip6tunl0
+ down on 'lo' removes the host route from the FIB since the dst->dev
for the route is loobback
+ host route added to rcu callback list
* rcu callback has not run yet, so rt is NOT on the gc list so it has
NOT been marked obsolete
5. in parallel to 4. worker_thread runs addrconf_dad_completed
- DAD on the address on ip6tnl0 completes
- calls ipv6_ifa_notify which inserts the host route
All of that happens very quickly. The result is that a host route that
has been deleted from the IPv6 FIB and added to the RCU list is re-inserted
into the FIB.
The exit namespace eventually gets to cleaning up ip6tnl0 which removes the
host route from the FIB again, calls the rcu function for cleanup -- and
triggers the double rcu trace.
The root cause is duplicate DAD on the address -- steps 2 and 3. Arguably,
DAD should not be started in step 2. The interface is in the down state,
so it can not really send out requests for the address which makes starting
DAD pointless.
Since the second DAD was introduced by a recent change, seems appropriate
to use it for the Fixes tag and have the fixup function only start DAD for
addresses in the PREDAD state which occurs in addrconf_ifdown if the
address is retained.
Big thanks to Andrey for isolating a reliable reproducer for this problem.
Fixes: f1705ec197 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding support for Microchip LAN9250 Ethernet controller.
Signed-off-by: David Cai <david.cai@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer says:
====================
Improve bpf ELF-loader under samples/bpf
This series improves and fixes bpf ELF loader and programs under
samples/bpf. The bpf_load.c created some hard to debug issues when
the struct (bpf_map_def) used in the ELF maps section format changed
in commit fb30d4b712 ("bpf: Add tests for map-in-map").
This was hotfixed in commit 409526bea3c3 ("samples/bpf: bpf_load.c
detect and abort if ELF maps section size is wrong") by detecting the
issue and aborting the program.
In most situations the bpf-loader should be able to handle these kind
of changes to the struct size. This patch series aim to do proper
backward and forward compabilility handling when loading ELF files.
This series also adjust the callback that was introduced in commit
9fd63d05f3 ("bpf: Allow bpf sample programs (*_user.c) to change
bpf_map_def") to use the new bpf_map_data structure, before more users
start to use this callback.
Hoping these changes can make the merge window, as above mentioned
commits have not been merged yet, and it would be good to avoid users
hitting these issues.
====================
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giving *_user.c side tools access to map_data[] provides easier
access to information on the maps being loaded. Still provide
the guarantee that the order maps are being defined in inside the
_kern.c file corresponds with the order in the array. Now user
tools are not blind, but can inspect and verify the maps that got
loaded from the ELF binary.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do this change before others start to use this callback.
Change map_perf_test_user.c which seems to be the only user.
This patch extends capabilities of commit 9fd63d05f3 ("bpf:
Allow bpf sample programs (*_user.c) to change bpf_map_def").
Give fixup callback access to struct bpf_map_data, instead of
only stuct bpf_map_def. This add flexibility to allow userspace
to reassign the map file descriptor. This is very useful when
wanting to share maps between several bpf programs.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does proper parsing of the ELF "maps" section, in-order to
be both backwards and forwards compatible with changes to the map
definition struct bpf_map_def, which gets compiled into the ELF file.
The assumption is that new features with value zero, means that they
are not in-use. For backward compatibility where loading an ELF file
with a smaller struct bpf_map_def, only copy objects ELF size, leaving
rest of loaders struct zero. For forward compatibility where ELF file
have a larger struct bpf_map_def, only copy loaders own struct size
and verify that rest of the larger struct is zero, assuming this means
the newer feature was not activated, thus it should be safe for this
older loader to load this newer ELF file.
Fixes: fb30d4b712 ("bpf: Add tests for map-in-map")
Fixes: 409526bea3c3 ("samples/bpf: bpf_load.c detect and abort if ELF maps section size is wrong")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Needed to adjust max locked memory RLIMIT_MEMLOCK for testing these bpf samples
as these are using more and larger maps than can fit in distro default 64Kbytes limit.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc warns that an empty device tree would cause undefined behavior:
drivers/of/unittest.c: In function 'of_unittest':
drivers/of/unittest.c:2199:25: warning: 'last_sibling' may be used uninitialized in this function [-Wmaybe-uninitialized]
This adds an initialization of the variable to zero, which we handle
correctly.
Fixes: 81d0848fc8 ("of: Add unit tests for applying overlays")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Power9/ISAv3 has no VRMASD field in LPCR, we shouldn't be setting reserved bits,
so don't set them on Power9.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 616badd2fb ("powerpc/powernv: Use OPAL call for TCE kill on
NVLink2") forced all TCE kills to go via the OPAL call for
NVLink2. However the PHB3 implementation of TCE kill was still being
called directly from some functions which in some circumstances caused
a machine check.
This patch adds an equivalent IODA2 version of the function which uses
the correct invalidation method depending on PHB model and changes all
external callers to use it instead.
Fixes: 616badd2fb ("powerpc/powernv: Use OPAL call for TCE kill on NVLink2")
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the radix TLB code includes support for CPUs that do *not*
have MMU_FTR_LOCKLESS_TLBIE. On those CPUs we are required to take a
global spinlock before issuing a tlbie.
Radix can only be built for 64-bit Book3s CPUs, and of those, only
POWER4, 970, Cell and PA6T do not have MMU_FTR_LOCKLESS_TLBIE. Although
it's possible to build a kernel with Radix support that can also boot on
those CPUs, we happen to know that in reality none of those CPUs support
the Radix MMU, so the code can never actually run on those CPUs.
So remove the native_tlbie_lock in the Radix TLB code.
Note that there is another lock of the same name in the hash code, which
is unaffected by this patch.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If no NLM_F_EXCL is set and the element already exists in the set, make
sure that both elements have the same extensions.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Commit 84d582d236 ("xen: Revert commits da72ff5bfc and
72a9b186292d") defined xen_have_vector_callback in enlighten_hvm.c.
Since guest-type-neutral code refers to this variable this causes
build failures when CONFIG_XEN_PVHVM is not defined.
Moving xen_have_vector_callback definition to enlighten.c resolves
this issue.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
machine_check_early() gets called in real mode. The very first time when
add_taint() is called, it prints a warning which ends up calling opal
call (that uses OPAL_CALL wrapper) for writing it to console. If we get a
very first machine check while we are in opal we are doomed. OPAL_CALL
overwrites the PACASAVEDMSR in r13 and in this case when we are done with
MCE handling the original opal call will use this new MSR on it's way
back to opal_return. This usually leads to unexpected behaviour or the
kernel to panic. Instead move the add_taint() call later in the virtual
mode where it is safe to call.
This is broken with current FW level. We got lucky so far for not getting
very first MCE hit while in OPAL. But easily reproducible on Mambo.
Fixes: 27ea2c420c ("powerpc: Set the correct kernel taint on machine check errors.")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The entire body of unregister_cpu_online() is inside an #ifdef
CONFIG_HOTPLUG_CPU block. This is ugly and means we create an empty function
when hotplug is disabled for no reason.
Instead move the #ifdef out of the function body and define the function to be
NULL in the else case. This means we'll pass NULL to cpuhp_setup_state(), but
that's fine because it accepts NULL to mean there is no teardown callback, which
is exactly what we want.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This code was until recently completely undocumented and even now the comment is
not very verbose.
We've already had one patch sent to remove the IRQ enable/disable because it's
"paradoxical and unnecessary". So document it thoroughly to save anyone else
from puzzling over it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Otherwise we might select it when its dependenices aren't enabled,
leading to a build break.
It's default y anyway, so will be on unless someone disables it
manually.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
pnv_eeh_reset() has special handling for PEs whose primary bus is the
root bus or the bus immediately underneath the root port.
The cxl bi-modal card support added in b0b5e5918a ("cxl: Add
cxl_check_and_switch_mode() API to switch bi-modal cards") relies on
this behaviour when hot-resetting the CAPI adapter following a mode
switch. Document this in pnv_eeh_reset() so we don't accidentally break
it.
Suggested-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In order to shrink size of struct discard_cmd, change variable type of
@state in struct discard_cmd from int to unsigned char.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, with protection of cmd_lock, we will wait for end io of
discard command which potentially may lead long latency, making worse
concurrency.
So, in this patch, we try to add reference into discard entry to prevent
the entry being released by other thread, then we can avoid holding
global cmd_lock during waiting discard to finish.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
F2FS uses 4 bytes to represent block address. As a result, supported
size of disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments.
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If a page is cold, NOT atomit written and need_ipu now, there is
a high probability that IPU should be adapted. For IPU, we try to
check extent tree to get the block index first, instead of reading
the dnode page, where may lead to an useless dnode IO, since no need to
update the dnode index for IPU.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces encrypt_one_page which encrypts one data page before
submit_bio, and change the use of need_inplace_update.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull livepatch updates from Jiri Kosina:
- a per-task consistency model is being added for architectures that
support reliable stack dumping (extending this, currently rather
trivial set, is currently in the works).
This extends the nature of the types of patches that can be applied
by live patching infrastructure. The code stems from the design
proposal made [1] back in November 2014. It's a hybrid of SUSE's
kGraft and RH's kpatch, combining advantages of both: it uses
kGraft's per-task consistency and syscall barrier switching combined
with kpatch's stack trace switching. There are also a number of
fallback options which make it quite flexible.
Most of the heavy lifting done by Josh Poimboeuf with help from
Miroslav Benes and Petr Mladek
[1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz
- module load time patch optimization from Zhou Chengming
- a few assorted small fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: add missing printk newlines
livepatch: Cancel transition a safe way for immediate patches
livepatch: Reduce the time of finding module symbols
livepatch: make klp_mutex proper part of API
livepatch: allow removal of a disabled patch
livepatch: add /proc/<pid>/patch_state
livepatch: change to a per-task consistency model
livepatch: store function sizes
livepatch: use kstrtobool() in enabled_store()
livepatch: move patching functions into patch.c
livepatch: remove unnecessary object loaded check
livepatch: separate enabled and patched states
livepatch/s390: add TIF_PATCH_PENDING thread flag
livepatch/s390: reorganize TIF thread flag bits
livepatch/powerpc: add TIF_PATCH_PENDING thread flag
livepatch/x86: add TIF_PATCH_PENDING thread flag
livepatch: create temporary klp_update_patch_state() stub
x86/entry: define _TIF_ALLWORK_MASK flags explicitly
stacktrace/x86: add function for detecting reliable stack traces
Pull HID subsystem updates from Jiri Kosina:
- The need for HID_QUIRK_NO_INIT_REPORTS per-device quirk has been
growing dramatically during past years, so the time has come to
switch over the default, and perform the pro-active reading only in
cases where it's really needed (multitouch, wacom).
The only place where this behavior is (in some form) preserved is
hiddev so that we don't introduce userspace-visible change of
behavior.
From Benjamin Tissoires
- HID++ support for power_supply / baterry reporting.
From Benjamin Tissoires and Bastien Nocera
- Vast improvements / rework of DS3 and DS4 in Sony driver.
From Roderick Colenbrander
- Improvment (in terms of getting closer to the Microsoft's
interpretation of slightly ambiguous specification) of logical range
interpretation in case null-state is set in the rdesc.
From Valtteri Heikkilä and Tomasz Kramkowski
- A lot of newly supported device IDs and small assorted fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (71 commits)
HID: usbhid: Add HID_QUIRK_NOGET for Aten CS-1758 KVM switch
HID: asus: support backlight on USB keyboards
HID: wacom: Move wacom_remote_irq and wacom_remote_status_irq
HID: wacom: generic: sync pad events only for actual packets
HID: sony: remove redundant check for -ve err
HID: sony: Make sure to unregister sensors on failure
HID: sony: Make DS4 bt poll interval adjustable
HID: sony: Set proper bit flags on DS4 output report
HID: sony: DS4 use brighter LED colors
HID: sony: Improve navigation controller axis/button mapping
HID: sony: Use DS3 MAC address as unique identifier on USB
HID: logitech-hidpp: add a sysfs file to tell we support power_supply
HID: logitech-hidpp: enable HID++ 1.0 battery reporting
HID: logitech-hidpp: add support for battery status for the K750
HID: logitech-hidpp: battery: provide CAPACITY_LEVEL
HID: logitech-hidpp: rename battery level into capacity
HID: logitech-hidpp: battery: provide ONLINE property
HID: logitech-hidpp: notify battery on connect
HID: logitech-hidpp: return an error if the queried feature is not present
HID: logitech-hidpp: create the battery for all types of HID++ devices
...