With newer chip versions ASPM-related issues seem to occur only if
L1.2 is enabled. I have a test system with RTL8168h that gives a
number of rx_missed errors when running iperf and L1.2 is enabled.
With L1.2 disabled (and L1 + L1.1 active) everything is fine.
See also [0]. Can't test this, but L1 + L1.1 being active should be
sufficient to reach higher package power saving states.
[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1942830
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/36feb8c4-a0b6-422a-899c-e61f2e869dfe@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet says:
====================
net: better packing of global vars
First two patches avoid holes in data section,
and last patch makes sure some siphash keys are contained
in a single cache line.
====================
Link: https://lore.kernel.org/r/20211115172303.3732746-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
siphash keys use 16 bytes.
Define siphash_aligned_key_t macro so that we can make sure they
are not crossing a cache line boundary.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Same rationale than prior patch : using the dedicated
section avoid holes and pack all these bool values.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
.data.once contains nicely packed bool variables.
It is used already by DO_ONCE_LITE().
Using it also in DO_ONCE() removes holes in .data section.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bp->sriov_cfg is not defined when CONFIG_BNXT_SRIOV is not set. Fix
it by adding a helper function bnxt_sriov_cfg() to handle the logic
with or without the config option.
Fixes: 46d08f55d2 ("bnxt_en: extend RTNL to VF check in devlink driver_reinit")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1637090770-22835-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* bad dont-reorder check
* throughput LED trigger for various new(ish) paths
* radiotap header generation
* locking assertions in mac80211 with monitor mode
* radio statistics
* don't try to access IV when not present
* call stop_ap for P2P_GO as well as we should
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmGT1iIACgkQB8qZga/f
l8QDhA/+JeKZeXQ+Ct8kh34xMN4VxCUeGlJNY1aH2mTlJFSwxqJ4XH7eCiSTezXB
ZEZ46wDLOpUF9epGl4NLzFkR5+Sanpn3LJOMLBDYQwkl3xcvNiQEZ0y4T8PayJRR
kSs9ina3zxvZvBjCYX1KKdFatZiE+X5qwYyaK8T4ZjNQ4vjQpeX/V3eFpFeMDjuX
fzIUN27yiESZG4heB2BkyeTR/jb4hG/gCbdGfNxa9MbF0OGJMJkuwH9MNHRQR2cx
NvEPoq2IcooZfbqWnVk9lxWzRpm8Buio+gth9Knd13AGDl4Mez4+n0KxAEhafjhE
ljUYekjh+55iSkt7eczW7wjvxmdGC+lfGN7GUop9mw4LU6NvpFCKXW5MqaH8XBy4
UIkcblXsMKkD8sFbMGzNIl7pT4JGdXkxlorxg7xI5s02nMFppnWJ7mrXQmcBkBG7
5JLkaXahiQzQYq/VGepCPMYYwKQqbHXts9TPMq/W3aIeR8QxA3uINZ5P9eGF9qn8
xTcvEQPYvIid1BIorO5/0V/VgE9VNDP1w8aKvn8bJjOmbWfuE68Wx+K6LlHy+E+O
/MhjiRwnCvvTfdi6vPUjTxis8Xh5e9xfYmcqyRVIQujLmhta31FEJ/q+R+A1A4Rs
RsesqpLgEOo3rkSciAAfItmRf3SVgJlIuZcfe16XFw0s45oYEck=
=aD/1
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-net-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Couple of fixes:
* bad dont-reorder check
* throughput LED trigger for various new(ish) paths
* radiotap header generation
* locking assertions in mac80211 with monitor mode
* radio statistics
* don't try to access IV when not present
* call stop_ap for P2P_GO as well as we should
* tag 'mac80211-for-net-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
mac80211: fix throughput LED trigger
mac80211: fix monitor_sdata RCU/locking assertions
mac80211: drop check for DONT_REORDER in __ieee80211_select_queue
mac80211: fix radiotap header generation
mac80211: do not access the IV when it was stripped
nl80211: fix radio statistics in survey dump
cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
====================
Link: https://lore.kernel.org/r/20211116160845.157214-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2021-11-16
We've added 12 non-merge commits during the last 5 day(s) which contain
a total of 23 files changed, 573 insertions(+), 73 deletions(-).
The main changes are:
1) Fix pruning regression where verifier went overly conservative rejecting
previsouly accepted programs, from Alexei Starovoitov and Lorenz Bauer.
2) Fix verifier TOCTOU bug when using read-only map's values as constant
scalars during verification, from Daniel Borkmann.
3) Fix a crash due to a double free in XSK's buffer pool, from Magnus Karlsson.
4) Fix libbpf regression when cross-building runqslower, from Jean-Philippe Brucker.
5) Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_*() helpers in tracing
programs due to deadlock possibilities, from Dmitrii Banshchikov.
6) Fix checksum validation in sockmap's udp_read_sock() callback, from Cong Wang.
7) Various BPF sample fixes such as XDP stats in xdp_sample_user, from Alexander Lobakin.
8) Fix libbpf gen_loader error handling wrt fd cleanup, from Kumar Kartikeya Dwivedi.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
udp: Validate checksum in udp_read_sock()
bpf: Fix toctou on read-only map's constant scalar tracking
samples/bpf: Fix build error due to -isystem removal
selftests/bpf: Add tests for restricted helpers
bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs
libbpf: Perform map fd cleanup for gen_loader in case of error
samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu
tools/runqslower: Fix cross-build
samples/bpf: Fix summary per-sec stats in xdp_sample_user
selftests/bpf: Check map in map pruning
bpf: Fix inner map state pruning regression.
xsk: Fix crash on double free in buffer pool
====================
Link: https://lore.kernel.org/r/20211116141134.6490-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On regular ConnectX HCAs getting encap mode isn't supported when the
E-Switch is in NONE mode. Current code would return no error code when
trying to get encap mode in such case which is wrong.
Fix by returning error value to indicate failure to caller in such case.
Fixes: 8e0aa4bc95 ("net/mlx5: E-switch, Protect eswitch mode changes")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Currently, In NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE events
handling, tracking is not fully completed if the LAG device is not ready
at the time the events occur. But, we must keep track of the upper and
lower states after receiving the events because RoCE needs this info in
mlx5_lag_get_roce_netdev() - in order to return the corresponding port
that its running on. Returning the wrong (not most recent) port will lead
to gids table being incorrect.
For example: If during the attachment of a slave to the bond, the other
non-attached port performs pci_reload, then the LAG device is not ready,
but that should not result in dismissing attached slave tracker update
automatically (which is performed in mlx5_handle_changelowerstate()), Since
these events might not come later, which can lead to both bond ports
having tx_enabled=0 - which is not a valid state of LAG bond.
Fixes: 9b412cc35f ("net/mlx5e: Add LAG warning if bond slave is not lag master")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
CT clear action offload adds additional mod hdr actions to the
flow's original mod actions in order to clear the registers which
hold ct_state.
When such flow also includes encap action, a neigh update event
can cause the driver to unoffload the flow and then reoffload it.
Each time this happens, the ct clear handling adds that same set
of mod hdr actions to reset ct_state until the max of mod hdr
actions is reached.
Also the driver never releases the allocated mod hdr actions and
causing a memleak.
Fix above two issues by moving CT clear mod acts allocation
into the parsing actions phase and only use it when offloading the rule.
The release of mod acts will be done in the normal flow_put().
backtrace:
[<000000007316e2f3>] krealloc+0x83/0xd0
[<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core]
[<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core]
[<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core]
[<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core]
[<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core]
[<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core]
[<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core]
[<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core]
[<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core]
Fixes: 1ef3018f5a ("net/mlx5e: CT: Support clear action")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When doing a flow counters bulk query, the number of counters to query
must be aligned to 4. Current SF bulk query len is not aligned to 4,
which leads to an error when trying to query more than 4 counters.
Fix it by aligning SF bulk query len to 4.
Fixes: 2fdeb4f4c2 ("net/mlx5: Reduce flow counters bulk query buffer size for SFs")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
A user can enable VFs without changing E-Switch mode, this can happen
when a user moves straight to switchdev mode and only once in switchdev
VFs are enabled via the sysfs interface.
The cited commit assumed this isn't possible and exposed a single
API function where the E-switch calls into the lag code, breaks the lag
and prevents any other lag operations to take place until the
E-switch update has ended.
Breaking the hardware lag when it isn't needed can make it such that
hardware lag can't be enabled again.
In the sysfs call path check if the current E-Switch mode is NONE,
in the context of the function it can only mean the E-Switch is moving
out of NONE mode and the hardware lag should be disabled and enabled
once the mode change has ended. If the mode isn't NONE it means
VFs are about to be enabled and such operation doesn't require
toggling the hardware lag.
Fixes: cac1eb2cf2 ("net/mlx5: Lag, properly lock eswitch if needed")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The existing loop doesn't cast the buffer while scanning it, which
results in out-of-bounds read and failure to create the matcher.
Fixes: 941f19798a ("net/mlx5: DR, Add check for unsupported fields in match param")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When querying eswitch manager vport capabilities as "other = 1",
we encounter a FW compatibility issue with older FW versions.
To maintain backward compatibility, eswitch manager vport should
be queried as "other = 0" vport both for ECPF and non-ECPF cases.
This patch fixes these queries and improves the code readability
by handling eswitch manager and uplink vports separately, avoiding
the excessive 'if' conditions. Also, uplink caps are stored similar
to esw manager and not as part of xarray.
Fixes: dd4acb2a09 ("net/mlx5: DR, Add missing query for vport 0")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
E-Switch encap mode is relevant only when in switchdev mode.
The RDMA driver can query the encap configuration via
mlx5_eswitch_get_encap_mode(). Make sure it returns the currently
used mode and not the set one.
This reverts the cited commit which reset the encap mode
on entering switchdev and fixes the original issue properly.
Fixes: 9a64144d68 ("net/mlx5: E-Switch, Fix default encap mode")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Function mlx5e_take_tmp_flow() skips flows with zero reference count. This
can cause syndrome 0x179e84 when the called from neigh or route update code
and the skipped flow is not removed from the hardware by the time
underlying encap/decap resource is deleted. Add new completion
'del_hw_done' that is completed when flow is unoffloaded. This is safe to
do because flow with reference count zero needs to be detached from
encap/decap entry before its memory is deallocated, which requires taking
the encap_tbl_lock mutex that is held by the event handlers code.
Fixes: 8914add2c9 ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
For the TLS RX resync flow, we maintain a list of TLS contexts
that require some attention, to communicate their resync information
to the HW.
Here we fix list corruptions, by protecting the entries against
movements coming from resync_handle_seq_match(), until their resync
handling in napi is fully completed.
Fixes: e9ce991bce ("net/mlx5e: kTLS, Add resiliency to RX resync failures")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
- Cleanups for the perf test infrastructure and mapping hugepages
- Avoid contention on mmap_sem when the guests start to run
- Add event channel upcall support to xen_shinfo_test
The v2 balance ioctl has been introduced more than 9 years ago. Users of
the old v1 ioctl should have long been migrated to it. It's time we
deprecate it and eventually remove it.
The only known user is in btrfs-progs that tries v1 as a fallback in
case v2 is not supported. This is not necessary anymore.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The bitfields have_csum and io_error are currently signed which is not
recommended as the representation is an implementation defined
behaviour. Fix this by making the bit-fields unsigned ints.
Fixes: 2c36395430 ("btrfs: scrub: remove the anonymous structure from scrub_page")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When a disk has write caching disabled, we skip submission of a bio with
flush and sync requests before writing the superblock, since it's not
needed. However when the integrity checker is enabled, this results in
reports that there are metadata blocks referred by a superblock that
were not properly flushed. So don't skip the bio submission only when
the integrity checker is enabled for the sake of simplicity, since this
is a debug tool and not meant for use in non-debug builds.
fstests/btrfs/220 trigger a check-integrity warning like the following
when CONFIG_BTRFS_FS_CHECK_INTEGRITY=y and the disk with WCE=0.
btrfs: attempt to write superblock which references block M @5242880 (sdb2/5242880/0) which is not flushed out of disk's write cache (block flush_gen=1, dev->flush_gen=0)!
------------[ cut here ]------------
WARNING: CPU: 28 PID: 843680 at fs/btrfs/check-integrity.c:2196 btrfsic_process_written_superblock+0x22a/0x2a0 [btrfs]
CPU: 28 PID: 843680 Comm: umount Not tainted 5.15.0-0.rc5.39.el8.x86_64 #1
Hardware name: Dell Inc. Precision T7610/0NK70N, BIOS A18 09/11/2019
RIP: 0010:btrfsic_process_written_superblock+0x22a/0x2a0 [btrfs]
RSP: 0018:ffffb642afb47940 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: 00000000ffffffff RSI: ffff8b722fc97d00 RDI: ffff8b722fc97d00
RBP: ffff8b5601c00000 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ffffb642afb476f8 R12: ffffffffffffffff
R13: ffffb642afb47974 R14: ffff8b5499254c00 R15: 0000000000000003
FS: 00007f00a06d4080(0000) GS:ffff8b722fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff5cff5ff0 CR3: 00000001c0c2a006 CR4: 00000000001706e0
Call Trace:
btrfsic_process_written_block+0x2f7/0x850 [btrfs]
__btrfsic_submit_bio.part.19+0x310/0x330 [btrfs]
? bio_associate_blkg_from_css+0xa4/0x2c0
btrfsic_submit_bio+0x18/0x30 [btrfs]
write_dev_supers+0x81/0x2a0 [btrfs]
? find_get_pages_range_tag+0x219/0x280
? pagevec_lookup_range_tag+0x24/0x30
? __filemap_fdatawait_range+0x6d/0xf0
? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e
? find_first_extent_bit+0x9b/0x160 [btrfs]
? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e
write_all_supers+0x1b3/0xa70 [btrfs]
? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e
btrfs_commit_transaction+0x59d/0xac0 [btrfs]
close_ctree+0x11d/0x339 [btrfs]
generic_shutdown_super+0x71/0x110
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0xb8/0x140
task_work_run+0x6d/0xb0
exit_to_user_mode_prepare+0x1f0/0x200
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x46/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f009f711dfb
RSP: 002b:00007fff5cff7928 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 000055b68c6c9970 RCX: 00007f009f711dfb
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055b68c6c9b50
RBP: 0000000000000000 R08: 000055b68c6ca900 R09: 00007f009f795580
R10: 0000000000000000 R11: 0000000000000246 R12: 000055b68c6c9b50
R13: 00007f00a04bf184 R14: 0000000000000000 R15: 00000000ffffffff
---[ end trace 2c4b82abcef9eec4 ]---
S-65536(sdb2/65536/1)
-->
M-1064960(sdb2/1064960/1)
Reviewed-by: Filipe Manana <fdmanana@gmail.com>
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Ordered work functions aren't guaranteed to be handled by the same thread
which executed the normal work functions. The only way execution between
normal/ordered functions is synchronized is via the WORK_DONE_BIT,
unfortunately the used bitops don't guarantee any ordering whatsoever.
This manifested as seemingly inexplicable crashes on ARM64, where
async_chunk::inode is seen as non-null in async_cow_submit which causes
submit_compressed_extents to be called and crash occurs because
async_chunk::inode suddenly became NULL. The call trace was similar to:
pc : submit_compressed_extents+0x38/0x3d0
lr : async_cow_submit+0x50/0xd0
sp : ffff800015d4bc20
<registers omitted for brevity>
Call trace:
submit_compressed_extents+0x38/0x3d0
async_cow_submit+0x50/0xd0
run_ordered_work+0xc8/0x280
btrfs_work_helper+0x98/0x250
process_one_work+0x1f0/0x4ac
worker_thread+0x188/0x504
kthread+0x110/0x114
ret_from_fork+0x10/0x18
Fix this by adding respective barrier calls which ensure that all
accesses preceding setting of WORK_DONE_BIT are strictly ordered before
setting the flag. At the same time add a read barrier after reading of
WORK_DONE_BIT in run_ordered_work which ensures all subsequent loads
would be strictly ordered after reading the bit. This in turn ensures
are all accesses before WORK_DONE_BIT are going to be strictly ordered
before any access that can occur in ordered_func.
Reported-by: Chris Murphy <lists@colorremedies.com>
Fixes: 08a9ff3264 ("btrfs: Added btrfs_workqueue_struct implemented ordered execution based on kernel workqueue")
CC: stable@vger.kernel.org # 4.4+
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2011928
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Tested-by: Chris Murphy <chris@colorremedies.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
The following script can cause btrfs to crash:
$ mount -o compress-force=lzo $DEV /mnt
$ dd if=/dev/urandom of=/mnt/foo bs=4k count=1
$ sync
The call trace looks like this:
general protection fault, probably for non-canonical address 0xe04b37fccce3b000: 0000 [#1] PREEMPT SMP NOPTI
CPU: 5 PID: 164 Comm: kworker/u20:3 Not tainted 5.15.0-rc7-custom+ #4
Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
RIP: 0010:__memcpy+0x12/0x20
Call Trace:
lzo_compress_pages+0x236/0x540 [btrfs]
btrfs_compress_pages+0xaa/0xf0 [btrfs]
compress_file_range+0x431/0x8e0 [btrfs]
async_cow_start+0x12/0x30 [btrfs]
btrfs_work_helper+0xf6/0x3e0 [btrfs]
process_one_work+0x294/0x5d0
worker_thread+0x55/0x3c0
kthread+0x140/0x170
ret_from_fork+0x22/0x30
---[ end trace 63c3c0f131e61982 ]---
[CAUSE]
In lzo_compress_pages(), parameter @out_pages is not only an output
parameter (for the number of compressed pages), but also an input
parameter, as the upper limit of compressed pages we can utilize.
In commit d4088803f5 ("btrfs: subpage: make lzo_compress_pages()
compatible"), the refactoring doesn't take @out_pages as an input, thus
completely ignoring the limit.
And for compress-force case, we could hit incompressible data that
compressed size would go beyond the page limit, and cause the above
crash.
[FIX]
Save @out_pages as @max_nr_page, and pass it to lzo_compress_pages(),
and check if we're beyond the limit before accessing the pages.
Note: this also fixes crash on 32bit architectures that was suspected to
be caused by merge of btrfs patches to 5.16-rc1. Reported in
https://lore.kernel.org/all/20211104115001.GU20319@twin.jikos.cz/ .
Reported-by: Omar Sandoval <osandov@fb.com>
Fixes: d4088803f5 ("btrfs: subpage: make lzo_compress_pages() compatible")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add note ]
Signed-off-by: David Sterba <dsterba@suse.com>
We should clear the flag if the adv instance removed due to receiving
this error status is the last one we have.
Signed-off-by: Archie Pusaka <apusaka@chromium.org>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This event is received when the controller stops advertising,
specifically for these three reasons:
(a) Connection is successfully created (success).
(b) Timeout is reached (error).
(c) Number of advertising events is reached (error).
(*) This event is NOT generated when the host stops the advertisement.
Refer to the BT spec ver 5.3 vol 4 part E sec 7.7.65.18. Note that the
section was revised from BT spec ver 5.0 vol 2 part E sec 7.7.65.18
which was ambiguous about (*).
Some chips (e.g. RTL8822CE) send this event when the host stops the
advertisement with status = HCI_ERROR_CANCELLED_BY_HOST (due to (*)
above). This is treated as an error and the advertisement will be
removed and userspace will be informed via MGMT event.
On suspend, we are supposed to temporarily disable advertisements,
and continue advertising on resume. However, due to the behavior
above, the advertisements are removed instead.
This patch returns early if HCI_ERROR_CANCELLED_BY_HOST is received.
Btmon snippet of the unexpected behavior:
@ MGMT Command: Remove Advertising (0x003f) plen 1
Instance: 1
< HCI Command: LE Set Extended Advertising Enable (0x08|0x0039) plen 6
Extended advertising: Disabled (0x00)
Number of sets: 1 (0x01)
Entry 0
Handle: 0x01
Duration: 0 ms (0x00)
Max ext adv events: 0
> HCI Event: LE Meta Event (0x3e) plen 6
LE Advertising Set Terminated (0x12)
Status: Operation Cancelled by Host (0x44)
Handle: 1
Connection handle: 0
Number of completed extended advertising events: 5
> HCI Event: Command Complete (0x0e) plen 4
LE Set Extended Advertising Enable (0x08|0x0039) ncmd 2
Status: Success (0x00)
Signed-off-by: Archie Pusaka <apusaka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This work is no longer necessary since all the code using it has been
converted to use hci_passive_scan/hci_passive_scan_sync.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This makes MGMT_OP_SET_CONNEABLE use hci_cmd_sync_queue instead of
use a dedicated connetable_update work.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This makes MGMT_OP_SET_DISCOVERABLE use hci_cmd_sync_queue instead of
use a dedicated discoverable_update work.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Do not use "/**" to begin a non-kernel-doc comment.
Fixes this build warning:
drivers/bluetooth/btmrvl_main.c:2: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: linux-bluetooth@vger.kernel.org
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-11-15
This series contains updates to iavf driver only.
Mateusz adds a wait for reset completion when changing queue count which
could otherwise cause issues with VF reset.
Nick adds a null check for vf_res in iavf_fix_features(), corrects
ordering of function calls to resolve dependency issues, and prevents
possible freeing of a lock which isn't being held.
Piotr fixes logic that did not allow setting all multicast mode without
promiscuous mode.
Jake prevents possible accidental freeing of filter structure.
Mitch adds null checks for key and indir parameters in iavf_get_rxfh().
Surabhi adds an additional check that would, previously, cause the driver
to print a false error due to values obtained while the VF is in reset.
Grzegorz prevents a queue request of 0 which would cause queue count to
reset to default values.
Akeem restores VLAN filters when bringing the interface back up.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
net: prot_inuse and sock_inuse cleanups
Small series cleaning and optimizing sock_prot_inuse_add()
and sock_inuse_add().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This is distracting really, let's make this simpler,
because many callers had to take care of this
by themselves, even if on x86 this adds more
code than really needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net->core.sock_inuse is a per cpu variable (int),
while net->core.prot_inuse is another per cpu variable
of 64 integers.
per cpu allocator tend to place them in very different places.
Grouping them together makes sense, since it makes
updates potentially faster, if hitting the same
cache line.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MPTCP hard codes it, let us instead provide this helper.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sock_prot_inuse_add() is very small, we can inline it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
gro: get out of core files
Move GRO related content into net/core/gro.c
and include/net/gro.h.
This reduces GRO scope to where it is really needed,
and shrinks too big files (include/linux/netdevice.h
and net/core/dev.c)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Move gro code and data from net/core/dev.c to net/core/gro.c
to ease maintenance.
gro_normal_list() and gro_normal_one() are inlined
because they are called from both files.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/core/gro.c will contain all core gro functions,
to shrink net/core/skbuff.c and net/core/dev.c
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This helper is used once, no need to keep it in fat net/core/skbuff.c
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
include/linux/netdevice.h became too big, move gro stuff
into include/net/gro.h
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
tcp: optimizations for linux-5.17
Mostly small improvements in this series.
The notable change is in "defer skb freeing after
socket lock is released" in recvmsg() (and RX zerocopy)
The idea is to try to let skb freeing to BH handler,
whenever possible, or at least perform the freeing
outside of the socket lock section, for much improved
performance. This idea can probably be extended
to other protocols.
Tests on a 100Gbit NIC
Max throughput for one TCP_STREAM flow, over 10 runs.
MTU : 1500 (1428 bytes of TCP payload per MSS)
Before: 55 Gbit
After: 66 Gbit
MTU : 4096+ (4096 bytes of TCP payload, plus TCP/IPv6 headers)
Before: 82 Gbit
After: 95 Gbit
====================
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_rx_dst/sk_rx_dst_ifindex/sk_rx_dst_cookie are read in early demux,
and currently spans two cache lines.
Moving them close to sk_refcnt makes more sense, as only one cache
line is needed.
New layout for this hot cache line is :
struct sock {
struct sock_common __sk_common; /* 0 0x88 */
/* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
struct dst_entry * sk_rx_dst; /* 0x88 0x8 */
int sk_rx_dst_ifindex; /* 0x90 0x4 */
u32 sk_rx_dst_cookie; /* 0x94 0x4 */
socket_lock_t sk_lock; /* 0x98 0x20 */
atomic_t sk_drops; /* 0xb8 0x4 */
int sk_rcvlowat; /* 0xbc 0x4 */
/* --- cacheline 3 boundary (192 bytes) --- */
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under pressure, tcp recvmsg() has logic to process the socket backlog,
but calls tcp_cleanup_rbuf() right before.
Avoiding sending ACK right before processing new segments makes
a lot of sense, as this decrease the number of ACK packets,
with no impact on effective ACK clocking.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>