Граф коммитов

1073403 Коммитов

Автор SHA1 Сообщение Дата
Jim Mattson e3bcfda012 KVM: x86: Report deprecated x87 features in supported CPUID
CPUID.(EAX=7,ECX=0):EBX.FDP_EXCPTN_ONLY[bit 6] and
CPUID.(EAX=7,ECX=0):EBX.ZERO_FCS_FDS[bit 13] are "defeature"
bits. Unlike most of the other CPUID feature bits, these bits are
clear if the features are present and set if the features are not
present. These bits should be reported in KVM_GET_SUPPORTED_CPUID,
because if these bits are set on hardware, they cannot be cleared in
the guest CPUID. Doing so would claim guest support for a feature that
the hardware doesn't support and that can't be efficiently emulated.

Of course, any software (e.g WIN87EM.DLL) expecting these features to
be present likely predates these CPUID feature bits and therefore
doesn't know to check for them anyway.

Aaron Lewis added the corresponding X86_FEATURE macros in
commit cbb99c0f58 ("x86/cpufeatures: Add FDP_EXCPTN_ONLY and
ZERO_FCS_FDS"), with the intention of reporting these bits in
KVM_GET_SUPPORTED_CPUID, but I was unable to find a proposed patch on
the kvm list.

Opportunistically reordered the CPUID_7_0_EBX capability bits from
least to most significant.

Cc: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220204001348.2844660-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-04 03:06:55 -05:00
Anton Lundin ac9f0c8106 ata: libata-core: Introduce ATA_HORKAGE_NO_LOG_DIR horkage
06f6c4c6c3 ("ata: libata: add missing ata_identify_page_supported() calls")
introduced additional calls to ata_identify_page_supported(), thus also
adding indirectly accesses to the device log directory log page through
ata_log_supported(). Reading this log page causes SATADOM-ML 3ME devices
to lock up.

Introduce the horkage flag ATA_HORKAGE_NO_LOG_DIR to prevent accesses to
the log directory in ata_log_supported() and add a blacklist entry
with this flag for "SATADOM-ML 3ME" devices.

Fixes: 636f6e2af4 ("libata: add horkage for missing Identify Device log")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-04 16:44:23 +09:00
Sergey Shtylyov d052c5d3a3 MAINTAINERS: add myself as Renesas R-Car SATA driver reviewer
Add myself as a reviewer for the Renesas R-Car SATA driver -- I don't have
the hardware anymore (Geert Uytterhoeven does have a lot of hardware!) but
I do have the manuals still! :-)

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-04 16:43:05 +09:00
Namjae Jeon f9929ef6a2 ksmbd: add support for key exchange
When mounting cifs client, can see the following warning message.

CIFS: decode_ntlmssp_challenge: authentication has been weakened as server
does not support key exchange

To remove this warning message, Add support for key exchange feature to
ksmbd. This patch decrypts 16-byte ciphertext value sent by the client
using RC4 with session key. The decrypted value is the recovered secondary
key that will use instead of the session key for signing and sealing.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-04 00:12:22 -06:00
Namjae Jeon deae24b0b1 ksmbd: reduce smb direct max read/write size
ksmbd does not support more than one Buffer Descriptor V1 element in
an smbdirect protocol request. Reducing the maximum read/write size to
about 512KB allows interoperability with Windows over a wider variety
of RDMA NICs, as an interim workaround.

Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-04 00:12:22 -06:00
Namjae Jeon 04e260948a ksmbd: don't align last entry offset in smb2 query directory
When checking smb2 query directory packets from other servers,
OutputBufferLength is different with ksmbd. Other servers add an unaligned
next offset to OutputBufferLength for the last entry.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-04 00:12:22 -06:00
Namjae Jeon 97550c7478 ksmbd: fix same UniqueId for dot and dotdot entries
ksmbd sets the inode number to UniqueId. However, the same UniqueId for
dot and dotdot entry is set to the inode number of the parent inode.
This patch set them using the current inode and parent inode.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-04 00:12:22 -06:00
Hyunchul Lee 6d896d3b44 ksmbd: smbd: validate buffer descriptor structures
Check ChannelInfoOffset and ChannelInfoLength
to validate buffer descriptor structures.
And add a debug log to print the structures'
content.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-04 00:12:22 -06:00
Dave Airlie 9ca3d3cd08 Merge tag 'drm-intel-fixes-2022-02-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Fix GitLab issue #4698: DP monitor through Type-C dock(Dell DA310) doesn't work.
Fixes for inconsistent engine busyness value and read timeout with GuC.
Fix to use ALLOW_FAIL for error capture buffer allocation. Don't use
interruptible lock on error path. Smatch fix to reject zero sized overlays.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YfuiG8SKMKP5V/Dm@jlahtine-mobl.ger.corp.intel.com
2022-02-04 15:48:27 +10:00
Dave Airlie 8ea2c5187d * dma-buf/heaps: Fix potential spectre v1 gadget
* drm/kmb: Fix potential out-of-bounds access
  * drm/mxsfb: Fix NULL-pointer dereference
  * drm/nouveau: Fix potential out-of-bounds access in BIOS decoding
  * fbdev: Re-add support for fbcon hardware acceleration
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmH7vCsACgkQaA3BHVML
 eiOifQgAr0o6WMxuB56vlLYnmU2I/N5aBwxtPcoOCKH8SgeF8b5Fyszb10a4qW3K
 ycn8CVzAoT+M70pR/ty2jUylYU0huMQGJOvOPxV4i83EOU8rmq6BP201yGHppVf6
 8vNu/i9lQwRb7cVtO0eHNF3/1LMgaPeDv1eHdatRyj1Bt7p58oq7J+5o+4eRMZGi
 HqXmP6drUunhlXfZsbNrnM8G59uHrgQY4VKcOKveKMoAiByVl0Qshns7jSFWVvlF
 R42awDlDAjz9HsKEGs35DtoroUYIS+d1jHh9QxuJuxHcLZmxIDOTKcYZNK+vcx0Q
 qD/AnTt3LbBGinZBPLbuzFWePBhJew==
 =+Emr
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2022-02-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

 * dma-buf/heaps: Fix potential spectre v1 gadget
 * drm/kmb: Fix potential out-of-bounds access
 * drm/mxsfb: Fix NULL-pointer dereference
 * drm/nouveau: Fix potential out-of-bounds access in BIOS decoding
 * fbdev: Re-add support for fbcon hardware acceleration

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Yfu8mTZQUNt1RwZd@linux-uq9g
2022-02-04 14:43:35 +10:00
Florian Westphal d1ca60efc5 netfilter: ctnetlink: disable helper autoassign
When userspace, e.g. conntrackd, inserts an entry with a specified helper,
its possible that the helper is lost immediately after its added:

ctnetlink_create_conntrack
  -> nf_ct_helper_ext_add + assign helper
    -> ctnetlink_setup_nat
      -> ctnetlink_parse_nat_setup
         -> parse_nat_setup -> nfnetlink_parse_nat_setup
	                       -> nf_nat_setup_info
                                 -> nf_conntrack_alter_reply
                                   -> __nf_ct_try_assign_helper

... and __nf_ct_try_assign_helper will zero the helper again.

Set IPS_HELPER bit to bypass auto-assign logic, its unwanted, just like
when helper is assigned via ruleset.

Dropped old 'not strictly necessary' comment, it referred to use of
rcu_assign_pointer() before it got replaced by RCU_INIT_POINTER().

NB: Fixes tag intentionally incorrect, this extends the referenced commit,
but this change won't build without IPS_HELPER introduced there.

Fixes: 6714cf5465 ("netfilter: nf_conntrack: fix explicit helper attachment and NAT")
Reported-by: Pham Thanh Tuyen <phamtyn@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:39:57 +01:00
Florian Westphal 1f6339e034 MAINTAINERS: netfilter: update git links
nf and nf-next have a new location.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:39:57 +01:00
Florian Westphal 82b72cb946 netfilter: conntrack: re-init state for retransmitted syn-ack
TCP conntrack assumes that a syn-ack retransmit is identical to the
previous syn-ack.  This isn't correct and causes stuck 3whs in some more
esoteric scenarios.  tcpdump to illustrate the problem:

 client > server: Flags [S] seq 1365731894, win 29200, [mss 1460,sackOK,TS val 2083035583 ecr 0,wscale 7]
 server > client: Flags [S.] seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215367629 ecr 2082921663]

Note the invalid/outdated synack ack number.
Conntrack marks this syn-ack as out-of-window/invalid, but it did
initialize the reply direction parameters based on this packets content.

 client > server: Flags [S] seq 1365731894, win 29200, [mss 1460,sackOK,TS val 2083036623 ecr 0,wscale 7]

... retransmit...

 server > client: Flags [S.], seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215368644 ecr 2082921663]

and another bogus synack. This repeats, then client re-uses for a new
attempt:

client > server: Flags [S], seq 2375731741, win 29200, [mss 1460,sackOK,TS val 2083100223 ecr 0,wscale 7]
server > client: Flags [S.], seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215430754 ecr 2082921663]

... but still gets a invalid syn-ack.

This repeats until:

 server > client: Flags [S.], seq 145824453, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215437785 ecr 2082921663]
 server > client: Flags [R.], seq 145824454, ack 643160523, win 65535, [mss 8952,wscale 5,TS val 3215443451 ecr 2082921663]
 client > server: Flags [S], seq 2375731741, win 29200, [mss 1460,sackOK,TS val 2083115583 ecr 0,wscale 7]
 server > client: Flags [S.], seq 162602410, ack 2375731742, win 65535, [mss 8952,wscale 5,TS val 3215445754 ecr 2083115583]

This syn-ack has the correct ack number, but conntrack flags it as
invalid: The internal state was created from the first syn-ack seen
so the sequence number of the syn-ack is treated as being outside of
the announced window.

Don't assume that retransmitted syn-ack is identical to previous one.
Treat it like the first syn-ack and reinit state.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:39:51 +01:00
Florian Westphal cc4f9d6203 netfilter: conntrack: move synack init code to helper
It seems more readable to use a common helper in the followup fix rather
than copypaste or goto.

No functional change intended.  The function is only called for syn-ack
or syn in repy direction in case of simultaneous open.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:39:27 +01:00
Florian Westphal a9e8503def netfilter: nft_payload: don't allow th access for fragments
Loads relative to ->thoff naturally expect that this points to the
transport header, but this is only true if pkt->fragoff == 0.

This has little effect for rulesets with connection tracking/nat because
these enable ip defra. For other rulesets this prevents false matches.

Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:38:15 +01:00
Florian Westphal 77b337196a netfilter: conntrack: don't refresh sctp entries in closed state
Vivek Thrivikraman reported:
 An SCTP server application which is accessed continuously by client
 application.
 When the session disconnects the client retries to establish a connection.
 After restart of SCTP server application the session is not established
 because of stale conntrack entry with connection state CLOSED as below.

 (removing this entry manually established new connection):

 sctp 9 CLOSED src=10.141.189.233 [..]  [ASSURED]

Just skip timeout update of closed entries, we don't want them to
stay around forever.

Reported-and-tested-by: Vivek Thrivikraman <vivek.thrivikraman@est.tech>
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1579
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:38:15 +01:00
Gao Xiang 24331050a3 erofs: fix small compressed files inlining
Prior to ztailpacking feature, it's enough that each lcluster has
two pclusters at most, and the last pcluster should be turned into
an uncompressed pcluster when necessary. For example,
  _________________________________________________
 |_ pcluster n-2 _|_ pcluster n-1 _|____ EOFed ____|

which should be converted into:
  _________________________________________________
 |_ pcluster n-2 _|_ pcluster n-1 (uncompressed)' _|

That is fine since either pcluster n-1 or (uncompressed)' takes one
physical block.

However, after ztailpacking was supported, the game is changed since
the last pcluster can be inlined now. And such case above is quite
common for inlining small files. Therefore, in order to inline more
effectively, special EOF lclusters are now supported which can have
three parts at most, as illustrated below:
  _________________________________________________
 |_ pcluster n-2 _|_ pcluster n-1 _|____ EOFed ____|
                                   ^ i_size

Actually similar code exists in Yue Hu's original patchset [1], but I
removed this part on purpose. After evaluating more real cases with
small files, I've changed my mind.

[1] https://lore.kernel.org/r/20211215094449.15162-1-huyue2@yulong.com

Link: https://lore.kernel.org/r/20220203190203.30794-1-xiang@kernel.org
Fixes: ab92184ff8 ("erofs: add on-disk compressed tail-packing inline support")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-02-04 12:37:12 +08:00
Martin K. Petersen b13e0c7185 block: bio-integrity: Advance seed correctly for larger interval sizes
Commit 309a62fa3a ("bio-integrity: bio_integrity_advance must update
integrity seed") added code to update the integrity seed value when
advancing a bio. However, it failed to take into account that the
integrity interval might be larger than the 512-byte block layer
sector size. This broke bio splitting on PI devices with 4KB logical
blocks.

The seed value should be advanced by bio_integrity_intervals() and not
the number of sectors.

Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: stable@vger.kernel.org
Fixes: 309a62fa3a ("bio-integrity: bio_integrity_advance must update integrity seed")
Tested-by: Dmitry Ivanov <dmitry.ivanov2@hpe.com>
Reported-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220204034209.4193-1-martin.petersen@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-03 21:09:24 -07:00
Dave Airlie 7eb3848cc8 Merge tag 'amd-drm-fixes-5.17-2022-02-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.17-2022-02-02:

amdgpu:
- mGPU fan boost fix for beige goby
- S0ix fixes
- Cyan skillfish hang fix
- DCN fixes for DCN 3.1
- DCN fixes for DCN 3.01
- Apple retina panel fix
- ttm logic inversion fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220203035224.5801-1-alexander.deucher@amd.com
2022-02-04 13:18:55 +10:00
Steen Hegelund ed14fc7a79 net: sparx5: Fix get_stat64 crash in tcpdump
This problem was found with Sparx5 when the tcpdump tool requests the
do_get_stats64 (sparx5_get_stats64) statistic.

The portstats pointer was incorrectly incremented when fetching priority
based statistics.

Fixes: af4b11022e (net: sparx5: add ethtool configuration and statistics support)
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Link: https://lore.kernel.org/r/20220203102900.528987-1-steen.hegelund@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 19:01:15 -08:00
Kees Cook dcb85f85fa gcc-plugins/stackleak: Use noinstr in favor of notrace
While the stackleak plugin was already using notrace, objtool is now a
bit more picky.  Update the notrace uses to noinstr.  Silences the
following objtool warnings when building with:

CONFIG_DEBUG_ENTRY=y
CONFIG_STACK_VALIDATION=y
CONFIG_VMLINUX_VALIDATION=y
CONFIG_GCC_PLUGIN_STACKLEAK=y

  vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
  vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
  vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section
  vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section
  vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section
  vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section
  vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section
  vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section
  vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section

Note that the plugin's addition of calls to stackleak_track_stack() from
noinstr functions is expected to be safe, as it isn't runtime
instrumentation and is self-contained.

Cc: Alexander Popov <alex.popov@linux.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-03 17:02:21 -08:00
Linus Torvalds eb2eb5161c Networking fixes for 5.17-rc3, including fixes from bpf, netfilter,
and ieee802154.
 
 Current release - regressions:
 
  - Partially revert "net/smc: Add netlink net namespace support",
    fix uABI breakage
 
  - netfilter:
      - nft_ct: fix use after free when attaching zone template
      - nft_byteorder: track register operations
 
 Previous releases - regressions:
 
  - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
 
  - phy: qca8081: fix speeds lower than 2.5Gb/s
 
  - sched: fix use-after-free in tc_new_tfilter()
 
 Previous releases - always broken:
 
  - tcp: fix mem under-charging with zerocopy sendmsg()
 
  - tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
 
  - neigh: do not trigger immediate probes on NUD_FAILED from
    neigh_managed_work, avoid a deadlock
 
  - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
    false-positives
 
  - netfilter: nft_reject_bridge: fix for missing reply from prerouting
 
  - smc: forward wakeup to smc socket waitqueue after fallback
 
  - ieee802154:
      - return meaningful error codes from the netlink helpers
      - mcr20a: fix lifs/sifs periods
      - at86rf230, ca8210: stop leaking skbs on error paths
 
  - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
 
  - ax25: add refcount in ax25_dev to avoid UAF bugs
 
  - eth: mlx5e:
      - fix SFP module EEPROM query
      - fix broken SKB allocation in HW-GRO
      - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
 
  - eth: amd-xgbe:
      - fix skb data length underflow
      - ensure reset of the tx_timer_active flag, avoid Tx timeouts
 
  - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
 
  - eth: e1000e: handshake with CSME starts from Alder Lake platforms
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmH8X9UACgkQMUZtbf5S
 IrsxuhAAlAvFHGL6y5Y2gAmhKvVUvCYjiIJBcvk7R66CwYVRxofvlhmxi6GM/Czs
 9SrVSaN4RXu3p3d7UtAl1gAQwHqzLIHH3m2g5dSKVvHZWQgkm/+n74x0aZQ9Fll7
 mWs9uu5fWsQr/qZBnnjoQTvUxRUNVd4trBy7nXGzkNqJL5j0+2TT4BhH4qalhE28
 iPc9YFCyKPdjoWFksteZqD3hAQbXxK/xRRr6xuvFHENlZdEHM6ARftHnJthTG/fY
 32rdn9YUkQ9lNtOBJNMN9yP2z1B7TcxASBqjjk55I7XtT1QAI9/PskszavHC0hOk
 BCSMX779bLNW4+G0wiSKVB4tq4tvswtawq8Hxa6zdU4TKIzfQ84ZL/Nf66GtH+4W
 C0mbZohmyJV9hQFkNT0ZLeihljd7i4BkDttlbK3uz2IL9tHeX3uSo5V7AgS/Xaf6
 frXgbGgjQTaR6IL9AUhfN3GTCx60mzpH/aRpFho8A5xAl3EtHWCJcRhbY/CEhQBR
 zyCndcLcG5mUzbhx/TxlKrrpRCLxqCUG/Tsb2wCh5jMxO1zonW9Hhv4P1ie6EFuI
 h+XiJT2WWObS/KTze9S86WOR0zcqrtRqaOGJlNB+/+K8ClZU8UsDTFXLQ0dqpVZF
 Mvp7VchBzyFFJrrvO8WkkJgLTKdaPJmM9wuWUZb4J6d2MWlmDkE=
 =qKvf
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, netfilter, and ieee802154.

  Current release - regressions:

   - Partially revert "net/smc: Add netlink net namespace support", fix
     uABI breakage

   - netfilter:
      - nft_ct: fix use after free when attaching zone template
      - nft_byteorder: track register operations

  Previous releases - regressions:

   - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback

   - phy: qca8081: fix speeds lower than 2.5Gb/s

   - sched: fix use-after-free in tc_new_tfilter()

  Previous releases - always broken:

   - tcp: fix mem under-charging with zerocopy sendmsg()

   - tcp: add missing tcp_skb_can_collapse() test in
     tcp_shift_skb_data()

   - neigh: do not trigger immediate probes on NUD_FAILED from
     neigh_managed_work, avoid a deadlock

   - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
     false-positives

   - netfilter: nft_reject_bridge: fix for missing reply from prerouting

   - smc: forward wakeup to smc socket waitqueue after fallback

   - ieee802154:
      - return meaningful error codes from the netlink helpers
      - mcr20a: fix lifs/sifs periods
      - at86rf230, ca8210: stop leaking skbs on error paths

   - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent

   - ax25: add refcount in ax25_dev to avoid UAF bugs

   - eth: mlx5e:
      - fix SFP module EEPROM query
      - fix broken SKB allocation in HW-GRO
      - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows

   - eth: amd-xgbe:
      - fix skb data length underflow
      - ensure reset of the tx_timer_active flag, avoid Tx timeouts

   - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()

   - eth: e1000e: handshake with CSME starts from Alder Lake platforms"

* tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  ax25: fix reference count leaks of ax25_dev
  net: stmmac: ensure PTP time register reads are consistent
  net: ipa: request IPA register values be retained
  dt-bindings: net: qcom,ipa: add optional qcom,qmp property
  tools/resolve_btfids: Do not print any commands when building silently
  bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
  net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
  tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
  net: sparx5: do not refer to skb after passing it on
  Partially revert "net/smc: Add netlink net namespace support"
  net/mlx5e: Avoid field-overflowing memcpy()
  net/mlx5e: Use struct_group() for memcpy() region
  net/mlx5e: Avoid implicit modify hdr for decap drop rule
  net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
  net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
  net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
  net/mlx5: E-Switch, Fix uninitialized variable modact
  net/mlx5e: Fix handling of wrong devices during bond netevent
  net/mlx5e: Fix broken SKB allocation in HW-GRO
  net/mlx5e: Fix wrong calculation of header index in HW_GRO
  ...
2022-02-03 16:54:18 -08:00
Linus Torvalds 551007a8f1 selinux/stable-5.17 PR 20220203
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmH8VkAUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNyLQ/9GAsvvB7PYeUpj0CGLMaAT9Hys5l+
 WPjP+NU+HF+r+AUsSCJwKkK4yKnpEDK9nidOdkwiYjAO/83yl7kkBPGRgisQep1A
 tEbuJ5vqZnR59jLxNKCmQE0gY+gByjk3jZIVFLSWwG/ho7s1LQyNoYpm7rbFIgAz
 6qe7IR1nsATzxRhDoJI3RIPlQjzhM1qEX9PBEtwW+LLieShtvMc+ijdiUw7bqNl9
 RTM6hRf4fTX4jLHtxfZYZ99bHEjIseksFbSAnjKxxkt0W5EFha73VX8hjwnG24J/
 XZQAhsyvpQmcZKJGZPWUSa+UFcytoauMnNdgJOQw7TcMT4Y2mMuvcoZ/KkFtDjdr
 30qhp46/gml2yqnByXRfzshGQm9E4ZoqSCn+lFWAfjlrhcqdgZFKpILpwMixbdin
 NgTA/pbwXovrlho8UflB0sbDMrbyV3qNGZXD/4hRg66Vm3F7ipgqPBbM89qoDniG
 CXiQnmRQ+rwcftyeE7me7+kD6djYTWOfEY5HRNiCf9NhnQG8GP7YzZ4KACxJ2PwQ
 R9+Egc9nAl4UG6PrEjZeud81rLzc+ws2SJLokxOcIGnid8lZidf83HfWekAmRloA
 J5+tmpx5q26ug/j2uXV/rp36xaQWhjJrrnhEKamIYYAVioXa9srRhtz3qRI8r/13
 mrZ5hu4le8aC/5s=
 =AyIM
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "One small SELinux patch to ensure that a policy structure field is
  properly reset after freeing so that we don't inadvertently do a
  double-free on certain error conditions"

* tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix double free of cond_list on error paths
2022-02-03 16:44:12 -08:00
Linus Torvalds 25b20ae815 linux-kselftest-fixes-5.17-rc3
This Kselftest fixes update for Linux 5.17-rc3 consists of important
 fixes to several tests and documentation clarification on running
 mainline kselftest on stable releases. A few notable fixes:
 
 - fix kselftest run hang due to child processes that haven't been
   terminated. Fix signals all child processes
 - fix false pass/fail results from vdso_test_abi, openat2, mincore
 - build failures when using -j (multiple jobs) option
 - exec test build failure due to incorrect build rule for a run-time
   created "pipe"
 - zram test fixes related to interaction with zram-generator to
   make sure zram test to coordinate deleted with zram-generator
 - zram test compression ratio calculation fix and skipping
   max_comp_streams.
 - increasing rtc test timeout
 - cpufreq test to write test results to stdout which will necessary on
   automated test systems
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmH8OcAACgkQCwJExA0N
 QxzTkg//YF+iSc1aao8nmUOvsK6oc1RBwIj3hkLUHjP3H1qFkm9OYxzTcLYGcnyo
 JahxNjeVoDuVESYx/AyLZ568aCCxRXJEDzNm5eIVNfBrGtVTfFPM19HwC/R3I1Ew
 KJUruxRx++8AvI1RYMEzsDumKpLVe3bor7sj3CcO1E9/qkOoUAukxt7FVmSNMZlW
 qYCDgc3yBa/XrImHCbJdZc4CUhbmh+l05sZgG3V3fxQSlgfIClY0Qg8W7Ucu+r4S
 6W5nwoEJIG32Zl2avaZ2VTF4T+CTQB70g/n4OBEX8TAxuIIi9W12N6zMZ76q8qbp
 iRs7UqgUSqPWdz/3ZHiQ5gy0WsCJ/W1379TtiG0doEeU2vwZ6fR8NMn+2FrEH18W
 xHBPOWeN+PkVAFjUeoyt1c5OGNprK6EEE2kQ6CLoBTwlKWLDQ87ZNPf84uAsez1x
 G0m7AX/T5adeTLoZfEXNXVY4OROs0nxbAkGC5ghVtKQu1giMcUKj+KHUowgj5OIJ
 Zaj+uSPiN3hnwj5L2fk+orOEC+3bZxVoQzqSB2Bs6stQOQFZLP18xHIjQIDZoQuC
 O512ZY+dwMzSyTi2KoQmb/M0Ft3gSfhRVXc7gfEFfOvC3ZbqRGFfeES1ZILup3rZ
 izMTJBDOe+BGSm/GCFPHxu36YfdPqiAyHBTVSYy5EhLGpxafZvI=
 =rH2m
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "Important fixes to several tests and documentation clarification on
  running mainline kselftest on stable releases. A few notable fixes:

   - fix kselftest run hang due to child processes that haven't been
     terminated. Fix signals all child processes

   - fix false pass/fail results from vdso_test_abi, openat2, mincore

   - build failures when using -j (multiple jobs) option

   - exec test build failure due to incorrect build rule for a run-time
     created "pipe"

   - zram test fixes related to interaction with zram-generator to make
     sure zram test to coordinate deleted with zram-generator

   - zram test compression ratio calculation fix and skipping
     max_comp_streams.

   - increasing rtc test timeout

   - cpufreq test to write test results to stdout which will necessary
     on automated test systems"

* tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kselftest: Fix vdso_test_abi return status
  selftests: skip mincore.check_file_mmap when fs lacks needed support
  selftests: openat2: Skip testcases that fail with EOPNOTSUPP
  selftests: openat2: Add missing dependency in Makefile
  selftests: openat2: Print also errno in failure messages
  selftests: futex: Use variable MAKE instead of make
  selftests/exec: Remove pipe from TEST_GEN_FILES
  selftests/zram: Adapt the situation that /dev/zram0 is being used
  selftests/zram01.sh: Fix compression ratio calculation
  selftests/zram: Skip max_comp_streams interface on newer kernel
  docs/kselftest: clarify running mainline tests on stables
  kselftest: signal all child processes
  selftests: cpufreq: Write test output to stdout as well
  selftests: rtc: Increase test timeout so that all tests run
2022-02-03 16:36:26 -08:00
Duoming Zhou 87563a043c ax25: fix reference count leaks of ax25_dev
The previous commit d01ffb9eee ("ax25: add refcount in ax25_dev
to avoid UAF bugs") introduces refcount into ax25_dev, but there
are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(),
ax25_rt_add(), ax25_rt_del() and ax25_rt_opt().

This patch uses ax25_dev_put() and adjusts the position of
ax25_addr_ax25dev() to fix reference cout leaks of ax25_dev.

Fixes: d01ffb9eee ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 14:20:36 -08:00
Yannick Vignon 80d4609008 net: stmmac: ensure PTP time register reads are consistent
Even if protected from preemption and interrupts, a small time window
remains when the 2 register reads could return inconsistent values,
each time the "seconds" register changes. This could lead to an about
1-second error in the reported time.

Add logic to ensure the "seconds" and "nanoseconds" values are consistent.

Fixes: 92ba688851 ("stmmac: add the support for PTP hw clock driver")
Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220203160025.750632-1-yannick.vignon@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 13:54:19 -08:00
Jakub Kicinski 77b1b8b43e Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2022-02-03

We've added 6 non-merge commits during the last 10 day(s) which contain
a total of 7 files changed, 11 insertions(+), 236 deletions(-).

The main changes are:

1) Fix BPF ringbuf to allocate its area with VM_MAP instead of VM_ALLOC
   flag which otherwise trips over KASAN, from Hou Tao.

2) Fix unresolved symbol warning in resolve_btfids due to LSM callback
   rename, from Alexei Starovoitov.

3) Fix a possible race in inc_misses_counter() when IRQ would trigger
   during counter update, from He Fengqing.

4) Fix tooling infra for cross-building with clang upon probing whether
   gcc provides the standard libraries, from Jean-Philippe Brucker.

5) Fix silent mode build for resolve_btfids, from Nathan Chancellor.

6) Drop unneeded and outdated lirc.h header copy from tooling infra as
   BPF does not require it anymore, from Sean Young.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  tools/resolve_btfids: Do not print any commands when building silently
  bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
  tools: Ignore errors from `which' when searching a GCC toolchain
  tools headers UAPI: remove stale lirc.h
  bpf: Fix possible race in inc_misses_counter
  bpf: Fix renaming task_getsecid_subj->current_getsecid_subj.
====================

Link: https://lore.kernel.org/r/20220203155815.25689-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 13:42:38 -08:00
Jens Axboe e8db8c9cad nvme fixes for Linux 5.17
- fix a use-after-free in rdm and tcp controller reset (Sagi Grimberg)
  - fix the state check in nvmf_ctlr_matches_baseopts (Uday Shankar)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmH8LisLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPHzQ//X1QK3AyDl2X8jcpnDr/FJB2Pqp/dvmHNF7EFImSq
 aizW282ngBxZR5HUQxFxt/sFNjq07Zcz3ZImgCa3x8fW6HR/dPn8puM3bIMsWem7
 cZGr10mReujIa6mQdmYPuzwUCI5ycRqIKuA50Ug4MpBevILRO2xOhRGgYahoMsrm
 j6YWazIHFvfIEtniVGoCaX5PnI17hQIS4bgkd8PwU3RnuYg3QDNf/vIMybnhq60Q
 kAwhm7i/BRWclVoETLriWgctdcEl1LSWcragyiTnfAMRmGS6vmdiWsG5bkul+KJn
 wah6Gy2X0/bPBAGORptk5XvkwreJXJqh9WRWZKeKRAskMHusq/FLIosAuugXnCnF
 jJFJTH+k1odLO4KOK30eVNwvaVtVX2O2zcCjHMwW1U3TXiJBebkkJE+x4Zqb2ACj
 PookjdS0nt+aWr2DmfXCC89L1gw5hbZotfBucK9o61LYQn38h52utmP3tbBm5EvE
 K/pOnFmvTxQxp5UMBRRTSNB4OSQbTNyrdCz8TFJ+UKTVYELexNvbszVv7Sp6eDXF
 tvEFTZRrcunRuA5jT6LnuTNG08nTeJUKfJHFm5zN6u/s1L8/Aty3wxuNmE9wHZ+A
 jHDn2eh9vf1OMRzzVVfBYIE0Jei/VVreIBVh1G2L74YxnFdl8Y/D7vqUzkbK0G4c
 UPQ=
 =uezq
 -----END PGP SIGNATURE-----

Merge tag 'nvme-5.17-2022-02-03' of git://git.infradead.org/nvme into block-5.17

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.17

 - fix a use-after-free in rdm and tcp controller reset (Sagi Grimberg)
 - fix the state check in nvmf_ctlr_matches_baseopts (Uday Shankar)"

* tag 'nvme-5.17-2022-02-03' of git://git.infradead.org/nvme:
  nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts()
  nvme-rdma: fix possible use-after-free in transport error_recovery work
  nvme-tcp: fix possible use-after-free in transport error_recovery work
  nvme: fix a possible use-after-free in controller reset during load
2022-02-03 12:37:02 -07:00
Mickaël Salaün 1f2cfdd349 printk: Fix incorrect __user type in proc_dointvec_minmax_sysadmin()
The move of proc_dointvec_minmax_sysadmin() from kernel/sysctl.c to
kernel/printk/sysctl.c introduced an incorrect __user attribute to the
buffer argument.  I spotted this change in [1] as well as the kernel
test robot.  Revert this change to please sparse:

  kernel/printk/sysctl.c:20:51: warning: incorrect type in argument 3 (different address spaces)
  kernel/printk/sysctl.c:20:51:    expected void *
  kernel/printk/sysctl.c:20:51:    got void [noderef] __user *buffer

Fixes: faaa357a55 ("printk: move printk sysctl to printk/sysctl.c")
Link: https://lore.kernel.org/r/20220104155024.48023-2-mic@digikod.net [1]
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20220203145029.272640-1-mic@digikod.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-03 11:27:38 -08:00
Igor Pylypiv 67d6212afd Revert "module, async: async_synchronize_full() on module init iff async is used"
This reverts commit 774a1221e8.

We need to finish all async code before the module init sequence is
done.  In the reverted commit the PF_USED_ASYNC flag was added to mark a
thread that called async_schedule().  Then the PF_USED_ASYNC flag was
used to determine whether or not async_synchronize_full() needs to be
invoked.  This works when modprobe thread is calling async_schedule(),
but it does not work if module dispatches init code to a worker thread
which then calls async_schedule().

For example, PCI driver probing is invoked from a worker thread based on
a node where device is attached:

	if (cpu < nr_cpu_ids)
		error = work_on_cpu(cpu, local_pci_probe, &ddi);
	else
		error = local_pci_probe(&ddi);

We end up in a situation where a worker thread gets the PF_USED_ASYNC
flag set instead of the modprobe thread.  As a result,
async_synchronize_full() is not invoked and modprobe completes without
waiting for the async code to finish.

The issue was discovered while loading the pm80xx driver:
(scsi_mod.scan=async)

modprobe pm80xx                      worker
...
  do_init_module()
  ...
    pci_call_probe()
      work_on_cpu(local_pci_probe)
                                     local_pci_probe()
                                       pm8001_pci_probe()
                                         scsi_scan_host()
                                           async_schedule()
                                           worker->flags |= PF_USED_ASYNC;
                                     ...
      < return from worker >
  ...
  if (current->flags & PF_USED_ASYNC) <--- false
  	async_synchronize_full();

Commit 21c3c5d280 ("block: don't request module during elevator init")
fixed the deadlock issue which the reverted commit 774a1221e8
("module, async: async_synchronize_full() on module init iff async is
used") tried to fix.

Since commit 0fdff3ec6d ("async, kmod: warn on synchronous
request_module() from async workers") synchronous module loading from
async is not allowed.

Given that the original deadlock issue is fixed and it is no longer
allowed to call synchronous request_module() from async we can remove
PF_USED_ASYNC flag to make module init consistently invoke
async_synchronize_full() unless async module probe is requested.

Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Reviewed-by: Changyuan Lyu <changyuanl@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-03 11:20:34 -08:00
Jens Axboe aace2b7a93 Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.17
Pull MD fix from Song:

"Please consider pulling the following fix on top of your block-5.17
 branch. It fixes a NULL ptr deref case with nowait."

* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: fix NULL pointer deref with nowait but no mddev->queue
2022-02-03 11:54:53 -07:00
Linus Torvalds 305e6c42e8 Merge branch 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Eric's fix for a long standing cgroup1 permission issue where it only
   checks for uid 0 instead of CAP which inadvertently allows
   unprivileged userns roots to modify release_agent userhelper

 - Fixes for the fallout from Waiman's recent cpuset work

* 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning
  cgroup-v1: Require capabilities to set release_agent
  cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask()
  cgroup/cpuset: Make child cpusets restrict parents on v1 hierarchy
2022-02-03 08:15:13 -08:00
Jakub Kicinski 0166556a12 Merge branch 'net-ipa-enable-register-retention'
Alex Elder says:

====================
net: ipa: enable register retention

With runtime power management in place, we sometimes need to issue
a command to enable retention of IPA register values before power
collapse.  This requires a new Device Tree property, whose presence
will also be used to signal that the command is required.
====================

Link: https://lore.kernel.org/r/20220201150205.468403-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 08:04:16 -08:00
Alex Elder 34a081761e net: ipa: request IPA register values be retained
In some cases, the IPA hardware needs to request the always-on
subsystem (AOSS) to coordinate with the IPA microcontroller to
retain IPA register values at power collapse.  This is done by
issuing a QMP request to the AOSS microcontroller.  A similar
request ondoes that request.

We must get and hold the "QMP" handle early, because we might get
back EPROBE_DEFER for that.  But the actual request should be sent
while we know the IPA clock is active, and when we know the
microcontroller is operational.

Fixes: 1aac309d32 ("net: ipa: use autosuspend")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 08:03:43 -08:00
Alex Elder ac62a0174d dt-bindings: net: qcom,ipa: add optional qcom,qmp property
For some systems, the IPA driver must make a request to ensure that
its registers are retained across power collapse of the IPA hardware.
On such systems, we'll use the existence of the "qcom,qmp" property
as a signal that this request is required.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 08:03:20 -08:00
Waiman Long 2bdfd2825c cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning
It was found that a "suspicious RCU usage" lockdep warning was issued
with the rcu_read_lock() call in update_sibling_cpumasks().  It is
because the update_cpumasks_hier() function may sleep. So we have
to release the RCU lock, call update_cpumasks_hier() and reacquire
it afterward.

Also add a percpu_rwsem_assert_held() in update_sibling_cpumasks()
instead of stating that in the comment.

Fixes: 4716909cc5 ("cpuset: Track cpusets that use parent's effective_cpus")
Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-02-03 05:59:01 -10:00
hongnanli f340b3d902 fs/ext4: fix comments mentioning i_mutex
inode->i_mutex has been replaced with inode->i_rwsem long ago. Fix
comments still mentioning i_mutex.

Signed-off-by: hongnanli <hongnan.li@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220121070611.21618-1-hongnan.li@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-02-03 10:57:53 -05:00
Xin Yin 8fca8a2b0a ext4: fix incorrect type issue during replay_del_range
should not use fast commit log data directly, add le32_to_cpu().

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 0b5b5a62b9 ("ext4: use ext4_ext_remove_space() for fast commit replay delete range")
Cc: stable@kernel.org
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/20220126063146.2302-1-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-02-03 10:57:53 -05:00
Yang Li 715a67f11d jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}()
Add the description of @shrink and @sc in jbd2_journal_shrink_scan() and
jbd2_journal_shrink_count() kernel-doc comment to remove warnings found
by running scripts/kernel-doc, which is caused by using 'make W=1'.
fs/jbd2/journal.c:1296: warning: Function parameter or member 'shrink'
not described in 'jbd2_journal_shrink_scan'
fs/jbd2/journal.c:1296: warning: Function parameter or member 'sc' not
described in 'jbd2_journal_shrink_scan'
fs/jbd2/journal.c:1320: warning: Function parameter or member 'shrink'
not described in 'jbd2_journal_shrink_count'
fs/jbd2/journal.c:1320: warning: Function parameter or member 'sc' not
described in 'jbd2_journal_shrink_count'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220110132841.34531-1-yang.lee@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-02-03 10:57:53 -05:00
Lukas Czerner 7c268d4ce2 ext4: fix potential NULL pointer dereference in ext4_fill_super()
By mistake we fail to return an error from ext4_fill_super() in case
that ext4_alloc_sbi() fails to allocate a new sbi. Instead we just set
the ret variable and allow the function to continue which will later
lead to a NULL pointer dereference. Fix it by returning -ENOMEM in the
case ext4_alloc_sbi() fails.

Fixes: cebe85d570 ("ext4: switch to the new mount api")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220119130209.40112-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:57:44 -05:00
Ritesh Harjani 4f98186848 jbd2: refactor wait logic for transaction updates into a common function
No functionality change as such in this patch. This only refactors the
common piece of code which waits for t_updates to finish into a common
function named as jbd2_journal_wait_updates(journal_t *)

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/8c564f70f4b2591171677a2a74fccb22a7b6c3a4.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-02-03 10:57:44 -05:00
Ritesh Harjani 3ca40c0d32 jbd2: cleanup unused functions declarations from jbd2.h
During code review found no references of few of these below function
declarations. This patch cleans those up from jbd2.h

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/30d1fc327becda197a4136cf9cdc73d9baa3b7b9.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-02-03 10:57:44 -05:00
Ritesh Harjani cdce59a154 ext4: fix error handling in ext4_fc_record_modified_inode()
Current code does not fully takes care of krealloc() error case, which
could lead to silent memory corruption or a kernel bug.  This patch
fixes that.

Also it cleans up some duplicated error handling logic from various
functions in fast_commit.c file.

Reported-by: luo penghao <luo.penghao@zte.com.cn>
Suggested-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/62e8b6a1cce9359682051deb736a3c0953c9d1e9.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:57:30 -05:00
Ritesh Harjani 09355d9d03 ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()
ext4_prepare_inline_data() already checks for ext4_get_max_inline_size()
and returns -ENOSPC. So there is no need to check it twice within
ext4_da_write_inline_data_begin(). This patch removes the extra check.

It also makes it more clean.

No functionality change in this patch.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/cdd1654128d5105550c65fd13ca5da53b2162cc4.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-02-03 10:57:30 -05:00
Ritesh Harjani 897026aaa7 ext4: fix error handling in ext4_restore_inline_data()
While running "./check -I 200 generic/475" it sometimes gives below
kernel BUG(). Ideally we should not call ext4_write_inline_data() if
ext4_create_inline_data() has failed.

<log snip>
[73131.453234] kernel BUG at fs/ext4/inline.c:223!

<code snip>
 212 static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc,
 213                                    void *buffer, loff_t pos, unsigned int len)
 214 {
<...>
 223         BUG_ON(!EXT4_I(inode)->i_inline_off);
 224         BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);

This patch handles the error and prints out a emergency msg saying potential
data loss for the given inode (since we couldn't restore the original
inline_data due to some previous error).

[ 9571.070313] EXT4-fs (dm-0): error restoring inline_data for inode -- potential data loss! (inode 1703982, error -30)

Reported-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/9f4cd7dfd54fa58ff27270881823d94ddf78dd07.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:57:20 -05:00
Xin Yin bdc8a53a6f ext4: fast commit may miss file actions
in the follow scenario:
1. jbd start transaction n
2. task A get new handle for transaction n+1
3. task A do some actions and add inode to FC_Q_MAIN fc_q
4. jbd complete transaction n and clear FC_Q_MAIN fc_q
5. task A call fsync

Fast commit will lost the file actions during a full commit.

we should also add updates to staging queue during a full commit.
and in ext4_fc_cleanup(), when reset a inode's fc track range, check
it's i_sync_tid, if it bigger than current transaction tid, do not
rest it, or we will lost the track range.

And EXT4_MF_FC_COMMITTING is not needed anymore, so drop it.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20220117093655.35160-3-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:57:02 -05:00
Xin Yin e85c81ba88 ext4: fast commit may not fallback for ineligible commit
For the follow scenario:
1. jbd start commit transaction n
2. task A get new handle for transaction n+1
3. task A do some ineligible actions and mark FC_INELIGIBLE
4. jbd complete transaction n and clean FC_INELIGIBLE
5. task A call fsync

In this case fast commit will not fallback to full commit and
transaction n+1 also not handled by jbd.

Make ext4_fc_mark_ineligible() also record transaction tid for
latest ineligible case, when call ext4_fc_cleanup() check
current transaction tid, if small than latest ineligible tid
do not clear the EXT4_MF_FC_INELIGIBLE.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Suggested-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20220117093655.35160-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:56:39 -05:00
Xin Yin 31a074a0c6 ext4: modify the logic of ext4_mb_new_blocks_simple
For now in ext4_mb_new_blocks_simple, if we found a block which
should be excluded then will switch to next group, this may
probably cause 'group' run out of range.

Change to check next block in the same group when get a block should
be excluded. Also change the search range to EXT4_CLUSTERS_PER_GROUP
and add error checking.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20220110035141.1980-3-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:56:24 -05:00
Xin Yin 599ea31d13 ext4: prevent used blocks from being allocated during fast commit replay
During fast commit replay procedure, we clear inode blocks bitmap in
ext4_ext_clear_bb(), this may cause ext4_mb_new_blocks_simple() allocate
blocks still in use.

Make ext4_fc_record_regions() also record physical disk regions used by
inodes during replay procedure. Then ext4_mb_new_blocks_simple() can
excludes these blocks in use.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20220110035141.1980-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-02-03 10:56:01 -05:00
Mathias Krause 46963e2e06 misc: fastrpc: avoid double fput() on failed usercopy
If the copy back to userland fails for the FASTRPC_IOCTL_ALLOC_DMA_BUFF
ioctl(), we shouldn't assume that 'buf->dmabuf' is still valid. In fact,
dma_buf_fd() called fd_install() before, i.e. "consumed" one reference,
leaving us with none.

Calling dma_buf_put() will therefore put a reference we no longer own,
leading to a valid file descritor table entry for an already released
'file' object which is a straight use-after-free.

Simply avoid calling dma_buf_put() and rely on the process exit code to
do the necessary cleanup, if needed, i.e. if the file descriptor is
still valid.

Fixes: 6cffd79504 ("misc: fastrpc: Add support for dmabuf exporter")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Link: https://lore.kernel.org/r/20220127130218.809261-1-minipli@grsecurity.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-03 16:45:39 +01:00