Граф коммитов

1122309 Коммитов

Автор SHA1 Сообщение Дата
Petr Vorel bfca3dd3d0 kernel/utsname_sysctl.c: print kernel arch
Print the machine hardware name (UTS_MACHINE) in /proc/sys/kernel/arch.

This helps people who debug kernel with initramfs with minimal environment
(i.e.  without coreutils or even busybox) or allow to open sysfs file
instead of run 'uname -m' in high level languages.

Link: https://lkml.kernel.org/r/20220901194403.3819-1-pvorel@suse.cz
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Sterba <dsterba@suse.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:12 -07:00
Mickaël Salaün 8ea0114eda checkpatch: handle FILE pointer type
When using a "FILE *" type, checkpatch considers this an error:
  ERROR: need consistent spacing around '*' (ctx:WxV)
  #32: FILE: f.c:8:
  +static void a(FILE *const b)
                      ^

Fix this by explicitly defining "FILE" as a common type.  This is useful for
user space patches.

With this patch, we now get:
   <E> <E> <_>WS( )
   <E> <E> <_>IDENT(static)
   <E> <V> <_>WS( )
   <E> <V> <_>DECLARE(void )
   <E> <T> <_>FUNC(a)
   <E> <V> <V>PAREN('(')
   <EV> <N> <_>DECLARE(FILE *const )
   <EV> <T> <_>IDENT(b)
   <EV> <V> <_>PAREN(')') -> V
   <E> <V> <_>WS(
  )
  32 > . static void a(FILE *const b)
  32 > EEVVVVVVVTTTTTVNTTTTTTTTTTTTVVV
  32 >  ______________________________

Link: https://lkml.kernel.org/r/20220902111923.1488671-1-mic@digikod.net
Link: https://lore.kernel.org/r/20220902111923.1488671-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:12 -07:00
Andy Shevchenko 7b9e664beb asm-generic: make parameter types consistent in _unaligned_be48()
There is a convention to use internal kernel types, so replace __u8 by u8.

Link: https://lkml.kernel.org/r/20220830172713.43686-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:12 -07:00
wuchi 35783ccbe5 kernel/profile.c: simplify duplicated code in profile_setup()
The code to parse option string "schedule/sleep/kvm" of cmdline in
function profile_setup is redundant, so simplify that.

Link: https://lkml.kernel.org/r/20220901003121.53597-1-wuchi.zero@gmail.com
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foudation.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:12 -07:00
Hawkins Jiawei 63095f4f3a ntfs: check overflow when iterating ATTR_RECORDs
Kernel iterates over ATTR_RECORDs in mft record in ntfs_attr_find(). 
Because the ATTR_RECORDs are next to each other, kernel can get the next
ATTR_RECORD from end address of current ATTR_RECORD, through current
ATTR_RECORD length field.

The problem is that during iteration, when kernel calculates the end
address of current ATTR_RECORD, kernel may trigger an integer overflow bug
in executing `a = (ATTR_RECORD*)((u8*)a + le32_to_cpu(a->length))`.  This
may wrap, leading to a forever iteration on 32bit systems.

This patch solves it by adding some checks on calculating end address
of current ATTR_RECORD during iteration.

Link: https://lkml.kernel.org/r/20220831160935.3409-4-yin31149@gmail.com
Link: https://lore.kernel.org/all/20220827105842.GM2030@kadam/
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: chenxiaosong (A) <chenxiaosong2@huawei.com>
Cc: syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:12 -07:00
Hawkins Jiawei 36a4d82ddd ntfs: fix out-of-bounds read in ntfs_attr_find()
Kernel iterates over ATTR_RECORDs in mft record in ntfs_attr_find().  To
ensure access on these ATTR_RECORDs are within bounds, kernel will do some
checking during iteration.

The problem is that during checking whether ATTR_RECORD's name is within
bounds, kernel will dereferences the ATTR_RECORD name_offset field, before
checking this ATTR_RECORD strcture is within bounds.  This problem may
result out-of-bounds read in ntfs_attr_find(), reported by Syzkaller:

==================================================================
BUG: KASAN: use-after-free in ntfs_attr_find+0xc02/0xce0 fs/ntfs/attrib.c:597
Read of size 2 at addr ffff88807e352009 by task syz-executor153/3607

[...]
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:317 [inline]
 print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
 ntfs_attr_find+0xc02/0xce0 fs/ntfs/attrib.c:597
 ntfs_attr_lookup+0x1056/0x2070 fs/ntfs/attrib.c:1193
 ntfs_read_inode_mount+0x89a/0x2580 fs/ntfs/inode.c:1845
 ntfs_fill_super+0x1799/0x9320 fs/ntfs/super.c:2854
 mount_bdev+0x34d/0x410 fs/super.c:1400
 legacy_get_tree+0x105/0x220 fs/fs_context.c:610
 vfs_get_tree+0x89/0x2f0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x1326/0x1e20 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
 </TASK>

The buggy address belongs to the physical page:
page:ffffea0001f8d400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7e350
head:ffffea0001f8d400 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 0000000000000000 dead000000000122 ffff888011842140
raw: 0000000000000000 0000000000040004 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 ffff88807e351f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88807e351f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88807e352000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff88807e352080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807e352100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

This patch solves it by moving the ATTR_RECORD strcture's bounds checking
earlier, then checking whether ATTR_RECORD's name is within bounds. 
What's more, this patch also add some comments to improve its
maintainability.

Link: https://lkml.kernel.org/r/20220831160935.3409-3-yin31149@gmail.com
Link: https://lore.kernel.org/all/1636796c-c85e-7f47-e96f-e074fee3c7d3@huawei.com/
Link: https://groups.google.com/g/syzkaller-bugs/c/t_XdeKPGTR4/m/LECAuIGcBgAJ
Signed-off-by: chenxiaosong (A) <chenxiaosong2@huawei.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Reported-by: syzbot+5f8dcabe4a3b2c51c607@syzkaller.appspotmail.com
Tested-by: syzbot+5f8dcabe4a3b2c51c607@syzkaller.appspotmail.com
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:11 -07:00
Hawkins Jiawei d85a1bec8e ntfs: fix use-after-free in ntfs_attr_find()
Patch series "ntfs: fix bugs about Attribute", v2.

This patchset fixes three bugs relative to Attribute in record:

Patch 1 adds a sanity check to ensure that, attrs_offset field in first
mft record loading from disk is within bounds.

Patch 2 moves the ATTR_RECORD's bounds checking earlier, to avoid
dereferencing ATTR_RECORD before checking this ATTR_RECORD is within
bounds.

Patch 3 adds an overflow checking to avoid possible forever loop in
ntfs_attr_find().

Without patch 1 and patch 2, the kernel triggersa KASAN use-after-free
detection as reported by Syzkaller.

Although one of patch 1 or patch 2 can fix this, we still need both of
them.  Because patch 1 fixes the root cause, and patch 2 not only fixes
the direct cause, but also fixes the potential out-of-bounds bug.


This patch (of 3):

Syzkaller reported use-after-free read as follows:
==================================================================
BUG: KASAN: use-after-free in ntfs_attr_find+0xc02/0xce0 fs/ntfs/attrib.c:597
Read of size 2 at addr ffff88807e352009 by task syz-executor153/3607

[...]
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:317 [inline]
 print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
 ntfs_attr_find+0xc02/0xce0 fs/ntfs/attrib.c:597
 ntfs_attr_lookup+0x1056/0x2070 fs/ntfs/attrib.c:1193
 ntfs_read_inode_mount+0x89a/0x2580 fs/ntfs/inode.c:1845
 ntfs_fill_super+0x1799/0x9320 fs/ntfs/super.c:2854
 mount_bdev+0x34d/0x410 fs/super.c:1400
 legacy_get_tree+0x105/0x220 fs/fs_context.c:610
 vfs_get_tree+0x89/0x2f0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x1326/0x1e20 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
 </TASK>

The buggy address belongs to the physical page:
page:ffffea0001f8d400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7e350
head:ffffea0001f8d400 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 0000000000000000 dead000000000122 ffff888011842140
raw: 0000000000000000 0000000000040004 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 ffff88807e351f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88807e351f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88807e352000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff88807e352080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807e352100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Kernel will loads $MFT/$DATA's first mft record in
ntfs_read_inode_mount().

Yet the problem is that after loading, kernel doesn't check whether
attrs_offset field is a valid value.

To be more specific, if attrs_offset field is larger than bytes_allocated
field, then it may trigger the out-of-bounds read bug(reported as
use-after-free bug) in ntfs_attr_find(), when kernel tries to access the
corresponding mft record's attribute.

This patch solves it by adding the sanity check between attrs_offset field
and bytes_allocated field, after loading the first mft record.

Link: https://lkml.kernel.org/r/20220831160935.3409-1-yin31149@gmail.com
Link: https://lkml.kernel.org/r/20220831160935.3409-2-yin31149@gmail.com
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: ChenXiaoSong <chenxiaosong2@huawei.com>
Cc: syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:11 -07:00
wuchi 199cda1353 initramfs: mark my_inptr as __initdata
As my_inptr is only used in __init function unpack_to_rootfs(), mark it as
__initdata to allow it be freed after boot.

Link: https://lkml.kernel.org/r/20220827071116.83078-1-wuchi.zero@gmail.com
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:11 -07:00
Yang Yingliang d2e85432a2 fail_function: fix wrong use of fei_attr_remove()
If register_kprobe() fails, the new attr is not added to the list yet, so
it should call fei_attr_free() intstead.

Link: https://lkml.kernel.org/r/20220826073337.2085798-3-yangyingliang@huawei.com
Fixes: 4b1a29a7f5 ("error-injection: Support fault injection framework")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:11 -07:00
Yang Yingliang cef9f5f866 fail_function: refactor code of checking return value of register_kprobe()
Refactor the error handling of register_kprobe() to improve readability. 
No functional change.

Link: https://lkml.kernel.org/r/20220826073337.2085798-2-yangyingliang@huawei.com
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:11 -07:00
Yang Yingliang f81259c6db fail_function: switch to memdup_user_nul() helper
Use memdup_user_nul() helper instead of open-coding to simplify the code.

Link: https://lkml.kernel.org/r/20220826073337.2085798-1-yangyingliang@huawei.com
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:10 -07:00
Uros Bizjak 9a15193e23 smpboot: use atomic_try_cmpxchg in cpu_wait_death and cpu_report_death
Use atomic_try_cmpxchg instead of atomic_cmpxchg (*ptr, old, new) == old
in cpu_wait_death and cpu_report_death.  x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg (and
related move instruction in front of cmpxchg).  Also, atomic_try_cmpxchg
implicitly assigns old *ptr value to "old" when cmpxchg fails, enabling
further code simplifications.

No functional change intended.

Link: https://lkml.kernel.org/r/20220825145603.5811-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:10 -07:00
Uros Bizjak 5fdfa161b2 task_work: use try_cmpxchg in task_work_add, task_work_cancel_match and task_work_run
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
task_work_add, task_work_cancel_match and task_work_run.  x86 CMPXCHG
instruction returns success in ZF flag, so this change saves a compare
after cmpxchg (and related move instruction in front of cmpxchg).

Also, atomic_try_cmpxchg implicitly assigns old *ptr value to "old"
when cmpxchg fails, enabling further code simplifications.

The patch avoids extra memory read in case cmpxchg fails.

Link: https://lkml.kernel.org/r/20220823152632.4517-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:10 -07:00
Wolfram Sang 977bbf4385 lib: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem.  Conversion is 1:1 because the return value is not used. 
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818210203.8251-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:10 -07:00
Wolfram Sang a1d3a6d9f2 init: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem.  Conversion is 1:1 because the return value is not used. 
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818210200.8203-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:10 -07:00
Wolfram Sang 512cb7e4c1 reiserfs: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem.  Conversion is 1:1 because the return value is not used. 
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818210153.8095-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:10 -07:00
Wolfram Sang c97e21fe91 ocfs2: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem.  Conversion is 1:1 because the return value is not used. 
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818210123.7637-4-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:09 -07:00
Wolfram Sang 216e71f13c ia64: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem.  Conversion is 1:1 because the return value is not used. 
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818205940.6216-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:09 -07:00
Wolfram Sang 88040e67b9 alpha: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem.  Conversion is 1:1 because the return value is not used. 
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818205936.6144-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:09 -07:00
Uros Bizjak e1d7c7609a bitops: use try_cmpxchg in set_mask_bits and bit_clear_unless
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
set_mask_bits and bit_clear_unless.  x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg (and
related move instruction in front of cmpxchg).

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.

Link: https://lkml.kernel.org/r/20220822143851.3290-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:09 -07:00
Fabio M. De Francesco 21490eff12 hfs: replace kmap() with kmap_local_page() in btree.c
kmap() is being deprecated in favor of kmap_local_page().

Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) it also requires global TLB invalidation when the kmap's pool wraps
and it might block when the mapping space is fully utilized until a slot
becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in btree.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in btree.c.  Where
possible, use the suited standard helpers (memzero_page(), memcpy_page())
instead of open coding kmap_local_page() plus memset() or memcpy().

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220821180400.8198-4-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:09 -07:00
Fabio M. De Francesco ca0ac8dfd3 hfs: replace kmap() with kmap_local_page() in bnode.c
kmap() is being deprecated in favor of kmap_local_page().

Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) it also requires global TLB invalidation when the kmap's pool wraps
and it might block when the mapping space is fully utilized until a slot
becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in bnode.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in bnode.c.  Where
possible, use the suited standard helpers (memzero_page(), memcpy_page())
instead of open coding kmap_local_page() plus memset() or memcpy().

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220821180400.8198-3-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:08 -07:00
Fabio M. De Francesco d75e9a4bcc hfs: unmap the page in the "fail_page" label
Patch series "hfs: Replace kmap() with kmap_local_page()".

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmaps pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in fs/hfs is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in fs/hfs.  Where
possible, use the suited standard helpers (memzero_page(), memcpy_page())
instead of open coding kmap_local_page() plus memset() or memcpy().

Fix a bug due to a page being not unmapped if the code jumps to the
"fail_page" label (1/3).

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.


This patch (of 3):

Several paths within hfs_btree_open() jump to the "fail_page" label where
put_page() is called while the page is still mapped.

Call kunmap() to unmap the page soon before put_page().

Link: https://lkml.kernel.org/r/20220821180400.8198-1-fmdefrancesco@gmail.com
Link: https://lkml.kernel.org/r/20220821180400.8198-2-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Matthew Wilcox <willy@infradead.org>]
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:08 -07:00
Fabio M. De Francesco 948084f0f6 kexec: replace kmap() with kmap_local_page()
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap's pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in kexec_core.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in kexec_core.c.

Tested on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220821182519.9483-1-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:08 -07:00
Uros Bizjak da3f52ba35 iversion: use atomic64_try_cmpxchg)
Use atomic64_try_cmpxchg instead of
atomic64_cmpxchg (*ptr, old, new) == old in inode_set_max_iversion_raw,
inode_maybe_inc_version and inode_query_iversion. x86 CMPXCHG instruction
returns success in ZF flag, so this change saves a compare after cmpxchg
(and related move instruction in front of cmpxchg).

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.

The loop in inode_maybe_inc_iversion improves from:

    5563:	48 89 ca             	mov    %rcx,%rdx
    5566:	48 89 c8             	mov    %rcx,%rax
    5569:	48 83 e2 fe          	and    $0xfffffffffffffffe,%rdx
    556d:	48 83 c2 02          	add    $0x2,%rdx
    5571:	f0 48 0f b1 16       	lock cmpxchg %rdx,(%rsi)
    5576:	48 39 c1             	cmp    %rax,%rcx
    5579:	0f 84 85 fc ff ff    	je     5204 <...>
    557f:	48 89 c1             	mov    %rax,%rcx
    5582:	eb df                	jmp    5563 <...>

to:

    5563:	48 89 c2             	mov    %rax,%rdx
    5566:	48 83 e2 fe          	and    $0xfffffffffffffffe,%rdx
    556a:	48 83 c2 02          	add    $0x2,%rdx
    556e:	f0 48 0f b1 11       	lock cmpxchg %rdx,(%rcx)
    5573:	0f 84 8b fc ff ff    	je     5204 <...>
    5579:	eb e8                	jmp    5563 <...>

Link: https://lkml.kernel.org/r/20220821193011.88208-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:08 -07:00
Uros Bizjak 38ace0d513 aio: use atomic_try_cmpxchg in __get_reqs_available
Use atomic_try_cmpxchg instead of atomic_cmpxchg (*ptr, old, new) == old
in __get_reqs_available.  x86 CMPXCHG instruction returns success in ZF
flag, so this change saves a compare after cmpxchg (and related move
instruction in front of cmpxchg).

Also, atomic_try_cmpxchg implicitly assigns old *ptr value to "old" when
cmpxchg fails, enabling further code simplifications.

No functional change intended.

Link: https://lkml.kernel.org/r/20220714164851.3055-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:08 -07:00
Uros Bizjak b0192296b4 buffer: use try_cmpxchg in discard_buffer
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
discard_buffer.  x86 CMPXCHG instruction returns success in ZF flag, so
this change saves a compare after cmpxchg (and related move instruction in
front of cmpxchg).

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.

Note that the value from *ptr should be read using READ_ONCE to prevent
the compiler from merging, refetching or reordering the read.

No functional change intended.

Link: https://lkml.kernel.org/r/20220714171653.12128-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Uros Bizjak 693fc06e98 epoll: use try_cmpxchg in list_add_tail_lockless
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
list_add_tail_lockless.  x86 CMPXCHG instruction returns success in ZF
flag, so this change saves a compare after cmpxchg (and related move
instruction in front of cmpxchg).

No functional change intended.

Link: https://lkml.kernel.org/r/20220714173255.12987-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Sergei Trofimovich aa06a9bd85 ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency
clock_gettime(CLOCK_MONOTONIC, &tp) is very precise on ia64 as it uses ITC
(similar to rdtsc on x86).  It's not quite a hrtimer as it is a few times
slower than 1ns.  Usually 2-3ns.

clock_getres(CLOCK_MONOTONIC, &res) never reflected that fact and reported
0.04s precision (1/HZ value).

In https://bugs.gentoo.org/596382 gstreamer's test suite failed loudly
when it noticed precision discrepancy.

Before the change:

    clock_getres(CLOCK_MONOTONIC, &res) reported 250Hz precision.

After the change:

    clock_getres(CLOCK_MONOTONIC, &res) reports ITC (400Mhz) precision.

The patch is based on matoro's fix. I added a bit of explanation why we
need to special-case arch-specific clock_getres().

[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20220820181813.2275195-1-slyich@gmail.com
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Cc: matoro <matoro_mailinglist_kernel@matoro.tk>
Cc: Émeric Maschino <emeric.maschino@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Minghao Chi cba7543e15 fs/qnx6: delete unnecessary checks before brelse()
brelse() tests whether its argument is NULL and then returns immediately. 
Thus remove the tests which are not needed around the shown calls.

Link: https://lkml.kernel.org/r/20220819081819.96347-1-chi.minghao@zte.com.cn
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: CGEL ZTE <cgel.zte@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minghao Chi <chi.minghao@zte.com.cn>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Kefeng Wang 2be9880dc8 kernel: exit: cleanup release_thread()
Only x86 has own release_thread(), introduce a new weak release_thread()
function to clean empty definitions in other ARCHs.

Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Guo Ren <guoren@kernel.org>				[csky]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Brian Cain <bcain@quicinc.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>			[powerpc]
Acked-by: Stafford Horne <shorne@gmail.com>			[openrisc]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Huacai Chen <chenhuacai@kernel.org>			[LoongArch]
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org> [csky]
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Brian Foster f4068af3a6 proc: save LOC in vsyscall test
Do one fork in vsyscall detection code and let SIGSEGV handler exit and
carry information to the parent saving LOC.

[adobriyan@gmail.com: redo original patch, delete unnecessary variables, minimise code changes]
Link: https://lkml.kernel.org/r/YvoWzAn5dlhF75xa@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:06 -07:00
Uros Bizjak 4f1d2a030d llist: use try_cmpxchg in llist_add_batch and llist_del_first
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
llist_add_batch and llist_del_first.  x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg.

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.

No functional change intended.

Link: https://lkml.kernel.org/r/20220712144917.4497-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:06 -07:00
Valentin Schneider 05c6257433 panic, kexec: make __crash_kexec() NMI safe
Attempting to get a crash dump out of a debug PREEMPT_RT kernel via an NMI
panic() doesn't work.  The cause of that lies in the PREEMPT_RT definition
of mutex_trylock():

	if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
		return 0;

This prevents an nmi_panic() from executing the main body of
__crash_kexec() which does the actual kexec into the kdump kernel.  The
warning and return are explained by:

  6ce47fd961 ("rtmutex: Warn if trylock is called from hard/softirq context")
  [...]
  The reasons for this are:

      1) There is a potential deadlock in the slowpath

      2) Another cpu which blocks on the rtmutex will boost the task
	 which allegedly locked the rtmutex, but that cannot work
	 because the hard/softirq context borrows the task context.

Furthermore, grabbing the lock isn't NMI safe, so do away with kexec_mutex
and replace it with an atomic variable.  This is somewhat overzealous as
*some* callsites could keep using a mutex (e.g.  the sysfs-facing ones
like crash_shrink_memory()), but this has the benefit of involving a
single unified lock and preventing any future NMI-related surprises.

Tested by triggering NMI panics via:

  $ echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi
  $ echo 1 > /proc/sys/kernel/unknown_nmi_panic
  $ echo 1 > /proc/sys/kernel/panic

  $ ipmitool power diag

Link: https://lkml.kernel.org/r/20220630223258.4144112-3-vschneid@redhat.com
Fixes: 6ce47fd961 ("rtmutex: Warn if trylock is called from hard/softirq context")
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:06 -07:00
Valentin Schneider 7bb5da0d49 kexec: turn all kexec_mutex acquisitions into trylocks
Patch series "kexec, panic: Making crash_kexec() NMI safe", v4.


This patch (of 2):

Most acquistions of kexec_mutex are done via mutex_trylock() - those were
a direct "translation" from:

  8c5a1cf0ad ("kexec: use a mutex for locking rather than xchg()")

there have however been two additions since then that use mutex_lock():
crash_get_memory_size() and crash_shrink_memory().

A later commit will replace said mutex with an atomic variable, and
locking operations will become atomic_cmpxchg().  Rather than having those
mutex_lock() become while (atomic_cmpxchg(&lock, 0, 1)), turn them into
trylocks that can return -EBUSY on acquisition failure.

This does halve the printable size of the crash kernel, but that's still
neighbouring 2G for 32bit kernels which should be ample enough.

Link: https://lkml.kernel.org/r/20220630223258.4144112-1-vschneid@redhat.com
Link: https://lkml.kernel.org/r/20220630223258.4144112-2-vschneid@redhat.com
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:06 -07:00
Neel Natu 9847f21225 lib/cmdline: avoid page fault in next_arg
An argument list like "arg=val arg2 \"" can trigger a page fault if the
page pointed by 'args[0xffffffff]' is not mapped and potential memory
corruption otherwise (unlikely but possible if the bogus address is mapped
and contents happen to match the ascii value of the quote character).

The fix is to ensure that we load 'args[i-1]' only when (i > 0).

Prior to this commit the following command would trigger an
unhandled page fault in the kernel:

root@(none):/linus/fs/fat# insmod ./fat.ko  "foo=bar \""
[   33.870507] BUG: unable to handle page fault for address: ffff888204252608
[   33.872180] #PF: supervisor read access in kernel mode
[   33.873414] #PF: error_code(0x0000) - not-present page
[   33.874650] PGD 4401067 P4D 4401067 PUD 0
[   33.875321] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
[   33.876113] CPU: 16 PID: 399 Comm: insmod Not tainted 5.19.0-dbg-DEV #4
[   33.877193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[   33.878739] RIP: 0010:next_arg+0xd1/0x110
[   33.879399] Code: 22 75 1d 41 c6 04 01 00 41 80 f8 22 74 18 eb 35 4c 89 0e 45 31 d2 4c 89 cf 48 c7 02 00 00 00 00 41 80 f8 22 75 1f 41 8d 42 ff <41> 80 3c 01 22 75 14 41 c6 04 01 00 eb 0d 48 c7 02 00 00 00 00 41
[   33.882338] RSP: 0018:ffffc90001253d08 EFLAGS: 00010246
[   33.883174] RAX: 00000000ffffffff RBX: ffff888104252608 RCX: 0fc317bba1c1dd00
[   33.884311] RDX: ffffc90001253d40 RSI: ffffc90001253d48 RDI: ffff888104252609
[   33.885450] RBP: ffffc90001253d10 R08: 0000000000000022 R09: ffff888104252609
[   33.886595] R10: 0000000000000000 R11: ffffffff82c7ff20 R12: 0000000000000282
[   33.887748] R13: 00000000ffff8000 R14: 0000000000000000 R15: 0000000000007fff
[   33.888887] FS:  00007f04ec7432c0(0000) GS:ffff88813d300000(0000) knlGS:0000000000000000
[   33.890183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   33.891111] CR2: ffff888204252608 CR3: 0000000100f36005 CR4: 0000000000170ee0
[   33.892241] Call Trace:
[   33.892641]  <TASK>
[   33.892989]  parse_args+0x8f/0x220
[   33.893538]  load_module+0x138b/0x15a0
[   33.894149]  ? prepare_coming_module+0x50/0x50
[   33.894879]  ? kernel_read_file_from_fd+0x5f/0x90
[   33.895639]  __se_sys_finit_module+0xce/0x130
[   33.896342]  __x64_sys_finit_module+0x1d/0x20
[   33.897042]  do_syscall_64+0x44/0xa0
[   33.897622]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   33.898434] RIP: 0033:0x7f04ec85ef79
[   33.899009] Code: 48 8d 3d da db 0d 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 9e 0d 00 f7 d8 64 89 01 48
[   33.901912] RSP: 002b:00007fffae81bfe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   33.903081] RAX: ffffffffffffffda RBX: 0000559c5f1d2640 RCX: 00007f04ec85ef79
[   33.904191] RDX: 0000000000000000 RSI: 0000559c5f1d12a0 RDI: 0000000000000003
[   33.905304] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   33.906421] R10: 0000000000000003 R11: 0000000000000246 R12: 0000559c5f1d12a0
[   33.907526] R13: 0000000000000000 R14: 0000559c5f1d25f0 R15: 0000559c5f1d12a0
[   33.908631]  </TASK>
[   33.908986] Modules linked in: fat(+) [last unloaded: fat]
[   33.909843] CR2: ffff888204252608
[   33.910375] ---[ end trace 0000000000000000 ]---
[   33.911172] RIP: 0010:next_arg+0xd1/0x110
[   33.911796] Code: 22 75 1d 41 c6 04 01 00 41 80 f8 22 74 18 eb 35 4c 89 0e 45 31 d2 4c 89 cf 48 c7 02 00 00 00 00 41 80 f8 22 75 1f 41 8d 42 ff <41> 80 3c 01 22 75 14 41 c6 04 01 00 eb 0d 48 c7 02 00 00 00 00 41
[   33.914643] RSP: 0018:ffffc90001253d08 EFLAGS: 00010246
[   33.915446] RAX: 00000000ffffffff RBX: ffff888104252608 RCX: 0fc317bba1c1dd00
[   33.916544] RDX: ffffc90001253d40 RSI: ffffc90001253d48 RDI: ffff888104252609
[   33.917636] RBP: ffffc90001253d10 R08: 0000000000000022 R09: ffff888104252609
[   33.918727] R10: 0000000000000000 R11: ffffffff82c7ff20 R12: 0000000000000282
[   33.919821] R13: 00000000ffff8000 R14: 0000000000000000 R15: 0000000000007fff
[   33.920908] FS:  00007f04ec7432c0(0000) GS:ffff88813d300000(0000) knlGS:0000000000000000
[   33.922125] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   33.923017] CR2: ffff888204252608 CR3: 0000000100f36005 CR4: 0000000000170ee0
[   33.924098] Kernel panic - not syncing: Fatal exception
[   33.925776] Kernel Offset: disabled
[   33.926347] Rebooting in 10 seconds..

Link: https://lkml.kernel.org/r/20220728232434.1666488-1-neelnatu@google.com
Signed-off-by: Neel Natu <neelnatu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:06 -07:00
Ira Weiny defdaff15a checkpatch: add kmap and kmap_atomic to the deprecated list
kmap() and kmap_atomic() are being deprecated in favor of
kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap's pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

kmap_local_page() is safe from any context and is therefore redundant with
kmap_atomic() with the exception of any pagefault or preemption disable
requirements.  However, using kmap_atomic() for these side effects makes
the code less clear.  So any requirement for pagefault or preemption
disable should be made explicitly.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored.

Link: https://lkml.kernel.org/r/20220813220034.806698-1-ira.weiny@intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:06 -07:00
Fabio M. De Francesco 5bb6ce3aeb fs/isofs: replace kmap() with kmap_local_page()
The use of kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap's pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
Tasks can be preempted and, when scheduled to run again, the kernel
virtual addresses are restored and still valid.  It is faster than kmap()
in kernels with HIGHMEM enabled.

Since kmap_local_page() can be safely used in compress.c, it should be
called everywhere instead of kmap().

Therefore, replace kmap() with kmap_local_page() in compress.c.  Where it
is needed, use memzero_page() instead of open coding kmap_local_page()
plus memset() to fill the pages with zeros.  Delete the redundant
flush_dcache_page() in the two call sites of memzero_page().

Tested with mkisofs on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel
with HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220801122709.8164-1-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Pali Rohár <pali@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:05 -07:00
Arnd Bergmann 64367f2e4f treewide: defconfig: address renamed CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO is now implicitly selected if one picks one of the
explicit options that could be DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT,
DEBUG_INFO_DWARF4, DEBUG_INFO_DWARF5.

This was actually not what I had in mind when I suggested making it a
'choice' statement, but it's too late to change again now, and the Kconfig
logic is more sensible in the new form.

Change any defconfig file that had CONFIG_DEBUG_INFO enabled but did not
pick DWARF4 or DWARF5 explicitly to now pick the toolchain default.

Link: https://lkml.kernel.org/r/20220811114609.2097335-1-arnd@kernel.org
Fixes: f9b3cd2457 ("Kconfig.debug: make DEBUG_INFO selectable from a choice")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:05 -07:00
Manfred Spraul 58b5c20336 ipc/util.c: cleanup and improve sysvipc_find_ipc()
sysvipc_find_ipc() can be simplified further:

- It uses a for() loop to locate the next entry in the idr.
  This can be replaced with idr_get_next().

- It receives two parameters (pos - which is actually
  an idr index and not a position, and new_pos, which
  is really a position).
  One parameter is sufficient.

Link: https://lore.kernel.org/all/20210903052020.3265-3-manfred@colorfullife.com/
Link: https://lkml.kernel.org/r/20220805115733.104763-1-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Waiman Long <longman@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: <1vier1@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:05 -07:00
Borislav Petkov 765f2bf04f scripts/decodecode: improve faulting line determination
There are cases where the IP pointer in a Code: line in an oops doesn't
point at the beginning of an instruction:

Code: 0f bd c2 e9 a0 cd b5 e4 48 0f bd c2 e9 97 cd b5 e4 0f 1f 80 00 00 00 00 \
	  e9 8b cd b5 e4 0f 1f 00 66 0f a3 d0 e9 7f cd b5 e4 0f 1f <80> 00 00 00 \
	  00 0f a3 d0 e9 70 cd b5 e4 48 0f a3 d0 e9 67 cd b5

  e9 7f cd b5 e4          jmp    0xffffffffe4b5cda8
  0f 1f 80 00 00 00 00    nopl   0x0(%rax)
	^^

and the current way of determining the faulting instruction line doesn't
work because disassembled instructions are counted from the IP byte to
the end and when that thing points in the middle, the trailing bytes can
be interpreted as different insns:

  Code starting with the faulting instruction
  ===========================================
     0:   80 00 00                addb   $0x0,(%rax)
     3:   00 00                   add    %al,(%rax)

whereas, this is part of

0f 1f 80 00 00 00 00    nopl   0x0(%rax)

     5:   0f a3 d0                bt     %edx,%eax
     ...

leading to:

  1d:   0f 1f 00                nopl   (%rax)
  20:   66 0f a3 d0             bt     %dx,%ax
  24:*  e9 7f cd b5 e4          jmp    0xffffffffe4b5cda8               <-- trapping instruction
  29:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  30:   0f a3 d0                bt     %edx,%eax

which is the wrong faulting instruction.

Change the way the faulting line number is determined by matching the
opcode bytes from the beginning, leading to correct output:

  1d:   0f 1f 00                nopl   (%rax)
  20:   66 0f a3 d0             bt     %dx,%ax
  24:   e9 7f cd b5 e4          jmp    0xffffffffe4b5cda8
  29:*  0f 1f 80 00 00 00 00    nopl   0x0(%rax)                <-- trapping instruction
  30:   0f a3 d0                bt     %edx,%eax

While at it, make decodecode use bash as the interpreter - that thing
should be present on everything by now. It simplifies the code a lot
too.

Link: https://lkml.kernel.org/r/20220808085928.29840-1-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:05 -07:00
Fabio M. De Francesco 9f25f357c5 hfsplus: convert kmap() to kmap_local_page() in btree.c
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap's pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in btree.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in btree.c.

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220809203105.26183-5-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:05 -07:00
Fabio M. De Francesco f9ef3b95a3 hfsplus: convert kmap() to kmap_local_page() in bitmap.c
kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap's pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in bitmap.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in bitmap.c.

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220809203105.26183-4-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:04 -07:00
Fabio M. De Francesco 6c3014a67a hfsplus: convert kmap() to kmap_local_page() in bnode.c
kmap() is being deprecated in favor of kmap_local_page().

Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) it also requires global TLB invalidation when the kmap's pool wraps
and it might block when the mapping space is fully utilized until a slot
becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in bnode.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in bnode.c.  Where
possible, use the suited standard helpers (memzero_page(), memcpy_page())
instead of open coding kmap_local_page() plus memset() or memcpy().

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.

Link: https://lkml.kernel.org/r/20220809203105.26183-3-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:04 -07:00
Fabio M. De Francesco f5b23d6704 hfsplus: unmap the page in the "fail_page" label
Patch series "hfsplus: Replace kmap() with kmap_local_page()".

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts). 
It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in fs/hfsplus is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in fs/hfsplus.  Where
possible, use the suited standard helpers (memzero_page(), memcpy_page())
instead of open coding kmap_local_page() plus memset() or memcpy().

Fix a bug due to a page being not unmapped if the code jumps to the
"fail_page" label (1/4).

Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
HIGHMEM64GB enabled.


This patch (of 4):

Several paths within hfs_btree_open() jump to the "fail_page" label where
put_page() is called while the page is still mapped.

Call kunmap() to unmap the page soon before put_page().

Link: https://lkml.kernel.org/r/20220809203105.26183-1-fmdefrancesco@gmail.com
Link: https://lkml.kernel.org/r/20220809203105.26183-2-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:04 -07:00
Linus Torvalds b90cb10531 Linux 6.0-rc3 2022-08-28 15:05:29 -07:00
Linus Torvalds b467192ec7 Seventeen hotfixes. Mostly memory management things. Ten patches are
cc:stable, addressing pre-6.0 issues.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYwvgrAAKCRDdBJ7gKXxA
 jlweAQC9dzE08Elxl4F7Uvxe+62JWVeflBRrT7sJ6jU1Gu3QcQEAhhI1Xit3/MGq
 pRytDBObGADxlA67c9eNq6J5pCT/7gE=
 =pD67
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more hotfixes from Andrew Morton:
 "Seventeen hotfixes.  Mostly memory management things.

  Ten patches are cc:stable, addressing pre-6.0 issues"

* tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  .mailmap: update Luca Ceresoli's e-mail address
  mm/mprotect: only reference swap pfn page if type match
  squashfs: don't call kmalloc in decompressors
  mm/damon/dbgfs: avoid duplicate context directory creation
  mailmap: update email address for Colin King
  asm-generic: sections: refactor memory_intersects
  bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
  ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
  Revert "memcg: cleanup racy sum avoidance code"
  mm/zsmalloc: do not attempt to free IS_ERR handle
  binder_alloc: add missing mmap_lock calls when using the VMA
  mm: re-allow pinning of zero pfns (again)
  vmcoreinfo: add kallsyms_num_syms symbol
  mailmap: update Guilherme G. Piccoli's email addresses
  writeback: avoid use-after-free after removing device
  shmem: update folio if shmem_replace_page() updates the page
  mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
2022-08-28 14:49:59 -07:00
Linus Torvalds 373eff576e bitmap fixes for v6.0-rc3
Hi Linus,
 
 Please pull (hopefully) the last portion of fixes from Sander for his
 UP rework series. The original series came from -mm tree, and it was
 not the latest version, that's why we need follow-ups. It fixes only
 a test introduced by that series. The test fails under certain configs.
 
 From Sander:
 
 This series fixes the reported issues, and implements the suggested
 improvements, for the version of the cpumask tests [1] that was merged
 with commit c41e8866c2 ("lib/test: introduce cpumask KUnit test
 suite").
 
 These changes include fixes for the tests, and better alignment with the
 KUnit style guidelines.
 
 Thanks,
 Yury
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmMKYcoACgkQsUSA/Tof
 vsiceAv/TY+HTn1gmrNQwi7xC6VUD7mFYlVNZMtyMpZ23UYildz5SjFfuQV3UbXI
 H5yKgSao9VFsbwyDUXbhySgOaNR8auq17Ey3jSJuR2A76qO2u2d79Gdt4IjIkq5N
 IGOPv/pNOur7J+KSbiVhXasFeZGJ6Xi+xAobp5CK1uPCUI3oU1pAcm1iKkI+eWZ3
 tPsM3aWcYGCDec7tqtqcsiWO2x9imPnrpI+C91Pwwr+N40ObkMc4IPzuPrQRn2T2
 ECY9pgIWKOwOJ41jzgCVwZIHmuOn9dEgmaEGvE9Ah57OwuDlS43M4Ok3xy2+xS3t
 3naLG3p02sJy7sXabC+xH4VJVPNT9/qauMW27cntPeeI2i/+yZXuQSLlVOllrY7/
 LYxI8lVb1j50A90I/WrwXoDV0E68cfjhkiqhkgV33t1EamhSJvTG8GwCnF46WG8o
 LzLukvoohA9uIrPAH2YpkZtrvsuT6iQccCY0M+kXv6TuYTgygdE16muVHffDKvsG
 EIVdBGu6
 =oNmV
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.0-rc3' of github.com:/norov/linux

Pull bitmap fixes from Yury Norov:
 "Fix the reported issues, and implements the suggested improvements,
  for the version of the cpumask tests [1] that was merged with commit
  c41e8866c2 ("lib/test: introduce cpumask KUnit test suite").

  These changes include fixes for the tests, and better alignment with
  the KUnit style guidelines"

* tag 'bitmap-6.0-rc3' of github.com:/norov/linux:
  lib/cpumask_kunit: add tests file to MAINTAINERS
  lib/cpumask_kunit: log mask contents
  lib/test_cpumask: follow KUnit style guidelines
  lib/test_cpumask: fix cpu_possible_mask last test
  lib/test_cpumask: drop cpu_possible_mask full test
2022-08-28 14:36:27 -07:00
Luca Ceresoli 0ebafe2ea8 .mailmap: update Luca Ceresoli's e-mail address
My Bootlin address is preferred from now on.

Link: https://lkml.kernel.org/r/20220826130515.3011951-1-luca.ceresoli@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:46 -07:00
Peter Xu 3d2f78f08c mm/mprotect: only reference swap pfn page if type match
Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to
fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]:

  kernel BUG at include/linux/swapops.h:117!
  CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S         O L 6.0.0-dbg-DEV #2
  RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0
  Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6
  c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b
  48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48
  RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282
  RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000
  RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b
  RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000
  R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738
  R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a
  FS:  00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   change_pte_range+0x36e/0x880
   change_p4d_range+0x2e8/0x670
   change_protection_range+0x14e/0x2c0
   mprotect_fixup+0x1ee/0x330
   do_mprotect_pkey+0x34c/0x440
   __x64_sys_mprotect+0x1d/0x30

It triggers because pfn_swap_entry_to_page() could be called upon e.g. a
genuine swap entry.

Fix it by only calling it when it's a write migration entry where the page*
is used.

[1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/

Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com
Fixes: 6c287605fd ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Yu Zhao <yuzhao@google.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:46 -07:00