Since binutils 2.39, ld will print a warning if any stack section is
executable, which is the default for stack sections on files without a
.note.GNU-stack section.
This was fixed for x86 in commit ffcf9c5700 ("x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments"),
but remained broken for UML, resulting in several warnings:
/usr/bin/ld: warning: arch/x86/um/vdso/vdso.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms2.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: vmlinux has a LOAD segment with RWX permissions
Link both the VDSO and vmlinux with -z noexecstack, fixing the warnings
about .note.GNU-stack sections. In addition, pass --no-warn-rwx-segments
to dodge the remaining warnings about LOAD segments with RWX permissions
in the kallsyms objects. (Note that this flag is apparently not
available on lld, so hide it behind a test for BFD, which is what the
x86 patch does.)
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ffcf9c5700e49c0aee42dcba9a12ba21338e8136
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Richard Weinberger <richard@nod.at>
Use READ_ONCE_NOCHECK() when reading the stack to prevent KASAN splats
when dump_stack() is used.
Fixes: 5b301409e8 ("UML: add support for KASAN under x86_64")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
OpenWrt's UML with 5.15 was producing odd errors/warnings during preinit
part of the early userspace portion:
|[ 0.000000] Kernel command line: ubd0=root.img root=98:0 console=tty
|[...]
|[ 0.440000] random: jshn: uninitialized urandom read (4 bytes read)
|[ 0.460000] random: jshn: uninitialized urandom read (4 bytes read)
|/etc/preinit: line 47: can't create /dev/tty: No such device or address
|/etc/preinit: line 48: can't create /dev/tty: No such device or address
|/etc/preinit: line 58: can't open /dev/tty: No such device or address
|[...] repeated many times
That "/dev/tty" came from the command line (which is automatically
added if no console= parameter was specified for the uml binary).
The TLDP project tells the following about the /dev/tty:
<https://tldp.org/HOWTO/Text-Terminal-HOWTO-7.html#ss7.3>
| /dev/tty stands for the controlling terminal (if any) for the current
| process.[...]
| /dev/tty is something like a link to the actually terminal device[..]
The "(if any)" is important here, since it's possible for processes to
not have a controlling terminal.
I think this was a simple typo and the author wanted tty0 there.
CC: Thomas Meyer <thomas@m3y3r.de>
Fixes: d7ffac3363 ("um: stdio_console: Make preferred console")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
GCC has supported asm goto since 4.5, and Clang has since version 9.0.0.
The minimum supported versions of these tools for the build according to
Documentation/process/changes.rst are 5.1 and 11.0.0 respectively.
Remove the feature detection script, Kconfig option, and clean up some
fallback code that is no longer supported.
The removed script was also testing for a GCC specific bug that was
fixed in the 4.7 release.
Also remove workarounds for bpftrace using clang older than 9.0.0, since
other BPF backend fixes are required at this point.
Link: https://lore.kernel.org/lkml/CAK7LNATSr=BXKfkdW8f-H5VT_w=xBpT2ZQcZ7rm6JfkdE+QnmA@mail.gmail.com/
Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48637
Acked-by: Borislav Petkov <bp@suse.de>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit a10fba0377145fccefea4dc4dd5915b7ed87e546: the
proposed API isn't supported on all transports but no
effort was made to address this.
It might not be hard to fix if we want to: maybe just
rename size to size_hint and make sure legacy
transports ignore the hint.
But it's not sure what the benefit is in any case, so
let's drop it.
Fixes: a10fba0377 ("virtio: find_vqs() add arg sizes")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220816053602.173815-8-mst@redhat.com>
A huge patchset supporting vq resize using the
new vq reset capability.
Features, fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmL2F9APHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp00QIAKpxyu+zCtrdDuh68DsNn1Cu0y0PXG336ySy
MA1ck/bv94MZBIbI/Bnn3T1jDmUqTFHJiwaGz/aZ5gGAplZiejhH5Ds3SYjHckaa
MKeJ4FTXin9RESP+bXhv4BgZ+ju3KHHkf1jw3TAdVKQ7Nma1u4E6f8nprYEi0TI0
7gLUYenqzS7X1+v9O3rEvPr7tSbAKXYGYpV82sSjHIb9YPQx5luX1JJIZade8A25
mTt5hG1dP1ugUm1NEBPQHjSvdrvO3L5Ahy0My2Bkd77+tOlNF4cuMPt2NS/6+Pgd
n6oMt3GXqVvw5RxZyY8dpknH5kofZhjgFyZXH0l+aNItfHUs7t0=
=rIo2
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- A huge patchset supporting vq resize using the new vq reset
capability
- Features, fixes, and cleanups all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (88 commits)
vdpa/mlx5: Fix possible uninitialized return value
vdpa_sim_blk: add support for discard and write-zeroes
vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH
vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests
vdpa_sim_blk: check if sector is 0 for commands other than read or write
vdpa_sim: Implement suspend vdpa op
vhost-vdpa: uAPI to suspend the device
vhost-vdpa: introduce SUSPEND backend feature bit
vdpa: Add suspend operation
virtio-blk: Avoid use-after-free on suspend/resume
virtio_vdpa: support the arg sizes of find_vqs()
vhost-vdpa: Call ida_simple_remove() when failed
vDPA: fix 'cast to restricted le16' warnings in vdpa.c
vDPA: !FEATURES_OK should not block querying device config space
vDPA/ifcvf: support userspace to query features and MQ of a management device
vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
vhost scsi: Allow user to control num virtqueues
vhost-scsi: Fix max number of virtqueues
vdpa/mlx5: Support different address spaces for control and data
vdpa/mlx5: Implement susupend virtqueue callback
...
find_vqs() adds a new parameter sizes to specify the size of each vq
vring.
NULL as sizes means that all queues in find_vqs() use the maximum size.
A value in the array is 0, which means that the corresponding queue uses
the maximum size.
In the split scenario, the meaning of size is the largest size, because
it may be limited by memory, the virtio core will try a smaller size.
And the size is power of 2.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220801063902.129329-34-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-net can display the maximum (supported by hardware) ring size in
ethtool -g eth0.
When the subsequent patch implements vring reset, it can judge whether
the ring size passed by the driver is legal based on this.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220801063902.129329-2-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
- Some kmemleak fixes from Patrick Wang and Waiman Long
- DAMON updates from SeongJae Park
- memcg debug/visibility work from Roman Gushchin
- vmalloc speedup from Uladzislau Rezki
- more folio conversion work from Matthew Wilcox
- enhancements for coherent device memory mapping from Alex Sierra
- addition of shared pages tracking and CoW support for fsdax, from
Shiyang Ruan
- hugetlb optimizations from Mike Kravetz
- Mel Gorman has contributed some pagealloc changes to improve latency
and realtime behaviour.
- mprotect soft-dirty checking has been improved by Peter Xu
- Many other singleton patches all over the place
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
=w/UH
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Most of the MM queue. A few things are still pending.
Liam's maple tree rework didn't make it. This has resulted in a few
other minor patch series being held over for next time.
Multi-gen LRU still isn't merged as we were waiting for mapletree to
stabilize. The current plan is to merge MGLRU into -mm soon and to
later reintroduce mapletree, with a view to hopefully getting both
into 6.1-rc1.
Summary:
- The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
- Some kmemleak fixes from Patrick Wang and Waiman Long
- DAMON updates from SeongJae Park
- memcg debug/visibility work from Roman Gushchin
- vmalloc speedup from Uladzislau Rezki
- more folio conversion work from Matthew Wilcox
- enhancements for coherent device memory mapping from Alex Sierra
- addition of shared pages tracking and CoW support for fsdax, from
Shiyang Ruan
- hugetlb optimizations from Mike Kravetz
- Mel Gorman has contributed some pagealloc changes to improve
latency and realtime behaviour.
- mprotect soft-dirty checking has been improved by Peter Xu
- Many other singleton patches all over the place"
[ XFS merge from hell as per Darrick Wong in
https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]
* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
tools/testing/selftests/vm/hmm-tests.c: fix build
mm: Kconfig: fix typo
mm: memory-failure: convert to pr_fmt()
mm: use is_zone_movable_page() helper
hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
hugetlbfs: cleanup some comments in inode.c
hugetlbfs: remove unneeded header file
hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
hugetlbfs: use helper macro SZ_1{K,M}
mm: cleanup is_highmem()
mm/hmm: add a test for cross device private faults
selftests: add soft-dirty into run_vmtests.sh
selftests: soft-dirty: add test for mprotect
mm/mprotect: fix soft-dirty check in can_change_pte_writable()
mm: memcontrol: fix potential oom_lock recursion deadlock
mm/gup.c: fix formatting in check_and_migrate_movable_page()
xfs: fail dax mount if reflink is enabled on a partition
mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
userfaultfd: don't fail on unrecognized features
hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
...
- KASAN support for x86_64
- noreboot command line option, just like qemu's -no-reboot
- Various fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdgfidid8lnn52cLTZvlZhesYu8EFAmLteZYACgkQZvlZhesY
u8F8bRAA2806QUzysg3Nj1AKPiTOj47TuluGu4SXytB0usQgYK/n3Fxr36ULJAOJ
3qZWf2fsAkBLvgX9Sw2QFGfulrpfKnLeTdBXSEbWYWhZ0ZoaEJztKmtfH02kRDOW
POedQT5FXMDVjGQdLC7Ycp+WyjaUwrccZ+KRkGWmlr7vNFlxcTlEqBb13mgLdjkY
ep8X+SgmAcdvWBd/os+nNn9Al6TbFd4XQCok82DtNrv0ggwXnVPov/ArvZvvn2Oj
F028X77180rbrGV+ZnDkV1KSv/ccT5EFebJkfEEcYVjre8o0QoPQmh2tFqXN0d83
2WpIOb1+mQL0VClpC4hKbScpIB5tw8vIHsUT+ifloIgY/puhezx6aWm0TKSA+aTM
WitJl1Nf4uNu1rqkBkn9o3VK8CYokTALQIRexHCzvZ70CSxmFbR7EVRSTf7Rr690
Oq7StHagfuTJpddh0wQwaMorIH4s0/bpPoA6m4OhwlppnCpY0Hfl3+AKluNRUtH6
lPeQwfxhd/LKqYW0COElEnReDLzer82kUx/keVyxVINqxpm6YTHVtOgtMCEuVNXg
GbS8PFCW2mIP8Is6HJavZYCzG8vnz3wZ9GENujanwLemiIJfINDauybu+nNsE5pO
7v12vWeZ0x2HGM/cFxODrpp4xAkdq8BBLap8/aXB8uJFagmYyhs=
=f3Bh
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- KASAN support for x86_64
- noreboot command line option, just like qemu's -no-reboot
- Various fixes and cleanups
* tag 'for-linus-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: include sys/types.h for size_t
um: Replace to_phys() and to_virt() with less generic function names
um: Add missing apply_returns()
um: add "noreboot" command line option for PANIC_TIMEOUT=-1 setups
um: include linux/stddef.h for __always_inline
UML: add support for KASAN under x86_64
mm: Add PAGE_ALIGN_DOWN macro
um: random: Don't initialise hwrng struct with zero
um: remove unused mm_copy_segments
um: remove unused variable
um: Remove straying parenthesis
um: x86: print RIP with symbol
arch: um: Fix build for statically linked UML w/ constructors
x86/um: Kconfig: Fix indentation
um/drivers: Kconfig: Fix indentation
um: Kconfig: Fix indentation
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmLr+2wUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxfZg//eChkC2EUdT6K3zuQDbJJhsGcuOQF
lnZuUyDn4xw7BkEoZf8V6YdAnp7VvgKhLOq1/q3Geu/LBbCaczoEogOCaR/WcVOs
C+MsN0RWZQtgfuZKncQoqp25NeLPK9PFToeiIX/xViAYZF7NVjDY7XQiZHQ6JkEA
/7cUqv/4nS3KCMsKjfmiOxGnqohMWtICiw9qjFvJ40PEDnNB1b53rkiVTxBFePpI
ePfsRfi/C7klE3xNfoiEgrPp+Jfw+oShsCwXUsId7bEL2oLBc7ClqP05ZYZD3bTK
QQYyZ12Cq8TysciYpUGBjBnywUHS5DIO5YaV3wxyVAR2Z+6GY2/QVjOa2kKvoK0o
Hba6TJf8bL58AhSI8Q62pBM0sS7dqJSff+9c2BGpZvII5spP/rQQLlJO56TJjwkw
Dlf0d3thhZOc9vSKjKw+0v0FdAyc4L11EOwUsw95jZeT5WWgqJYGFnWPZwqBI1KM
DI1E5wVO5tA2H3NEn+BTTHbLWL+UppqyXPXBHiW52b2q5Bt8fJWMsFvnEEjclxmG
pYCI7VgF8jqbYKxjobxPFY2x6PH9hfaGMxwzZSdOX6e/Eh+1esgyyaC5APpCO+Pp
e4OkJaOzCmggrD0jYeLWu+yDm5KRrYo5cdfKHrKgAof0Am41lAa1OhJ2iH4ckNqP
1qmHereDOe0zNVw=
=9TAR
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Consolidate duplicated 'next function' scanning and extend to allow
'isolated functions' on s390, similar to existing hypervisors
(Niklas Schnelle)
Resource management:
- Implement pci_iobar_pfn() for sparc, which allows us to remove the
sparc-specific pci_mmap_page_range() and pci_mmap_resource_range().
This removes the ability to map the entire PCI I/O space using
/proc/bus/pci, but we believe that's already been broken since
v2.6.28 (Arnd Bergmann)
- Move common PCI definitions to asm-generic/pci.h and rework others
to be be more specific and more encapsulated in arches that need
them (Stafford Horne)
Power management:
- Convert drivers to new *_PM_OPS macros to avoid need for '#ifdef
CONFIG_PM_SLEEP' or '__maybe_unused' (Bjorn Helgaas)
Virtualization:
- Add ACS quirk for Broadcom BCM5750x multifunction NICs that isolate
the functions but don't advertise an ACS capability (Pavan Chebbi)
Error handling:
- Clear PCI Status register during enumeration in case firmware left
errors logged (Kai-Heng Feng)
- When we have native control of AER, enable error reporting for all
devices that support AER. Previously only a few drivers enabled
this (Stefan Roese)
- Keep AER error reporting enabled for switches. Previously we
enabled this during enumeration but immediately disabled it (Stefan
Roese)
- Iterate over error counters instead of error strings to avoid
printing junk in AER sysfs counters (Mohamed Khalfella)
ASPM:
- Remove pcie_aspm_pm_state_change() so ASPM config changes, e.g.,
via sysfs, are not lost across power state changes (Kai-Heng Feng)
Endpoint framework:
- Don't stop an EPC when unbinding an EPF from it (Shunsuke Mie)
Endpoint embedded DMA controller driver:
- Simplify and clean up support for the DesignWare embedded DMA
(eDMA) controller (Frank Li, Serge Semin)
Broadcom STB PCIe controller driver:
- Avoid config space accesses when link is down because we can't
recover from the CPU aborts these cause (Jim Quinlan)
- Look for power regulators described under Root Ports in DT and
enable them before scanning the secondary bus (Jim Quinlan)
- Disable/enable regulators in suspend/resume (Jim Quinlan)
Freescale i.MX6 PCIe controller driver:
- Simplify and clean up clock and PHY management (Richard Zhu)
- Disable/enable regulators in suspend/resume (Richard Zhu)
- Set PCIE_DBI_RO_WR_EN before writing DBI registers (Richard Zhu)
- Allow speeds faster than Gen2 (Richard Zhu)
- Make link being down a non-fatal error so controller probe doesn't
fail if there are no Endpoints connected (Richard Zhu)
Loongson PCIe controller driver:
- Add ACPI and MCFG support for Loongson LS7A (Huacai Chen)
- Avoid config reads to non-existent LS2K/LS7A devices because a
hardware defect causes machine hangs (Huacai Chen)
- Work around LS7A integrated devices that report incorrect Interrupt
Pin values (Jianmin Lv)
Marvell Aardvark PCIe controller driver:
- Add support for AER and Slot capability on emulated bridge (Pali
Rohár)
MediaTek PCIe controller driver:
- Add Airoha EN7532 to DT binding (John Crispin)
- Allow building of driver for ARCH_AIROHA (Felix Fietkau)
MediaTek PCIe Gen3 controller driver:
- Print decoded LTSSM state when the link doesn't come up (Jianjun
Wang)
NVIDIA Tegra194 PCIe controller driver:
- Convert DT binding to json-schema (Vidya Sagar)
- Add DT bindings and driver support for Tegra234 Root Port and
Endpoint mode (Vidya Sagar)
- Fix some Root Port interrupt handling issues (Vidya Sagar)
- Set default Max Payload Size to 256 bytes (Vidya Sagar)
- Fix Data Link Feature capability programming (Vidya Sagar)
- Extend Endpoint mode support to devices beyond Controller-5 (Vidya
Sagar)
Qualcomm PCIe controller driver:
- Rework clock, reset, PHY power-on ordering to avoid hangs and
improve consistency (Robert Marko, Christian Marangi)
- Move pipe_clk handling to PHY drivers (Dmitry Baryshkov)
- Add IPQ60xx support (Selvam Sathappan Periakaruppan)
- Allow ASPM L1 and substates for 2.7.0 (Krishna chaitanya chundru)
- Add support for more than 32 MSI interrupts (Dmitry Baryshkov)
Renesas R-Car PCIe controller driver:
- Convert DT binding to json-schema (Herve Codina)
- Add Renesas RZ/N1D (R9A06G032) to rcar-gen2 DT binding and driver
(Herve Codina)
Samsung Exynos PCIe controller driver:
- Fix phy-exynos-pcie driver so it follows the 'phy_init() before
phy_power_on()' PHY programming model (Marek Szyprowski)
Synopsys DesignWare PCIe controller driver:
- Simplify and clean up the DWC core extensively (Serge Semin)
- Fix an issue with programming the ATU for regions that cross a 4GB
boundary (Serge Semin)
- Enable the CDM check if 'snps,enable-cdm-check' exists; previously
we skipped it if 'num-lanes' was absent (Serge Semin)
- Allocate a 32-bit DMA-able page to be MSI target instead of using a
driver data structure that may not be addressable with 32-bit
address (Will McVicker)
- Add DWC core support for more than 32 MSI interrupts (Dmitry
Baryshkov)
Xilinx Versal CPM PCIe controller driver:
- Add DT binding and driver support for Versal CPM5 Gen5 Root Port
(Bharat Kumar Gogada)"
* tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (150 commits)
PCI: imx6: Support more than Gen2 speed link mode
PCI: imx6: Set PCIE_DBI_RO_WR_EN before writing DBI registers
PCI: imx6: Reformat suspend callback to keep symmetric with resume
PCI: imx6: Move the imx6_pcie_ltssm_disable() earlier
PCI: imx6: Disable clocks in reverse order of enable
PCI: imx6: Do not hide PHY driver callbacks and refine the error handling
PCI: imx6: Reduce resume time by only starting link if it was up before suspend
PCI: imx6: Mark the link down as non-fatal error
PCI: imx6: Move regulator enable out of imx6_pcie_deassert_core_reset()
PCI: imx6: Turn off regulator when system is in suspend mode
PCI: imx6: Call host init function directly in resume
PCI: imx6: Disable i.MX6QDL clock when disabling ref clocks
PCI: imx6: Propagate .host_init() errors to caller
PCI: imx6: Collect clock enables in imx6_pcie_clk_enable()
PCI: imx6: Factor out ref clock disable to match enable
PCI: imx6: Move imx6_pcie_clk_disable() earlier
PCI: imx6: Move imx6_pcie_enable_ref_clk() earlier
PCI: imx6: Move PHY management functions together
PCI: imx6: Move imx6_pcie_grp_offset(), imx6_pcie_configure_type() earlier
PCI: imx6: Convert to NOIRQ_SYSTEM_SLEEP_PM_OPS()
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmLnDOwACgkQSfxwEqXe
A65Fiw//Z0YaPejSslQIGitQ1b0XzdWBhyJArYDieaaiQRXMqlaSKlIUqHz38xb7
+FykUY51/SJLjHV2riPxq1OK3/MPmk6VlTd0HHihcHVmg77oZcFcv2tPnDpZoqND
TsBOujLbXKwxP8tNFedRY/4+K7w+ue9BTfDjuH7aCtz7uWd+4cNJmPg3x9FCfkMA
+hbcRluwE9W3Pg4OCKwv+qxL0JF3qQtNKEOp1wpnjGAZZW/I9gFNgFBEkykvcAsj
TkIRDc3agPFj6QgDeRIgLdnf9KCsLubKAg5oJneeCvQztJJUCSkn8nQXxpx+4sLo
GsRgvCdfL/GyJqfSAzQJVYDHKtKMkJiCiWCC/oOALR8dzHJfSlULDAjbY1m/DAr9
at+vi4678Or7TNx2ZSaUlCXXKZ+UT7yWMlQWax9JuxGk1hGYP5/eT1AH5SGjqUwF
w1q8oyzxt1vUcnOzEddFXPFirnqqhAk4dQFtu83+xKM4ZssMVyeB4NZdEhAdW0ng
MX+RjrVj4l5gWWuoS0Cx3LUxDCgV6WT0dN+Vl9axAZkoJJbcXLEmXwQ6NbzTLPWg
1/MT7qFTxNcTCeAArMdZvvFbeh7pOBXO42pafrK/7vDRnTMUIw9tqXNLQUfvdFQp
F5flPgiVRHDU2vSzKIFtnPTyXU0RBBGvNb4n0ss2ehH2DSsCxYE=
=Zy3d
-----END PGP SIGNATURE-----
Merge tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"Though there's been a decent amount of RNG-related development during
this last cycle, not all of it is coming through this tree, as this
cycle saw a shift toward tackling early boot time seeding issues,
which took place in other trees as well.
Here's a summary of the various patches:
- The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot
option have been removed, as they overlapped with the more widely
supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and
"random.trust_cpu". This change allowed simplifying a bit of arch
code.
- x86's RDRAND boot time test has been made a bit more robust, with
RDRAND disabled if it's clearly producing bogus results. This would
be a tip.git commit, technically, but I took it through random.git
to avoid a large merge conflict.
- The RNG has long since mixed in a timestamp very early in boot, on
the premise that a computer that does the same things, but does so
starting at different points in wall time, could be made to still
produce a different RNG state. Unfortunately, the clock isn't set
early in boot on all systems, so now we mix in that timestamp when
the time is actually set.
- User Mode Linux now uses the host OS's getrandom() syscall to
generate a bootloader RNG seed and later on treats getrandom() as
the platform's RDRAND-like faculty.
- The arch_get_random_{seed_,}_long() family of functions is now
arch_get_random_{seed_,}_longs(), which enables certain platforms,
such as s390, to exploit considerable performance advantages from
requesting multiple CPU random numbers at once, while at the same
time compiling down to the same code as before on platforms like
x86.
- A small cleanup changing a cmpxchg() into a try_cmpxchg(), from
Uros.
- A comment spelling fix"
More info about other random number changes that come in through various
architecture trees in the full commentary in the pull request:
https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/
* tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: correct spelling of "overwrites"
random: handle archrandom with multiple longs
um: seed rng using host OS rng
random: use try_cmpxchg in _credit_init_bits
timekeeping: contribute wall clock to rng on time change
x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu"
random: remove CONFIG_ARCH_RANDOM
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLko3gQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmQaD/90NKFj4v8I456TUQyg1jimXEsL+e84E6o2
ALWVb6JzQvlPVQXNLnK5YKIunMWOTtTMz0nyB8sVRwVJVJO0P5d7QopAkZM8fkyU
MK5OCzoryENw4DTc2wJS4in6cSbGylIuN74wMzlf7+M67JTImfoZQhbTMcjwzZfn
b3OlL6sID7zMXwGcuOJPZyUJICCpDhzdSF9JXqKma5PQuG2SBmQyvFxJAcsoFBPc
YetnoRIOIN6yBvsIZaPaYq7XI9MIvF0e67EQtyCEHj4tHpyVnyDWkeObVFULsISU
gGEKbkYPvNUzRAU5Q1NBBHh1tTfkf/MaUxTuZwoEwZ/s04IGBGMmrZGyfvdfzYo6
M7NwSEg/TrUSNfTwn65mQi7uOXu1pGkJrqz84Flm8u9Qid9Vd7LExLG5p/ggnWdH
5th93MDEmtEg29e9DXpEAuS5d0t3TtSvosflaKpyfNNfr+P0rWCN6GM/uW62VUTK
ls69SQh/AQJRbg64jU4xper6WhaYtSXK7TKEnxJycoEn9gYNyCcdot2uekth0xRH
ChHGmRlteiqe/y4uFWn/2dcxWjoleiHbFjTaiRL75WVl8wIDEjw02LGuoZ61Ss9H
WOV+MT7KqNjBGe6lreUY+O/PO02dzmoR6heJXN19p8zr/pBuLCTGX7UpO7rzgaBR
4N1HEozvIw==
=celk
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- Improve the type checking of request flags (Bart)
- Ensure queue mapping for a single queues always picks the right queue
(Bart)
- Sanitize the io priority handling (Jan)
- rq-qos race fix (Jinke)
- Reserved tags handling improvements (John)
- Separate memory alignment from file/disk offset aligment for O_DIRECT
(Keith)
- Add new ublk driver, userspace block driver using io_uring for
communication with the userspace backend (Ming)
- Use try_cmpxchg() to cleanup the code in various spots (Uros)
- Finally remove bdevname() (Christoph)
- Clean up the zoned device handling (Christoph)
- Clean up independent access range support (Christoph)
- Clean up and improve block sysfs handling (Christoph)
- Clean up and improve teardown of block devices.
This turns the usual two step process into something that is simpler
to implement and handle in block drivers (Christoph)
- Clean up chunk size handling (Christoph)
- Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
Ming, Sebastian, Yang, Ying)
* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
ublk_drv: fix double shift bug
ublk_drv: make sure that correct flags(features) returned to userspace
ublk_drv: fix error handling of ublk_add_dev
ublk_drv: fix lockdep warning
block: remove __blk_get_queue
block: call blk_mq_exit_queue from disk_release for never added disks
blk-mq: fix error handling in __blk_mq_alloc_disk
ublk: defer disk allocation
ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
ublk: cleanup ublk_ctrl_uring_cmd
ublk: simplify ublk_ch_open and ublk_ch_release
ublk: remove the empty open and release block device operations
ublk: remove UBLK_IO_F_PREFLUSH
ublk: add a MAINTAINERS entry
block: don't allow the same type rq_qos add more than once
mmc: fix disk/queue leak in case of adding disk failure
ublk_drv: fix an IS_ERR() vs NULL check
ublk: remove UBLK_IO_F_INTEGRITY
ublk_drv: remove unneeded semicolon
...
The archrandom interface was originally designed for x86, which supplies
RDRAND/RDSEED for receiving random words into registers, resulting in
one function to generate an int and another to generate a long. However,
other architectures don't follow this.
On arm64, the SMCCC TRNG interface can return between one and three
longs. On s390, the CPACF TRNG interface can return arbitrary amounts,
with four longs having the same cost as one. On UML, the os_getrandom()
interface can return arbitrary amounts.
So change the api signature to take a "max_longs" parameter designating
the maximum number of longs requested, and then return the number of
longs generated.
Since callers need to check this return value and loop anyway, each arch
implementation does not bother implementing its own loop to try again to
fill the maximum number of longs. Additionally, all existing callers
pass in a constant max_longs parameter. Taken together, these two things
mean that the codegen doesn't really change much for one-word-at-a-time
platforms, while performance is greatly improved on platforms such as
s390.
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The isa_dma_bridge_buggy symbol is only used for x86_32, and only x86_32
platforms or quirks ever set it.
Add a new linux/isa-dma.h header that #defines isa_dma_bridge_buggy to 0
except on x86_32, where we keep it as a variable, and remove all the arch-
specific definitions.
[bhelgaas: commit log]
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/20220722214944.831438-3-shorne@gmail.com
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
pci_get_legacy_ide_irq() is only used on platforms that support PNP, so
many architectures define it but never use it. Replace uses of it with
ATA_PRIMARY_IRQ() and ATA_SECONDARY_IRQ(), which provide the same
functionality.
Since pci_get_legacy_ide_irq() is no longer used, remove all the
architecture-specific definitions of it as well as asm-generic/pci.h, which
only provides pci_get_legacy_ide_irq()
[bhelgaas: commit log]
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20220722214944.831438-2-shorne@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
UML generally does not provide access to special CPU instructions like
RDRAND, and execution tends to be rather deterministic, with no real
hardware interrupts, making good randomness really very hard, if not
all together impossible. Not only is this a security eyebrow raiser, but
it's also quite annoying when trying to do various pieces of UML-based
automation that takes a long time to boot, if ever.
Fix this by trivially calling getrandom() in the host and using that
seed as "bootloader randomness", which initializes the rng immediately
at UML boot.
The old behavior can be restored the same way as on any other arch, by
way of CONFIG_TRUST_BOOTLOADER_RANDOMNESS=n or
random.trust_bootloader=0. So seen from that perspective, this just
makes UML act like other archs, which is positive in its own right.
Additionally, wire up arch_get_random_{int,long}() in the same way, so
that reseeds can also make use of the host RNG, controllable by
CONFIG_TRUST_CPU_RANDOMNESS and random.trust_cpu, per usual.
Cc: stable@vger.kernel.org
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Now all the platforms enable ARCH_HAS_GET_PAGE_PROT. They define and
export own vm_get_page_prot() whether custom or standard
DECLARE_VM_GET_PAGE_PROT. Hence there is no need for default generic
fallback for vm_get_page_prot(). Just drop this fallback and also
ARCH_HAS_GET_PAGE_PROT mechanism.
Link: https://lkml.kernel.org/r/20220711070600.2378316-27-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This enables ARCH_HAS_VM_GET_PAGE_PROT on the platform and exports
standard vm_get_page_prot() implementation via DECLARE_VM_GET_PAGE_PROT,
which looks up a private and static protection_map[] array. Subsequently
all __SXXX and __PXXX macros can be dropped which are no longer needed.
Link: https://lkml.kernel.org/r/20220711070600.2378316-25-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Usually size_t comes from sys/types.h, not stddef.h. This code likely
worked only because something else in its usage chain was pulling in
sys/types.h. stddef.h is still required for NULL, however, so note this.
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
to_virt() and to_phys() are very generic and may be defined by drivers.
As it turns out, commit 9409c9b670 ("pmem: refactor pmem_clear_poison()")
did exactly that. This results in build errors such as the following
when trying to build um:allmodconfig.
drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
./arch/um/include/asm/page.h:105:20: error:
too few arguments to function ‘to_phys’
105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
| ^~~~~~~
Use less generic function names for the um specific to_phys() and to_virt()
functions to fix the problem and to avoid similar problems in the future.
Fixes: 9409c9b670 ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Implement apply_returns() stub for UM, just like all the other patching
routines.
Fixes: 15e67227c4 ("x86: Undo return-thunk damage")
Reported-by: Randy Dunlap <rdunlap@infradead.org)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
QEMU has a -no-reboot option, which halts instead of reboots when the
guest asks to reboot. This is invaluable when used with
CONFIG_PANIC_TIMEOUT=-1 (and panic_on_warn), because it allows panics
and warnings to be caught immediately in CI. Implement this in UML too,
by way of a basic setup param.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Make KASAN run on User Mode Linux on x86_64.
The UML-specific KASAN initializer uses mmap to map the ~16TB of shadow
memory to the location defined by KASAN_SHADOW_OFFSET. kasan_init()
utilizes constructors to initialize KASAN before main().
The location of the KASAN shadow memory, starting at
KASAN_SHADOW_OFFSET, can be configured using the KASAN_SHADOW_OFFSET
option. The default location of this offset is 0x100000000000, which
keeps it out-of-the-way even on UML setups with more "physical" memory.
For low-memory setups, 0x7fff8000 can be used instead, which fits in an
immediate and is therefore faster, as suggested by Dmitry Vyukov. There
is usually enough free space at this location; however, it is a config
option so that it can be easily changed if needed.
Note that, unlike KASAN on other architectures, vmalloc allocations
still use the shadow memory allocated upfront, rather than allocating
and free-ing it per-vmalloc allocation.
If another architecture chooses to go down the same path, we should
replace the checks for CONFIG_UML with something more generic, such
as:
- A CONFIG_KASAN_NO_SHADOW_ALLOC option, which architectures could set
- or, a way of having architecture-specific versions of these vmalloc
and module shadow memory allocation options.
Also note that, while UML supports both KASAN in inline mode
(CONFIG_KASAN_INLINE) and static linking (CONFIG_STATIC_LINK), it does
not support both at the same time.
Signed-off-by: Patricia Alfonso <trishalfonso@google.com>
Co-developed-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Initialising the hwrng struct with zeros causes a
compile-time sparse warning:
$ ARCH=um make -j10 W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
...
CHECK arch/um/drivers/random.c
arch/um/drivers/random.c:31:31: sparse: warning: Using plain integer as NULL pointer
Fix the warning by not initialising the hwrng struct
with zeros as it is initialised anyway during module
init.
Fixes: 72d3e093af ("um: random: Register random as hwrng-core device")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
It was already removed by commit c17c02040b ("arch: remove unused
*_segments() macros/functions") but seems to have been accidentally
reintroduced by commit 0500871f21 ("Construct init thread stack in the
linker script rather than by union"). Remove it for good.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Richard Weinberger <richard@nod.at>
The variable dead is initialized but never used otherwise.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Commit e3a33af812 ("um: fix and optimize xor select template for CONFIG64 and timetravel mode")
caused a build regression when CONFIG_XOR_BLOCKS and CONFIG_UML_TIME_TRAVEL_SUPPORT
are selected.
Fix it by removing the straying parenthesis.
Cc: stable@vger.kernel.org
Fixes: e3a33af812 ("um: fix and optimize xor select template for CONFIG64 and timetravel mode")
Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de>
[rw: Added commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>
If CONFIG_CONSTUCTORS is enabled on a statically linked
(CONFIG_STATIC_LINK=y) build of UML, the build fails due to the
.eh_frame section being both used and discarded:
ERROR:root:`.eh_frame' referenced in section `.text' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o: defined in discarded section `.eh_frame' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o
`.eh_frame' referenced in section `.text' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o: defined in discarded section `.eh_frame' of /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginT.o
Instead, keep the .eh_frame section, as we do in dyn.lds.S for
dynamically linked UML.
This can be reproduced with:
./tools/testing/kunit/kunit.py run --kconfig_add CONFIG_STATIC_LINK=y --kconfig_add CONFIG_GCOV_KERNEL=y --kconfig_add CONFIG_DEBUG_FS=y
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The convention for indentation seems to be a single tab. Help text is
further indented by an additional two whitespaces. Fix the lines that
violate these rules.
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The convention for indentation seems to be a single tab. Help text is
further indented by an additional two whitespaces. Fix the lines that
violate these rules.
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
can accomodate a XenPV guest due to how the latter is setting up the PAT
machinery
Now that the retbleed nightmare is public, here's the first round of
fallout fixes:
- Fix a build failure on 32-bit due to missing include
- Remove an untraining point in espfix64 return path
- other small cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLTudcACgkQEsHwGGHe
VUrY4BAAtWm7wC6T8rzovbsyticj6kcehRMBEXxtlEP5LOeltR0dbNaIGskrS2Li
Q9YxxtQhbZPXqzqB+xeHVhDPThzsd3+wRvvetmR4fW/c3XCYr+fLLFjHj0NEvX0P
lQzuY8GKWGU/QTrjKSKclGvqyB692Fvdu4YImlnrGSbR6ywwVttditd3YNJR0w1q
7S4zq90uwWuX6cTLqXIcbKbhssOjcR1Agj9+bE8i+rzyB2VtNoihJCJh0pTJAn3P
RaXnxI/7J6Y+5imPf5/ywu8gxhvGBTy5MU/1v2pw939EurU9tmhVkNVWdO2g/qYY
V+Y1nj9xV0ucL4hlUBqAFdM+5jFC89Ey1X2tSgUgSl+44L/d8IIjVCp6inVoyCzE
Olbc6q7A/V8PNXfo4g6gDwVc3Ii53Fwgtu8xVHkwPGfjly6+yZ9O/RUBXcBAOnpU
jfS9LSc/Ro7kxqFy32beUgB7wwhMpkYuHe6ECxrvXj1IK13y3OkdxFzm03ty7S/E
BwjrkltDia9BQ6i4Ywy+qSBYkSH6+sxxt4pboB+ft6/p2JIw4YJIp9PRqzPAG0jx
JTjcZ9YAr7zPNlWp5e30BjGawgbuKPq0wkF1r6QD+3VzNf9+SSDmtkYGWeAyTTP2
SkzLy5QCNBTeeq0FZVYZFm/vcF7wccQhQdwNcezxygKAmihH0xw=
=ND3p
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Improve the check whether the kernel supports WP mappings so that it
can accomodate a XenPV guest due to how the latter is setting up the
PAT machinery
- Now that the retbleed nightmare is public, here's the first round of
fallout fixes:
* Fix a build failure on 32-bit due to missing include
* Remove an untraining point in espfix64 return path
* other small cleanups
* tag 'x86_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Remove apostrophe typo
um: Add missing apply_returns()
x86/entry: Remove UNTRAIN_RET from native_irq_return_ldt
x86/bugs: Mark retbleed_strings static
x86/pat: Fix x86_has_pat_wp()
x86/asm/32: Fix ANNOTATE_UNRET_SAFE use on 32-bit
The UML function names to_virt() and to_phys() are exposed by UML
headers, and are very generic and may be defined by drivers. As it
turns out, commit 9409c9b670 ("pmem: refactor pmem_clear_poison()")
did exactly that.
This results in build errors such as the following when trying to build
um:allmodconfig:
drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
./arch/um/include/asm/page.h:105:20: error: too few arguments to function ‘to_phys’
105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
| ^~~~~~~
Use less generic function names for the um specific to_phys() and
to_virt() functions to fix the problem and to avoid similar problems in
the future.
Fixes: 9409c9b670 ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Improve static type checking by using type enum req_op instead of int where
appropriate.
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-21-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Implement apply_returns() stub for UM, just like all the other patching
routines.
Fixes: 15e67227c4 ("x86: Undo return-thunk damage")
Reported-by: Randy Dunlap <rdunlap@infradead.org)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/Ys%2Ft45l%2FgarIrD0u@worktop.programming.kicks-ass.net
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now,
so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
I observed that for each of the shared file-backed page faults, we're very
likely to retry one more time for the 1st write fault upon no page. It's
because we'll need to release the mmap lock for dirty rate limit purpose
with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
Then after that throttling we return VM_FAULT_RETRY.
We did that probably because VM_FAULT_RETRY is the only way we can return
to the fault handler at that time telling it we've released the mmap lock.
However that's not ideal because it's very likely the fault does not need
to be retried at all since the pgtable was well installed before the
throttling, so the next continuous fault (including taking mmap read lock,
walk the pgtable, etc.) could be in most cases unnecessary.
It's not only slowing down page faults for shared file-backed, but also add
more mmap lock contention which is in most cases not needed at all.
To observe this, one could try to write to some shmem page and look at
"pgfault" value in /proc/vmstat, then we should expect 2 counts for each
shmem write simply because we retried, and vm event "pgfault" will capture
that.
To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
show that we've completed the whole fault and released the lock. It's also
a hint that we should very possibly not need another fault immediately on
this page because we've just completed it.
This patch provides a ~12% perf boost on my aarch64 test VM with a simple
program sequentially dirtying 400MB shmem file being mmap()ed and these are
the time it needs:
Before: 650.980 ms (+-1.94%)
After: 569.396 ms (+-1.38%)
I believe it could help more than that.
We need some special care on GUP and the s390 pgfault handler (for gmap
code before returning from pgfault), the rest changes in the page fault
handlers should be relatively straightforward.
Another thing to mention is that mm_account_fault() does take this new
fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
them as-is.
Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vineet Gupta <vgupta@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm part]
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Will Deacon <will@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Chris Zankel <chris@zankel.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Call virtio_device_ready() to make this driver work after commit
b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the
virtqueues in the probe function. (The virtio core sets the device
ready when probe returns.)
Fixes: 8b4ec69d7e ("virtio: harden vring IRQ")
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
of Peter Zijlstra was encountering with ptrace in his freezer rewrite
I identified some cleanups to ptrace_stop that make sense on their own
and move make resolving the other problems much simpler.
The biggest issue is the habbit of the ptrace code to change task->__state
from the tracer to suppress TASK_WAKEKILL from waking up the tracee. No
other code in the kernel does that and it is straight forward to update
signal_wake_up and friends to make that unnecessary.
Peter's task freezer sets frozen tasks to a new state TASK_FROZEN and
then it stores them by calling "wake_up_state(t, TASK_FROZEN)" relying
on the fact that all stopped states except the special stop states can
tolerate spurious wake up and recover their state.
The state of stopped and traced tasked is changed to be stored in
task->jobctl as well as in task->__state. This makes it possible for
the freezer to recover tasks in these special states, as well as
serving as a general cleanup. With a little more work in that
direction I believe TASK_STOPPED can learn to tolerate spurious wake
ups and become an ordinary stop state.
The TASK_TRACED state has to remain a special state as the registers for
a process are only reliably available when the process is stopped in
the scheduler. Fundamentally ptrace needs acess to the saved
register values of a task.
There are bunch of semi-random ptrace related cleanups that were found
while looking at these issues.
One cleanup that deserves to be called out is from commit 57b6de08b5
("ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs"). This
makes a change that is technically user space visible, in the handling
of what happens to a tracee when a tracer dies unexpectedly.
According to our testing and our understanding of userspace nothing
cares that spurious SIGTRAPs can be generated in that case.
The entire discussion can be found at:
https://lkml.kernel.org/r/87a6bv6dl6.fsf_-_@email.froward.int.ebiederm.org
Eric W. Biederman (11):
signal: Rename send_signal send_signal_locked
signal: Replace __group_send_sig_info with send_signal_locked
ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
ptrace: Remove arch_ptrace_attach
signal: Use lockdep_assert_held instead of assert_spin_locked
ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
ptrace: Document that wait_task_inactive can't fail
ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
ptrace: Don't change __state
ptrace: Always take siglock in ptrace_resume
Peter Zijlstra (1):
sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
arch/ia64/include/asm/ptrace.h | 4 --
arch/ia64/kernel/ptrace.c | 57 ----------------
arch/um/include/asm/thread_info.h | 2 +
arch/um/kernel/exec.c | 2 +-
arch/um/kernel/process.c | 2 +-
arch/um/kernel/ptrace.c | 8 +--
arch/um/kernel/signal.c | 4 +-
arch/x86/kernel/step.c | 3 +-
arch/xtensa/kernel/ptrace.c | 4 +-
arch/xtensa/kernel/signal.c | 4 +-
drivers/tty/tty_jobctrl.c | 4 +-
include/linux/ptrace.h | 7 --
include/linux/sched.h | 10 ++-
include/linux/sched/jobctl.h | 8 +++
include/linux/sched/signal.h | 20 ++++--
include/linux/signal.h | 3 +-
kernel/ptrace.c | 87 ++++++++---------------
kernel/sched/core.c | 5 +-
kernel/signal.c | 140 +++++++++++++++++---------------------
kernel/time/posix-cpu-timers.c | 6 +-
20 files changed, 140 insertions(+), 240 deletions(-)
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEgjlraLDcwBA2B+6cC/v6Eiajj0AFAmKaXaYACgkQC/v6Eiaj
j0CgoA/+JncSQ6PY2D5Jh1apvHzmnRsFXzr3DRvtv/CVx4oIebOXRQFyVDeD5tRn
TmMgB29HpBlHRDLojlmlZRGAld1HR/aPEW9j8W1D3Sy/ZFO5L8lQitv9aDHO9Ntw
4lZvlhS1M0KhATudVVBqSPixiG6CnV5SsGmixqdOyg7xcXSY6G1l2nB7Zk9I3Tat
ZlmhuZ6R5Z5qsm4MEq0vUSrnsHiGxYrpk6uQOaVz8Wkv8ZFmbutt6XgxF0tsyZNn
mHSmWSiZzIgBjTlaibEmxi8urYJTPj3vGBeJQVYHblFwLFi6+Oy7bDxQbWjQvaZh
DsgWPScfBF4Jm0+8hhCiSYpvPp8XnZuklb4LNCeok/VFr+KfSmpJTIhn00kagQ1u
vxQDqLws8YLW4qsfGydfx9uUIFCbQE/V2VDYk5J3Re3gkUNDOOR1A56hPniKv6VB
2aqGO2Fl0RdBbUa3JF+XI5Pwq5y1WrqR93EUvj+5+u5W9rZL/8WLBHBMEz6gbmfD
DhwFE0y8TG2WRlWJVEDRId+5zo3di/YvasH0vJZ5HbrxhS2RE/yIGAd+kKGx/lZO
qWDJC7IHvFJ7Mw5KugacyF0SHeNdloyBM7KZW6HeXmgKn9IMJBpmwib92uUkRZJx
D8j/bHHqD/zsgQ39nO+c4M0MmhO/DsPLG/dnGKrRCu7v1tmEnkY=
=ZUuO
-----END PGP SIGNATURE-----
Merge tag 'ptrace_stop-cleanup-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ptrace_stop cleanups from Eric Biederman:
"While looking at the ptrace problems with PREEMPT_RT and the problems
Peter Zijlstra was encountering with ptrace in his freezer rewrite I
identified some cleanups to ptrace_stop that make sense on their own
and move make resolving the other problems much simpler.
The biggest issue is the habit of the ptrace code to change
task->__state from the tracer to suppress TASK_WAKEKILL from waking up
the tracee. No other code in the kernel does that and it is straight
forward to update signal_wake_up and friends to make that unnecessary.
Peter's task freezer sets frozen tasks to a new state TASK_FROZEN and
then it stores them by calling "wake_up_state(t, TASK_FROZEN)" relying
on the fact that all stopped states except the special stop states can
tolerate spurious wake up and recover their state.
The state of stopped and traced tasked is changed to be stored in
task->jobctl as well as in task->__state. This makes it possible for
the freezer to recover tasks in these special states, as well as
serving as a general cleanup. With a little more work in that
direction I believe TASK_STOPPED can learn to tolerate spurious wake
ups and become an ordinary stop state.
The TASK_TRACED state has to remain a special state as the registers
for a process are only reliably available when the process is stopped
in the scheduler. Fundamentally ptrace needs acess to the saved
register values of a task.
There are bunch of semi-random ptrace related cleanups that were found
while looking at these issues.
One cleanup that deserves to be called out is from commit 57b6de08b5
("ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs"). This
makes a change that is technically user space visible, in the handling
of what happens to a tracee when a tracer dies unexpectedly. According
to our testing and our understanding of userspace nothing cares that
spurious SIGTRAPs can be generated in that case"
* tag 'ptrace_stop-cleanup-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
ptrace: Always take siglock in ptrace_resume
ptrace: Don't change __state
ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
ptrace: Document that wait_task_inactive can't fail
ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
signal: Use lockdep_assert_held instead of assert_spin_locked
ptrace: Remove arch_ptrace_attach
ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
signal: Replace __group_send_sig_info with send_signal_locked
signal: Rename send_signal send_signal_locked
ordinary user mode tasks.
In commit 40966e316f ("kthread: Ensure struct kthread is present for
all kthreads") caused init and the user mode helper threads that call
kernel_execve to have struct kthread allocated for them. This struct
kthread going away during execve in turned made a use after free of
struct kthread possible.
The commit 343f4c49f2 ("kthread: Don't allocate kthread_struct for
init and umh") is enough to fix the use after free and is simple enough
to be backportable.
The rest of the changes pass struct kernel_clone_args to clean things
up and cause the code to make sense.
In making init and the user mode helpers tasks purely user mode tasks
I ran into two complications. The function task_tick_numa was
detecting tasks without an mm by testing for the presence of
PF_KTHREAD. The initramfs code in populate_initrd_image was using
flush_delayed_fput to ensuere the closing of all it's file descriptors
was complete, and flush_delayed_fput does not work in a userspace thread.
I have looked and looked and more complications and in my code review
I have not found any, and neither has anyone else with the code sitting
in linux-next.
Link: https://lkml.kernel.org/r/87mtfu4up3.fsf@email.froward.int.ebiederm.org
Eric W. Biederman (8):
kthread: Don't allocate kthread_struct for init and umh
fork: Pass struct kernel_clone_args into copy_thread
fork: Explicity test for idle tasks in copy_thread
fork: Generalize PF_IO_WORKER handling
init: Deal with the init process being a user mode process
fork: Explicitly set PF_KTHREAD
fork: Stop allowing kthreads to call execve
sched: Update task_tick_numa to ignore tasks without an mm
arch/alpha/kernel/process.c | 13 ++++++------
arch/arc/kernel/process.c | 13 ++++++------
arch/arm/kernel/process.c | 12 ++++++-----
arch/arm64/kernel/process.c | 12 ++++++-----
arch/csky/kernel/process.c | 15 ++++++-------
arch/h8300/kernel/process.c | 10 ++++-----
arch/hexagon/kernel/process.c | 12 ++++++-----
arch/ia64/kernel/process.c | 15 +++++++------
arch/m68k/kernel/process.c | 12 ++++++-----
arch/microblaze/kernel/process.c | 12 ++++++-----
arch/mips/kernel/process.c | 13 ++++++------
arch/nios2/kernel/process.c | 12 ++++++-----
arch/openrisc/kernel/process.c | 12 ++++++-----
arch/parisc/kernel/process.c | 18 +++++++++-------
arch/powerpc/kernel/process.c | 15 +++++++------
arch/riscv/kernel/process.c | 12 ++++++-----
arch/s390/kernel/process.c | 12 ++++++-----
arch/sh/kernel/process_32.c | 12 ++++++-----
arch/sparc/kernel/process_32.c | 12 ++++++-----
arch/sparc/kernel/process_64.c | 12 ++++++-----
arch/um/kernel/process.c | 15 +++++++------
arch/x86/include/asm/fpu/sched.h | 2 +-
arch/x86/include/asm/switch_to.h | 8 +++----
arch/x86/kernel/fpu/core.c | 4 ++--
arch/x86/kernel/process.c | 18 +++++++++-------
arch/xtensa/kernel/process.c | 17 ++++++++-------
fs/exec.c | 8 ++++---
include/linux/sched/task.h | 8 +++++--
init/initramfs.c | 2 ++
init/main.c | 2 +-
kernel/fork.c | 46 +++++++++++++++++++++++++++++++++-------
kernel/sched/fair.c | 2 +-
kernel/umh.c | 6 +++---
33 files changed, 234 insertions(+), 160 deletions(-)
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEgjlraLDcwBA2B+6cC/v6Eiajj0AFAmKaR/MACgkQC/v6Eiaj
j0Aayg/7Bx66872d9c6igkJ+MPCTuh+v9QKCGwiYEmiU4Q5sVAFB0HPJO27qC14u
630X0RFNZTkPzNNEJNIW4kw6Dj8s8YRKf+FgQAVt4SzdRwT7eIPDjk1nGraopPJ3
O04pjvuTmUyidyViRyFcf2ptx/pnkrwP8jUSc+bGTgfASAKAgAokqKE5ecjewbBc
Y/EAkQ6QW7KxPjeSmpAHwI+t3BpBev9WEC4PbhRhsBCQFO2+PJiklvqdhVNBnIjv
qUezll/1xv9UYgniB15Q4Nb722SmnWSU3r8as1eFPugzTHizKhufrrpyP+KMK1A0
tdtEJNs5t2DZF7ZbGTFSPqJWmyTYLrghZdO+lOmnaSjHxK4Nda1d4NzbefJ0u+FE
tutewowvHtBX6AFIbx+H3O+DOJM2IgNMf+ReQDU/TyNyVf3wBrTbsr9cLxypIJIp
zze8npoLMlB7B4yxVo5ES5e63EXfi3iHl0L3/1EhoGwriRz1kWgVLUX/VZOUpscL
RkJHsW6bT8sqxPWAA5kyWjEN+wNR2PxbXi8OE4arT0uJrEBMUgDCzydzOv5tJB00
mSQdytxH9LVdsmxBKAOBp5X6WOLGA4yb1cZ6E/mEhlqXMpBDF1DaMfwbWqxSYi4q
sp5zU3SBAW0qceiZSsWZXInfbjrcQXNV/DkDRDO9OmzEZP4m1j0=
=x6fy
-----END PGP SIGNATURE-----
Merge tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull kthread updates from Eric Biederman:
"This updates init and user mode helper tasks to be ordinary user mode
tasks.
Commit 40966e316f ("kthread: Ensure struct kthread is present for
all kthreads") caused init and the user mode helper threads that call
kernel_execve to have struct kthread allocated for them. This struct
kthread going away during execve in turned made a use after free of
struct kthread possible.
Here, commit 343f4c49f2 ("kthread: Don't allocate kthread_struct for
init and umh") is enough to fix the use after free and is simple
enough to be backportable.
The rest of the changes pass struct kernel_clone_args to clean things
up and cause the code to make sense.
In making init and the user mode helpers tasks purely user mode tasks
I ran into two complications. The function task_tick_numa was
detecting tasks without an mm by testing for the presence of
PF_KTHREAD. The initramfs code in populate_initrd_image was using
flush_delayed_fput to ensuere the closing of all it's file descriptors
was complete, and flush_delayed_fput does not work in a userspace
thread.
I have looked and looked and more complications and in my code review
I have not found any, and neither has anyone else with the code
sitting in linux-next"
* tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
sched: Update task_tick_numa to ignore tasks without an mm
fork: Stop allowing kthreads to call execve
fork: Explicitly set PF_KTHREAD
init: Deal with the init process being a user mode process
fork: Generalize PF_IO_WORKER handling
fork: Explicity test for idle tasks in copy_thread
fork: Pass struct kernel_clone_args into copy_thread
kthread: Don't allocate kthread_struct for init and umh
- Various cleanups and fixes: xterm, serial line, time travel
- Set ARCH_HAS_GCOV_PROFILE_ALL
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmKZz4UWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wRrZD/951tWxCjNiSZYL8B32YnaxqJLf
vwDwXWxYbXmtLgVhqTGCQ61qNFhKnGVeyfO2niacB5EB3VFzxtB2dXTswDpcu/93
o99/Dozisehcn9L3CbGO0sKCxKZbdgP4TQPOGQpr8lyzv9NlmNF/bgkiH8s8rIB/
ACaiKmLrAVIyQz/8VElFqJNyB/RkmcILks//jVidFlueZhmYMSzbfdVDFyTDDJMA
JEmIz+SG2hg15yCy620EpWgskHvSQSNhWMv4wxJVy4XRdZ7nztb9babV3lIaRmaI
8rBR+DsLGlDeep3SOv63giFzjMjHpXcAlJ0UoefkaRA4htSP4GmyDPHX6Fo6ilW4
fQ52lxsHmP+6fLmWNOnFjsvk93z9u3XU55ReEhP1PGzgAGDNUczy8BEYvb6l0Weq
M8BdYgU2nh/SA0ycmXdSVbyl7nFST5s0lHr6hEt22CMxJ+jz6WS650vrnH2JimMX
bnJMcEYX6PeyDq4lTYyrWOCdzPTorT6eEcn/BM3qKgWkosK0FlN+nnT0HOef5Bag
jezo4/dt/VPfftKQS28Waufud/nnJD2oFAvFBG0/YHTooTZMkSnH2LnRbYkFo7vS
xkDRnHJGSDOaxzdSfgNlwveqZ/qTgjNs1CUYHBrs2Tj1dJicqro+wWjZg8qzCHI+
8YgCQjaEOYLN6CN0CA==
=hD4o
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- Various cleanups and fixes: xterm, serial line, time travel
- Set ARCH_HAS_GCOV_PROFILE_ALL
* tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Fix out-of-bounds read in LDT setup
um: chan_user: Fix winch_tramp() return value
um: virtio_uml: Fix broken device handling in time-travel
um: line: Use separate IRQs per line
um: Enable ARCH_HAS_GCOV_PROFILE_ALL
um: Use asm-generic/dma-mapping.h
um: daemon: Make default socket configurable
um: xterm: Make default terminal emulator configurable
The previous fix here was only partially correct, it did
result in returning a proper error value in case of error,
but it also clobbered the pid that we need to return from
this function (not just zero for success).
As a result, it returned 0 here, but later this is treated
as a pid and used to kill the process, but since it's now
0 we kill(0, SIGKILL), which makes UML kill itself rather
than just the helper thread.
Fix that and make it more obvious by using a separate
variable for the pid.
Fixes: ccf1236eca ("um: fix error return code in winch_tramp()")
Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
If a device implementation crashes, virtio_uml will mark it
as dead by calling virtio_break_device() and scheduling the
work that will remove it.
This still seems like the right thing to do, but it's done
directly while reading the message, and if time-travel is
used, this is in the time-travel handler, outside of the
normal Linux machinery. Therefore, we cannot acquire locks
or do normal "linux-y" things because e.g. lockdep will be
confused about the context.
Move handling this situation out of the read function and
into the actual IRQ handler and response handling instead,
so that in the case of time-travel we don't call it in the
wrong context.
Chances are the system will still crash immediately, since
the device implementation crashing may also cause the time-
travel controller to go down, but at least all of that now
happens without strange warnings from lockdep.
Fixes: c8177aba37 ("um: time-travel: rework interrupt handling in ext mode")
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Today, all possible serial lines (ssl*=) as well as all
possible consoles (con*=) each share a single interrupt
(with a fixed number) with others of the same type.
Now, if you have two lines, say ssl0 and ssl1, and one
of them is connected to an fd you cannot read (e.g. a
file), but the other gets a read interrupt, then both
of them get the interrupt since it's shared. Then, the
read() call will return EOF, since it's a file being
written and there's nothing to read (at least not at
the current offset, at the end).
Unfortunately, this is treated as a read error, and we
close this line, losing all the possible output.
It might be possible to work around this and make the
IRQ sharing work, however, now that we have dynamically
allocated IRQs that are easy to use, simply use that to
achieve separating between the events; then there's no
interrupt for that line and we never attempt the read
in the first place, thus not closing the line.
This manifested itself in the wifi hostap/hwsim tests
where the parallel script communicates via one serial
console and the kernel messages go to another (a file)
and sending data on the communication console caused
the kernel messages to stop flowing into the file.
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: anton ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Enable ARCH_HAS_GCOV_PROFILE_ALL so that CONFIG_GCOV_PROFILE_ALL can be
selected on UML. I didn't need to explicitly disable GCOV on anything
to get this to work on the configs I tested.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
If DMA (PCI over virtio) is enabled, then some drivers may
enable CONFIG_DMA_OPS as well, and then we pull in the x86
definition of get_arch_dma_ops(), which uses the dma_ops
symbol, which isn't defined.
Since we don't have real DMA ops nor any kind of IOMMU fix
this in the simplest possible way: pull in the asm-generic
file instead of inheriting the x86 one. It's not clear why
those drivers that do (e.g. VDPA) "select DMA_OPS", and if
they'd even work with this, but chances are nobody will be
wanting to do that anyway, so fixing the build failure is
good enough.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Even if daemon network is deprecated, some configurations may
still use it (e.g. Debian), and not want to default to the
/tmp/uml.ctl socket location. Allow configuring the default
socket location.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Tested-by: Ritesh Raj Sarraf <ritesh@collabora.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Make the default terminal emulator configurable so e.g.
Debian can set it to x-terminal-emulator instead of the
current default of xterm.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Tested-by: Ritesh Raj Sarraf <ritesh@collabora.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
- Add HOSTPKG_CONFIG env variable to allow users to override pkg-config
- Support W=e as a shorthand for KCFLAGS=-Werror
- Fix CONFIG_IKHEADERS build to support toybox cpio
- Add scripts/dummy-tools/pahole to ease distro packagers' life
- Suppress false-positive warnings from checksyscalls.sh for W=2 build
- Factor out the common code of arch/*/boot/install.sh into
scripts/install.sh
- Support 'kernel-install' tool in scripts/prune-kernel
- Refactor module-versioning to link the symbol versions at the final
link of vmlinux and modules
- Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in
an arch-agnostic way
- Refactor modpost, Makefiles
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmKOO2oVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGG54P/3/U5FIP5EoPAVu9HqSUKeeUiBYc
z1B8d7Wt1xU0xHImPWNjoacfye4MrDMUv8mEWKgHCVusJxbUoS+3Z/kd64NU75Fg
Cpj+9fP1N8m02IJzraxn6fw0bmfx4zp9Zsa9l2fjwL0emq4qhB7BA9/Nl6Png1IW
p0TPR6gV0Wgp6ikf/eJ3b1decFSqM7QzDlbo860nPMG164gNpDZmFVf2G4HCRQoY
GtgoQLEy2pBeOdU7+nJTKl2f5JOhDjRKX8equ7BHW9l7nbUvHd6ys3DGqYO3nvwV
hZZdHwDtxxO6bJtzClKPREyfL2H9R2AGxq94HzSwdvwdLLoFxrTN+mg88xBg17Rm
tKHy8jpZT36qh218h5lX5n9ZWcovTA38giZ+S/tkwOvvYGivKHDS23QwzB0HrG8/
VRd+0rhfIvuIpu0OQaTpTkZr2QVci2Zn6PPnxpyPEsGvWVFRjyx0WyZh4fFXnkQT
n+GS7j5g1LVMra0qu0y+yp4zy/DVFKIcfry0xU8S5SaSEBBcWUxLS2nnoBVB4vb2
RpiVD2vaOlvu/Zs2SOgtuMOnTw+Qqrvh7OYm/WyxWrB3JQGa/r+vipMKiFEDi2NN
pwR8wJT/CW1ycte93m3oO83jiitFqzXtAqo24wKlp4SOqnR/TQ/dx743ku2xvONe
uynJVW/gZVm4KEUl
=Y2TB
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add HOSTPKG_CONFIG env variable to allow users to override pkg-config
- Support W=e as a shorthand for KCFLAGS=-Werror
- Fix CONFIG_IKHEADERS build to support toybox cpio
- Add scripts/dummy-tools/pahole to ease distro packagers' life
- Suppress false-positive warnings from checksyscalls.sh for W=2 build
- Factor out the common code of arch/*/boot/install.sh into
scripts/install.sh
- Support 'kernel-install' tool in scripts/prune-kernel
- Refactor module-versioning to link the symbol versions at the final
link of vmlinux and modules
- Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in
an arch-agnostic way
- Refactor modpost, Makefiles
* tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits)
genksyms: adjust the output format to modpost
kbuild: stop merging *.symversions
kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS
modpost: extract symbol versions from *.cmd files
modpost: add sym_find_with_module() helper
modpost: change the license of EXPORT_SYMBOL to bool type
modpost: remove left-over cross_compile declaration
kbuild: record symbol versions in *.cmd files
kbuild: generate a list of objects in vmlinux
modpost: move *.mod.c generation to write_mod_c_files()
modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header
scripts/prune-kernel: Use kernel-install if available
kbuild: factor out the common installation code into scripts/install.sh
modpost: split new_symbol() to symbol allocation and hash table addition
modpost: make sym_add_exported() always allocate a new symbol
modpost: make multiple export error
modpost: dump Module.symvers in the same order of modules.order
modpost: traverse the namespace_list in order
modpost: use doubly linked list for dump_lists
modpost: traverse unresolved symbols in order
...