Граф коммитов

519 Коммитов

Автор SHA1 Сообщение Дата
Linus Torvalds f306b90c69 Updates for the SMP and CPU hotplug:
- Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
    original hotplug code and now causing trouble with the ARM64 cache
    topology setup due to the pointless SMP function call. It's not longer
    required as the hotplug callbacks are guaranteed to be invoked on the
    upcoming CPU.
 
  - Remove the deprecated and now unused CPU hotplug functions
 
  - Rewrite the CPU hotplug API documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmE+VhMTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoUZmD/9Q7XO8EgfitIh3sMO53spOv6ql1aWK
 1bHZmnFZL/txdIJiEgouf7wV4YgPgadJtZcK6//V/wGhYj5dB+z6otj+LwdrjjQT
 dgaXN6a27My0kvoyNCP2V3Xc9g6Q6XXAUadw+d7aWGqZvg5yAr+AdRgGmK3Ct2a1
 AsNjiG1HJsBMWv6eKnweOwfE6FbQpwFH4vXlldQi59QaMIOteMUwx9f64ZNyZWSe
 FNqVF2EVmLEmjMzhWSBzYqVdZBEUuEsPM2Y2UYqGAs7Wtwttoupredvplzsf2uJ/
 sCrDQspdgZsiD1EnjaSogLFUSfdRFd+9KvvChhuR8FSjPMNU+cWf62SAjVlUGIpI
 QI2G6S7707LPbun8KSlbqsXD2zKmZ9U+SkTdwJFpRhkket73uVYtuuR0PjSxUrxt
 BaULcpjKjf2joMji7BMvY7AR5bwnbDS+NUtqZpqhaUYHCjOZrPglGeUlLqth5epw
 SMP21BQq8Ys9M5/6dA3ATUYaE1vJb2ES7jn6sULVJ9e9RuupdCl3KfdGCaH9fiWg
 dfcowI9ACI+ZZ4OPJVvR/nlEVnK3GREYS5w3S/Ay1kLYpAfvGH2l3idzclfHMvWT
 ywB2uyRKowAT/Ig7mL7t3Y7ZOMLTzG8KxPfl8ar8Ja+oqDbEL5VOnIQXev3gxBgC
 1f4K8WUVGl+xXQ==
 =uAYt
 -----END PGP SIGNATURE-----

Merge tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull CPU hotplug updates from Thomas Gleixner:
 "Updates for the SMP and CPU hotplug:

   - Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
     original hotplug code and now causing trouble with the ARM64 cache
     topology setup due to the pointless SMP function call.

     It's not longer required as the hotplug callbacks are guaranteed to
     be invoked on the upcoming CPU.

   - Remove the deprecated and now unused CPU hotplug functions

   - Rewrite the CPU hotplug API documentation"

* tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation: core-api/cpuhotplug: Rewrite the API section
  cpu/hotplug: Remove deprecated CPU-hotplug functions.
  thermal: Replace deprecated CPU-hotplug functions.
  drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
2021-09-12 12:42:51 -07:00
Linus Torvalds b79bd0d510 RISC-V Patches for the 5.15 Merge Window, Part 2
* A pair of defconfig additions, for NVMe and the EFI filesystem
   localization options.
 * A larger address space for stack randomization.
 * A cleanup to our install rules.
 * A DTS update for the Microchip Icicle board, to fix the serial
   console.
 * Support for build-time table sorting, which allows us to have
   __ex_table read-only.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmE84S4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiYAlD/9NMmgarbs7kWtQjNPleAonwbgRHjW8
 3yVXDhxuAIihkr+grje6eahon17gbLcLhzSq89fUBUKe0gvRLbjV/So8P2pYre2K
 l5xIl+F72ULNM6KkjitPe972kFHnuR7IBlCy4zQhoNhc1XGd4qExE9T51v3dT3vg
 flIJwuc1lg/Knz96EoiNMufS3UjNqNDxxa54anplGXZW25jumnQELFlQTsxlRO3F
 FO+3JsiF7wzaxCNpWzmSsvMmXAaW9XerGKAIumtzRsXQ4EqPBfYdn/p+4A11QIUv
 R4cdsl/QP5XST3zR74ZXblaVCHXUZun3N0FHbdhmTcu2stdcs7q3GtWfR6H+OiQm
 igWwRn0MLPEiqmx12Ss+WELZOB/tyA14HHj6HE9gxbPcYcXO62Ok7Qg6gP4qMYkL
 3v6yqZ/WWdJEi7mzRrk5mTZAMgAdXf5Je6eAOxY3hCbwtC2UMOA1qCj3Ir9XW/pN
 TC/SDLDSqV9AfAZ2hRqxV1VLCASh0mTqt7yq+Vp/jBd4S8MevXV6LS9YGMxjTR2v
 OzUMS53oZhddLSemZRi/cp1B8PP0Ih2A/f816iY+9MswJWZAkR6TqJL8TfSbQh2u
 f3yTVAkRvavi7gnJ/s1HuzavRFtv452zmrBGhGh4IsGctSPWn/C+U2N5Pl3bEDhw
 88R+rkAJkfLRkw==
 =4SV6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - A pair of defconfig additions, for NVMe and the EFI filesystem
   localization options.

 - A larger address space for stack randomization.

 - A cleanup to our install rules.

 - A DTS update for the Microchip Icicle board, to fix the serial
   console.

 - Support for build-time table sorting, which allows us to have
   __ex_table read-only.

* tag 'riscv-for-linus-5.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Move EXCEPTION_TABLE to RO_DATA segment
  riscv: Enable BUILDTIME_TABLE_SORT
  riscv: dts: microchip: mpfs-icicle: Fix serial console
  riscv: move the (z)install rules to arch/riscv/Makefile
  riscv: Improve stack randomisation on RV64
  riscv: defconfig: enable NLS_CODEPAGE_437, NLS_ISO8859_1
  riscv: defconfig: enable BLK_DEV_NVME
2021-09-11 14:29:42 -07:00
Jisheng Zhang 6f55ab36be
riscv: Move EXCEPTION_TABLE to RO_DATA segment
_ex_table section is read-only, so move it to RO_DATA.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-10 23:59:48 -07:00
Thomas Gleixner c2f4954c2d Merge branch 'linus' into smp/urgent
Ensure that all usage sites of get/put_online_cpus() except for the
struggler in drivers/thermal are gone. So the last user and the deprecated
inlines can be removed.
2021-09-11 00:38:47 +02:00
Linus Torvalds 2d338201d5 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "147 patches, based on 7d2a07b769.

  Subsystems affected by this patch series: mm (memory-hotplug, rmap,
  ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
  alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
  checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
  selftests, ipc, and scripts"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  scripts: check_extable: fix typo in user error message
  mm/workingset: correct kernel-doc notations
  ipc: replace costly bailout check in sysvipc_find_ipc()
  selftests/memfd: remove unused variable
  Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
  configs: remove the obsolete CONFIG_INPUT_POLLDEV
  prctl: allow to setup brk for et_dyn executables
  pid: cleanup the stale comment mentioning pidmap_init().
  kernel/fork.c: unexport get_{mm,task}_exe_file
  coredump: fix memleak in dump_vma_snapshot()
  fs/coredump.c: log if a core dump is aborted due to changed file permissions
  nilfs2: use refcount_dec_and_lock() to fix potential UAF
  nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
  nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
  nilfs2: fix NULL pointer in nilfs_##name##_attr_release
  nilfs2: fix memory leak in nilfs_sysfs_create_device_group
  trap: cleanup trap_init()
  init: move usermodehelper_enable() to populate_rootfs()
  ...
2021-09-08 12:55:35 -07:00
Kefeng Wang 8b097881b5 trap: cleanup trap_init()
There are some empty trap_init() definitions in different ARCHs, Introduce
a new weak trap_init() function to clean them up.

Link: https://lkml.kernel.org/r/20210812123602.76356-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	[arm32]
Acked-by: Vineet Gupta						[arc]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>			[powerpc]
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <palmerdabbelt@google.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:27 -07:00
Linus Torvalds 063df71a57 RISC-V Patches for the 5.15 Merge Window, Part 1
* Support for PC-relative instructions (auipc and branches) in kprobes.
 * Support for forced IRQ threading.
 * Support for the hlt/nohlt kernel command line options, via the generic
   idle loop.
 * Support for showing the edge/level triggered behavior of interrupts in
   /proc/interrupts.
 * A handful of cleanups to our address mapping mechanisms.
 * Support for allocating gigantic hugepages via CMA.
 * Support for the undefined behavior sanitizer.
 * A handful of cleanups to the VDSO that allow the kernel to build with
   LLD.
 * Support for hugepage migration.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmEzxA4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYifuvD/93Tr9oAy6dwkGoa/VIzBc/nklCeTiO
 cv2W9eFQMG+PdRh1Ezi9FrLO6qZoADGOPPcnuc2HTTe58CWWWg0juC862H/N2k3V
 +3LGhHTv3E7NsEZNa4YhQXw6pPmj5OhbHab87HNowHpuNJGpKYE6Q+QbnYl5vorI
 9YS10pRpqvO6tdHUW335Z5N0nW9X3HCHD6NUb1ZV8qd34QzP+XeCJMqVXpoOijNO
 96lhyDYPivM2Rhi+uvtSjUeWzbddnN/P9IywfECddHFJyGvoEtiIROo9d2sR6pla
 lAa1YOw0ybD/UPHtL/diEU+K8ivNVKdRN/YvlzHS4CdCrPyuWSUmBMozsOBMBwf1
 UZCMDwmP2sQkHu1Ht9g1fXwk2R85j1s+qE5cZEXyAEULKfHlmt4vygd1xgm8DTf9
 xgS+np2g0frcmKU8cLPmC7Sr0YI1fPeDMRkBXTYE1SmBF3eHrZCCbeM1H97p2mUC
 zs/Dvyoj7e25HGyYqxN3lHZzhMluqh7wCQSOQMB2YxVwdiA4VjyjxLF4MKAdXL8i
 F1QTETsjLGu6UiMfNIQ4RVs981/fGYSANdOvMfwo447gVH6JaxC51sLo6YIDtXss
 tLzchHNu5IMNqB7XOb1gU8ZKeXruG9lrxwUOyEIYAVi6pHCbkdLV0A8ymwEjj06b
 aGV4xFPaAGvDLQ==
 =rDB3
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - support PC-relative instructions (auipc and branches) in kprobes

 - support for forced IRQ threading

 - support for the hlt/nohlt kernel command line options, via the
   generic idle loop

 - show the edge/level triggered behavior of interrupts
   in /proc/interrupts

 - a handful of cleanups to our address mapping mechanisms

 - support for allocating gigantic hugepages via CMA

 - support for the undefined behavior sanitizer (UBSAN)

 - a handful of cleanups to the VDSO that allow the kernel to build with
   LLD.

 - support for hugepage migration

* tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
  riscv: add support for hugepage migration
  RISC-V: Fix VDSO build for !MMU
  riscv: use strscpy to replace strlcpy
  riscv: explicitly use symbol offsets for VDSO
  riscv: Enable Undefined Behavior Sanitizer UBSAN
  riscv: Keep the riscv Kconfig selects sorted
  riscv: Support allocating gigantic hugepages using CMA
  riscv: fix the global name pfn_base confliction error
  riscv: Move early fdt mapping creation in its own function
  riscv: Simplify BUILTIN_DTB device tree mapping handling
  riscv: Use __maybe_unused instead of #ifdefs around variable declarations
  riscv: Get rid of map_size parameter to create_kernel_page_table
  riscv: Introduce va_kernel_pa_offset for 32-bit kernel
  riscv: Optimize kernel virtual address conversion macro
  dt-bindings: riscv: add starfive jh7100 bindings
  riscv: Enable GENERIC_IRQ_SHOW_LEVEL
  riscv: Enable idle generic idle loop
  riscv: Allow forced irq threading
  riscv: Implement thread_struct whitelist for hardened usercopy
  riscv: kprobes: implement the branch instructions
  ...
2021-09-05 11:31:23 -07:00
Thomas Gleixner 4b92d4add5 drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework
to ensure that the cache related functions are called on the upcoming CPU
because the notifier itself could run on any online CPU.

The hotplug state machine guarantees that the callbacks are invoked on the
upcoming CPU. So there is no need to have this SMP function call
obfuscation. That indirection was missed when the hotplug notifiers were
converted.

This also solves the problem of ARM64 init_cache_level() invoking ACPI
functions which take a semaphore in that context. That's invalid as SMP
function calls run with interrupts disabled. Running it just from the
callback in context of the CPU hotplug thread solves this.

Fixes: 8571890e15 ("arm64: Add support for ACPI based firmware tables")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
2021-09-01 10:29:10 +02:00
Jason Wang 803930ee35
riscv: use strscpy to replace strlcpy
The strlcpy should not be used because it doesn't limit the source
length. As linus says, it's a completely useless function if you
can't implicitly trust the source string - but that is almost always
why people think they should use it! All in all the BSD function
will lead some potential bugs.

But the strscpy doesn't require reading memory from the src string
beyond the specified "count" bytes, and since the return value is
easier to error-check than strlcpy()'s. In addition, the implementation
is robust to the string changing out from underneath it, unlike the
current strlcpy() implementation.

Thus, We prefer using strscpy instead of strlcpy.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-25 21:52:47 -07:00
Saleem Abdulrasool fde9c59aeb
riscv: explicitly use symbol offsets for VDSO
The current implementation of the `__rt_sigaction` reference computed an
absolute offset relative to the mapped base of the VDSO.  While this can
be handled in the medlow model, the medany model cannot handle this as
it is meant to be position independent.  The current implementation
relied on the BFD linker relaxing the PC-relative relocation into an
absolute relocation as it was a near-zero address allowing it to be
referenced relative to `zero`.

We now extract the offsets and create a generated header allowing the
build with LLVM and lld to succeed as we no longer depend on the linker
rewriting address references near zero.  This change was largely
modelled after the ARM64 target which does something similar.

Signed-off-by: Saleem Abdulrasool <abdulras@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 21:45:47 -07:00
Jisheng Zhang 8341dcfbd8
riscv: Enable Undefined Behavior Sanitizer UBSAN
Select ARCH_HAS_UBSAN_SANITIZE_ALL in order to allow the user to
enable CONFIG_UBSAN_SANITIZE_ALL and instrument the entire kernel for
ubsan checks.

VDSO is excluded because its build doesn't include the
__ubsan_handle_*() functions from lib/ubsan.c, and the VDSO has no
sane way to report errors even if it has definitions of these functions.

Passed lib/test_ubsan.c test.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 20:59:10 -07:00
Vincent Chen 379eb01c21
riscv: Ensure the value of FP registers in the core dump file is up to date
The value of FP registers in the core dump file comes from the
thread.fstate. However, kernel saves the FP registers to the thread.fstate
only before scheduling out the process. If no process switch happens
during the exception handling process, kernel will not have a chance to
save the latest value of FP registers to thread.fstate. It will cause the
value of FP registers in the core dump file may be incorrect. To solve this
problem, this patch force lets kernel save the FP register into the
thread.fstate if the target task_struct equals the current.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: b8c8a9590e ("RISC-V: Add FP register ptrace support for gdb.")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-24 20:54:10 -07:00
Petr Pavlu aa3e1ba32e
riscv: Fix a number of free'd resources in init_resources()
Function init_resources() allocates a boot memory block to hold an array of
resources which it adds to iomem_resource. The array is filled in from its
end and the function then attempts to free any unused memory at the
beginning. The problem is that size of the unused memory is incorrectly
calculated and this can result in releasing memory which is in use by
active resources. Their data then gets corrupted later when the memory is
reused by a different part of the system.

Fix the size of the released memory to correctly match the number of unused
resource entries.

Fixes: ffe0e52612 ("RISC-V: Improve init_resources()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>
Tested-by: Sunil V L <sunilvl@ventanamicro.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-20 10:15:51 -07:00
Changbin Du 030d6dbf0c
riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support it
The RISC-V special option '-mno-relax' which to disable linker relaxations
is supported by GCC8+. For GCC7 and lower versions do not support this
option.

Fixes: fba8a8674f ("RISC-V: Add kexec support")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12 07:16:52 -07:00
Jisheng Zhang 78d9d8005e
riscv: stacktrace: Fix NULL pointer dereference
When CONFIG_FRAME_POINTER=y, calling dump_stack() can always trigger
NULL pointer dereference panic similar as below:

[    0.396060] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5+ #47
[    0.396692] Hardware name: riscv-virtio,qemu (DT)
[    0.397176] Call Trace:
[    0.398191] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000960
[    0.399487] Oops [#1]
[    0.399739] Modules linked in:
[    0.400135] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5+ #47
[    0.400570] Hardware name: riscv-virtio,qemu (DT)
[    0.400926] epc : walk_stackframe+0xc4/0xdc
[    0.401291]  ra : dump_backtrace+0x30/0x38
[    0.401630] epc : ffffffff80004922 ra : ffffffff8000496a sp : ffffffe000f3bd00
[    0.402115]  gp : ffffffff80cfdcb8 tp : ffffffe000f30000 t0 : ffffffff80d0b0cf
[    0.402602]  t1 : ffffffff80d0b0c0 t2 : 0000000000000000 s0 : ffffffe000f3bd60
[    0.403071]  s1 : ffffffff808bc2e8 a0 : 0000000000001000 a1 : 0000000000000000
[    0.403448]  a2 : ffffffff803d7088 a3 : ffffffff808bc2e8 a4 : 6131725dbc24d400
[    0.403820]  a5 : 0000000000001000 a6 : 0000000000000002 a7 : ffffffffffffffff
[    0.404226]  s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
[    0.404634]  s5 : ffffffff803d7088 s6 : ffffffff808bc2e8 s7 : ffffffff80630650
[    0.405085]  s8 : ffffffff80912a80 s9 : 0000000000000008 s10: ffffffff804000fc
[    0.405388]  s11: 0000000000000000 t3 : 0000000000000043 t4 : ffffffffffffffff
[    0.405616]  t5 : 000000000000003d t6 : ffffffe000f3baa8
[    0.405793] status: 0000000000000100 badaddr: 0000000000000960 cause: 000000000000000d
[    0.406135] [<ffffffff80004922>] walk_stackframe+0xc4/0xdc
[    0.407032] [<ffffffff8000496a>] dump_backtrace+0x30/0x38
[    0.407797] [<ffffffff803d7100>] show_stack+0x40/0x4c
[    0.408234] [<ffffffff803d9e5c>] dump_stack+0x90/0xb6
[    0.409019] [<ffffffff8040423e>] ptdump_init+0x20/0xc4
[    0.409681] [<ffffffff800015b6>] do_one_initcall+0x4c/0x226
[    0.410110] [<ffffffff80401094>] kernel_init_freeable+0x1f4/0x258
[    0.410562] [<ffffffff803dba88>] kernel_init+0x22/0x148
[    0.410959] [<ffffffff800029e2>] ret_from_exception+0x0/0x14
[    0.412241] ---[ end trace b2ab92c901b96251 ]---
[    0.413099] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

The reason is the task is NULL when we finally call walk_stackframe()
the NULL is passed from __dump_stack():

|static void __dump_stack(void)
|{
|        dump_stack_print_info(KERN_DEFAULT);
|        show_stack(NULL, NULL, KERN_DEFAULT);
|}

Fix this issue by checking "task == NULL" case in walk_stackframe().

Fixes: eac2f3059e ("riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Tested-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-24 12:58:51 -07:00
Jisheng Zhang 76f5dfacfb
riscv: stacktrace: pin the task's stack in get_wchan
Pin the task's stack before calling walk_stackframe() in get_wchan().
This can fix the panic as reported by Andreas when CONFIG_VMAP_STACK=y:

[   65.609696] Unable to handle kernel paging request at virtual address ffffffd0003bbde8
[   65.610460] Oops [#1]
[   65.610626] Modules linked in: virtio_blk virtio_mmio rtc_goldfish btrfs blake2b_generic libcrc32c xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs
[   65.611670] CPU: 2 PID: 1 Comm: systemd Not tainted 5.14.0-rc1-1.g34fe32a-default #1 openSUSE Tumbleweed (unreleased) c62f7109153e5a0897ee58ba52393ad99b070fd2
[   65.612334] Hardware name: riscv-virtio,qemu (DT)
[   65.613008] epc : get_wchan+0x5c/0x88
[   65.613334]  ra : get_wchan+0x42/0x88
[   65.613625] epc : ffffffff800048a4 ra : ffffffff8000488a sp : ffffffd00021bb90
[   65.614008]  gp : ffffffff817709f8 tp : ffffffe07fe91b80 t0 : 00000000000001f8
[   65.614411]  t1 : 0000000000020000 t2 : 0000000000000000 s0 : ffffffd00021bbd0
[   65.614818]  s1 : ffffffd0003bbdf0 a0 : 0000000000000001 a1 : 0000000000000002
[   65.615237]  a2 : ffffffff81618008 a3 : 0000000000000000 a4 : 0000000000000000
[   65.615637]  a5 : ffffffd0003bc000 a6 : 0000000000000002 a7 : ffffffe27d370000
[   65.616022]  s2 : ffffffd0003bbd90 s3 : ffffffff8071a81e s4 : 0000000000003fff
[   65.616407]  s5 : ffffffffffffc000 s6 : 0000000000000000 s7 : ffffffff81618008
[   65.616845]  s8 : 0000000000000001 s9 : 0000000180000040 s10: 0000000000000000
[   65.617248]  s11: 000000000000016b t3 : 000000ff00000000 t4 : 0c6aec92de5e3fd7
[   65.617672]  t5 : fff78f60608fcfff t6 : 0000000000000078
[   65.618088] status: 0000000000000120 badaddr: ffffffd0003bbde8 cause: 000000000000000d
[   65.618621] [<ffffffff800048a4>] get_wchan+0x5c/0x88
[   65.619008] [<ffffffff8022da88>] do_task_stat+0x7a2/0xa46
[   65.619325] [<ffffffff8022e87e>] proc_tgid_stat+0xe/0x16
[   65.619637] [<ffffffff80227dd6>] proc_single_show+0x46/0x96
[   65.619979] [<ffffffff801ccb1e>] seq_read_iter+0x190/0x31e
[   65.620341] [<ffffffff801ccd70>] seq_read+0xc4/0x104
[   65.620633] [<ffffffff801a6bfe>] vfs_read+0x6a/0x112
[   65.620922] [<ffffffff801a701c>] ksys_read+0x54/0xbe
[   65.621206] [<ffffffff801a7094>] sys_read+0xe/0x16
[   65.621474] [<ffffffff8000303e>] ret_from_syscall+0x0/0x2
[   65.622169] ---[ end trace f24856ed2b8789c5 ]---
[   65.622832] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-23 17:29:03 -07:00
Chen Lifu 67979e927d
riscv: kprobes: implement the branch instructions
This has been tested by probing a module that contains each of the
flavors of branches we have.

Signed-off-by: Chen Lifu <chenlifu@huawei.com>
[Palmer: commit message, fix kconfig errors]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-21 23:22:34 -07:00
Chen Lifu b7d2be48cc
riscv: kprobes: implement the auipc instruction
This has been tested by probing a module that contains an auipc
instruction.

Signed-off-by: Chen Lifu <chenlifu@huawei.com>
[Palmer: commit message]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-21 23:22:25 -07:00
Linus Torvalds 9b76d71fa8 RISC-V Patches for the 5.14 Merge Window, Part 1
In addition to We have a handful of new features for 5.14:
 
 * Support for transparent huge pages.
 * Support for generic PCI resources mapping.
 * Support for the mem= kernel parameter.
 * Support for KFENCE.
 * A handful of fixes to avoid W+X mappings in the kernel.
 * Support for VMAP_STACK based overflow detection.
 * An optimized copy_{to,from}_user.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmDn3XITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiW4UD/9Z0BJNXG9rOERofyFWwb7EYchT551A
 Coi8BFpuUCZfT9qonuBzQcPrAlXH/T9yMDGiShZio9jh29bnaOIqo5NrvNjB88VU
 LdarNeiumPM4SCQFIsbIBnRrk5OQDtzsPx+dS5xVUQlnHUV26xBakAJo3K3FxG9Y
 bl2JU+LTvP52eBKpKHp0i2pbuC2z0dBEu9Y3d4q8phI+YeolJ0rgOCbMra+maCls
 kk5TeROrDJpTg5INsWpVNvKvRk+sNlh0K9AsZHL1fIA8d2UFyTeCltUNjkWkIZA/
 SBiS+ruxtLD9rTPOrF1i+rvW+rs/gL1LkPjOvoilTMuNm5ltCu5MLsflX6oZ5wX4
 2vAXup6HQzLcJpCCq2QyMIl2s8ORV+gY89nYmRYHLzCoqGtF/WhbSFSKjqNM1+Kf
 /M4C9q3kEopkdlHuKahR8dP4wYz6kP1kEijv83P21ahUFsShSLyLAegYoKO0pNeC
 H/WOWMBqpG+rE80Tsctfo/z3g16Wc51yesYjXsOP5iRYoytG2uGLtzG7fPTjtyuC
 Fa3Ue/9d/W/XyZ7bzolvGRoaeoHqoIXb07MsDLDhO2cHYg6B/wIvHXfPcM7ovR6f
 VT0aRu85dvvewlukuqyH5+rwSPNQfzHo32GIiK7h8QjjIEh1axOFi+7oXGfbcIUw
 EP0u4AZsK9iHZg==
 =uB+H
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
 "We have a handful of new features for 5.14:

   - Support for transparent huge pages.

   - Support for generic PCI resources mapping.

   - Support for the mem= kernel parameter.

   - Support for KFENCE.

   - A handful of fixes to avoid W+X mappings in the kernel.

   - Support for VMAP_STACK based overflow detection.

   - An optimized copy_{to,from}_user"

* tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits)
  riscv: xip: Fix duplicate included asm/pgtable.h
  riscv: Fix PTDUMP output now BPF region moved back to module region
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
  riscv: add VMAP_STACK overflow detection
  riscv: ptrace: add argn syntax
  riscv: mm: fix build errors caused by mk_pmd()
  riscv: Introduce structure that group all variables regarding kernel mapping
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Enable KFENCE for riscv64
  RISC-V: Use asm-generic for {in,out}{bwlq}
  riscv: add ASID-based tlbflushing methods
  riscv: pass the mm_struct to __sbi_tlb_flush_range
  riscv: Add mem kernel parameter support
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: Only initialize swiotlb when necessary
  riscv: fix typo in init.c
  riscv: Cleanup unused functions
  riscv: mm: Use better bitmap_zalloc()
  ...
2021-07-09 10:36:29 -07:00
Kefeng Wang 723a42f4f6 riscv: convert to setup_initial_init_mm()
Use setup_initial_init_mm() helper to simplify code.

Link: https://lkml.kernel.org/r/20210608083418.137226-13-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:21 -07:00
Jiapeng Chong 1958e5aef5
riscv: xip: Fix duplicate included asm/pgtable.h
Clean up the following includecheck warning:

./arch/riscv/kernel/vmlinux-xip.lds.S: asm/pgtable.h is included more
than once.

No functional change.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-06 16:17:40 -07:00
Tong Tiangen 31da94c25a
riscv: add VMAP_STACK overflow detection
This patch adds stack overflow detection to riscv, usable when
CONFIG_VMAP_STACK=y.

Overflow is detected in kernel exception entry(kernel/entry.S), if the
kernel stack is overflow and been detected, the overflow handler is
invoked on a per-cpu overflow stack. This approach preserves GPRs and
the original exception information.

The overflow detect is performed before any attempt is made to access
the stack and the principle of stack overflow detection: kernel stacks
are aligned to double their size, enabling overflow to be detected with
a single bit test. For example, a 16K stack is aligned to 32K, ensuring
that bit 14 of the SP must be zero. On an overflow (or underflow), this
bit is flipped. Thus, overflow (of less than the size of the stack) can
be detected by testing whether this bit is set.

This gives us a useful error message on stack overflow, as can be
trigger with the LKDTM overflow test:

[  388.053267] lkdtm: Performing direct entry EXHAUST_STACK
[  388.053663] lkdtm: Calling function with 1024 frame size to depth 32 ...
[  388.054016] lkdtm: loop 32/32 ...
[  388.054186] lkdtm: loop 31/32 ...
[  388.054491] lkdtm: loop 30/32 ...
[  388.054672] lkdtm: loop 29/32 ...
[  388.054859] lkdtm: loop 28/32 ...
[  388.055010] lkdtm: loop 27/32 ...
[  388.055163] lkdtm: loop 26/32 ...
[  388.055309] lkdtm: loop 25/32 ...
[  388.055481] lkdtm: loop 24/32 ...
[  388.055653] lkdtm: loop 23/32 ...
[  388.055837] lkdtm: loop 22/32 ...
[  388.056015] lkdtm: loop 21/32 ...
[  388.056188] lkdtm: loop 20/32 ...
[  388.058145] Insufficient stack space to handle exception!
[  388.058153] Task stack:     [0xffffffd014260000..0xffffffd014264000]
[  388.058160] Overflow stack: [0xffffffe1f8d2c220..0xffffffe1f8d2d220]
[  388.058168] CPU: 0 PID: 89 Comm: bash Not tainted 5.12.0-rc8-dirty #90
[  388.058175] Hardware name: riscv-virtio,qemu (DT)
[  388.058187] epc : number+0x32/0x2c0
[  388.058247]  ra : vsnprintf+0x2ae/0x3f0
[  388.058255] epc : ffffffe0002d38f6 ra : ffffffe0002d814e sp : ffffffd01425ffc0
[  388.058263]  gp : ffffffe0012e4010 tp : ffffffe08014da00 t0 : ffffffd0142606e8
[  388.058271]  t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffd014260070
[  388.058303]  s1 : ffffffd014260158 a0 : ffffffd01426015e a1 : ffffffd014260158
[  388.058311]  a2 : 0000000000000013 a3 : ffff0a01ffffff10 a4 : ffffffe000c398e0
[  388.058319]  a5 : 511b02ec65f3e300 a6 : 0000000000a1749a a7 : 0000000000000000
[  388.058327]  s2 : ffffffff000000ff s3 : 00000000ffff0a01 s4 : ffffffe0012e50a8
[  388.058335]  s5 : 0000000000ffff0a s6 : ffffffe0012e50a8 s7 : ffffffe000da1cc0
[  388.058343]  s8 : ffffffffffffffff s9 : ffffffd0142602b0 s10: ffffffd0142602a8
[  388.058351]  s11: ffffffd01426015e t3 : 00000000000f0000 t4 : ffffffffffffffff
[  388.058359]  t5 : 000000000000002f t6 : ffffffd014260158
[  388.058366] status: 0000000000000100 badaddr: ffffffd01425fff8 cause: 000000000000000f
[  388.058374] Kernel panic - not syncing: Kernel stack overflow
[  388.058381] CPU: 0 PID: 89 Comm: bash Not tainted 5.12.0-rc8-dirty #90
[  388.058387] Hardware name: riscv-virtio,qemu (DT)
[  388.058393] Call Trace:
[  388.058400] [<ffffffe000004944>] walk_stackframe+0x0/0xce
[  388.058406] [<ffffffe0006f0b28>] dump_backtrace+0x38/0x46
[  388.058412] [<ffffffe0006f0b46>] show_stack+0x10/0x18
[  388.058418] [<ffffffe0006f3690>] dump_stack+0x74/0x8e
[  388.058424] [<ffffffe0006f0d52>] panic+0xfc/0x2b2
[  388.058430] [<ffffffe0006f0acc>] print_trace_address+0x0/0x24
[  388.058436] [<ffffffe0002d814e>] vsnprintf+0x2ae/0x3f0
[  388.058956] SMP: stopping secondary CPUs

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-06 12:11:38 -07:00
Alexandre Ghiti 658e2c5125
riscv: Introduce structure that group all variables regarding kernel mapping
We have a lot of variables that are used to hold kernel mapping addresses,
offsets between physical and virtual mappings and some others used for XIP
kernels: they are all defined at different places in mm/init.c, so group
them into a single structure with, for some of them, more explicit and concise
names.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-05 18:04:00 -07:00
Palmer Dabbelt 01112e5e20
Merge branch 'riscv-wx-mappings' into for-next
This contains both the short-term fix for the W+X boot mappings and the
larger cleanup.

* riscv-wx-mappings:
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: mm: Fix W+X mappings at boot
2021-06-30 21:50:32 -07:00
Alexandre Ghiti e5c35fa040
riscv: Map the kernel with correct permissions the first time
For 64-bit kernels, we map all the kernel with write and execute
permissions and afterwards remove writability from text and executability
from data.

For 32-bit kernels, the kernel mapping resides in the linear mapping, so we
map all the linear mapping as writable and executable and afterwards we
remove those properties for unused memory and kernel mapping as
described above.

Change this behavior to directly map the kernel with correct permissions
and avoid going through the whole mapping to fix the permissions.

At the same time, this fixes an issue introduced by commit 2bfc6cd81b
("riscv: Move kernel mapping outside of linear mapping") as reported
here https://github.com/starfive-tech/linux/issues/17.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 21:18:58 -07:00
Linus Torvalds 54a728dc5e Scheduler udpates for this cycle:
- Changes to core scheduling facilities:
 
     - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
       coordinated scheduling across SMT siblings. This is a much
       requested feature for cloud computing platforms, to allow
       the flexible utilization of SMT siblings, without exposing
       untrusted domains to information leaks & side channels, plus
       to ensure more deterministic computing performance on SMT
       systems used by heterogenous workloads.
 
       There's new prctls to set core scheduling groups, which
       allows more flexible management of workloads that can share
       siblings.
 
     - Fix task->state access anti-patterns that may result in missed
       wakeups and rename it to ->__state in the process to catch new
       abuses.
 
  - Load-balancing changes:
 
      - Tweak newidle_balance for fair-sched, to improve
        'memcache'-like workloads.
 
      - "Age" (decay) average idle time, to better track & improve workloads
        such as 'tbench'.
 
      - Fix & improve energy-aware (EAS) balancing logic & metrics.
 
      - Fix & improve the uclamp metrics.
 
      - Fix task migration (taskset) corner case on !CONFIG_CPUSET.
 
      - Fix RT and deadline utilization tracking across policy changes
 
      - Introduce a "burstable" CFS controller via cgroups, which allows
        bursty CPU-bound workloads to borrow a bit against their future
        quota to improve overall latencies & batching. Can be tweaked
        via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.
 
      - Rework assymetric topology/capacity detection & handling.
 
  - Scheduler statistics & tooling:
 
      - Disable delayacct by default, but add a sysctl to enable
        it at runtime if tooling needs it. Use static keys and
        other optimizations to make it more palatable.
 
      - Use sched_clock() in delayacct, instead of ktime_get_ns().
 
  - Misc cleanups and fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZcPoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g3yw//WfhIqy7Psa9d/MBMjQDRGbTuO4+w22Dj
 vmWFU44Q4KJxQHWeIgUlrK+dzvYWvNmflUs2CUUOiDVzxFTHMIyBtL4qCBUbx4Ns
 vKAcB9wsWZge2o3WzZqpProRhdoRaSKw8egUr2q7rACVBkckY7eGP/OjWxXU8BdA
 b7D0LPWwuIBFfN4pFYeCDLn32Dqr9s6Chyj+ZecabdG7EE6Gu+f1diVcxy7JE/mc
 4WWL0D1RqdgpGrBEuMJIxPYekdrZiuy4jtEbztz5gbTBteN1cj3BLfqn0Pc/e6rO
 Vyuc5mXCAmzRVi18z6g6bsVl+IA/nrbErENB2OHOhOYtqiZxqGTd4GPWZszMyY17
 5AsEO5+5pcaBsy4gyp09qURggBu9zhJnMVmOI3rIHZkmkhwzc6uUJlyhDCTiFWOz
 3ZF3LjbZEyCKodMD8qMHbs3axIBpIfZqjzkvSKyFnvfXEGVytVse7NUuWtQ36u92
 GnURxVeYY1TDVXvE1Y8owNKMxknKQ6YRlypP7Dtbeo/qG6hShp0xmS7qDLDi0ybZ
 ZlK+bDECiVoDf3nvJo+8v5M82IJ3CBt4UYldeRJsa1YCK/FsbK8tp91fkEfnXVue
 +U6LPX0AmMpXacR5HaZfb3uBIKRw/QMdP/7RFtBPhpV6jqCrEmuqHnpPQiEVtxwO
 UmG7bt94Trk=
 =3VDr
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler udpates from Ingo Molnar:

 - Changes to core scheduling facilities:

    - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
      coordinated scheduling across SMT siblings. This is a much
      requested feature for cloud computing platforms, to allow the
      flexible utilization of SMT siblings, without exposing untrusted
      domains to information leaks & side channels, plus to ensure more
      deterministic computing performance on SMT systems used by
      heterogenous workloads.

      There are new prctls to set core scheduling groups, which allows
      more flexible management of workloads that can share siblings.

    - Fix task->state access anti-patterns that may result in missed
      wakeups and rename it to ->__state in the process to catch new
      abuses.

 - Load-balancing changes:

    - Tweak newidle_balance for fair-sched, to improve 'memcache'-like
      workloads.

    - "Age" (decay) average idle time, to better track & improve
      workloads such as 'tbench'.

    - Fix & improve energy-aware (EAS) balancing logic & metrics.

    - Fix & improve the uclamp metrics.

    - Fix task migration (taskset) corner case on !CONFIG_CPUSET.

    - Fix RT and deadline utilization tracking across policy changes

    - Introduce a "burstable" CFS controller via cgroups, which allows
      bursty CPU-bound workloads to borrow a bit against their future
      quota to improve overall latencies & batching. Can be tweaked via
      /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.

    - Rework assymetric topology/capacity detection & handling.

 - Scheduler statistics & tooling:

    - Disable delayacct by default, but add a sysctl to enable it at
      runtime if tooling needs it. Use static keys and other
      optimizations to make it more palatable.

    - Use sched_clock() in delayacct, instead of ktime_get_ns().

 - Misc cleanups and fixes.

* tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  sched/doc: Update the CPU capacity asymmetry bits
  sched/topology: Rework CPU capacity asymmetry detection
  sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag
  psi: Fix race between psi_trigger_create/destroy
  sched/fair: Introduce the burstable CFS controller
  sched/uclamp: Fix uclamp_tg_restrict()
  sched/rt: Fix Deadline utilization tracking during policy change
  sched/rt: Fix RT utilization tracking during policy change
  sched: Change task_struct::state
  sched,arch: Remove unused TASK_STATE offsets
  sched,timer: Use __set_current_state()
  sched: Add get_current_state()
  sched,perf,kvm: Fix preemption condition
  sched: Introduce task_is_running()
  sched: Unbreak wakeups
  sched/fair: Age the average idle time
  sched/cpufreq: Consider reduced CPU capacity in energy calculation
  sched/fair: Take thermal pressure into account while estimating energy
  thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure
  sched/fair: Return early from update_tg_cfs_load() if delta == 0
  ...
2021-06-28 12:14:19 -07:00
Linus Torvalds 28a27cbd86 Perf events updates for this cycle:
- Platform PMU driver updates:
 
      - x86 Intel uncore driver updates for Skylake (SNR) and Icelake (ICX) servers
      - Fix RDPMC support
      - Fix [extended-]PEBS-via-PT support
      - Fix Sapphire Rapids event constraints
      - Fix :ppp support on Sapphire Rapids
      - Fix fixed counter sanity check on Alder Lake & X86_FEATURE_HYBRID_CPU
      - Other heterogenous-PMU fixes
 
  - Kprobes:
 
      - Remove the unused and misguided kprobe::fault_handler callbacks.
      - Warn about kprobes taking a page fault.
      - Fix the 'nmissed' stat counter.
 
  - Misc cleanups and fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZaxMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hPgw//f9SnGzFoP1uR5TBqM8j/QHulMewew/iD
 dM5lh2emdmqHWYPBeRxUHgag38K2Golr3Y+NxLA3R+RMx+OZQe8Mz/wYvPQcBvsV
 k1HHImU3GRMn4GM7GwxH3vPIottDUx3mNS2J6pzlw3kwRUVqrxUdj/0/pSY/4eJ7
 ZT4uq4yLV83Jd3qioU7o7e/u6MrdNIIcAXRpVDdE9Mm1+kWXSVN7/h3Vsiz4tj5E
 iS+UXEtSc1a2mnmekv63pYkJHHNUb6guD8jgI/wrm1KIFGjDRifM+3TV6R/kB96/
 TfD2LhCcTShfSp8KI191pgV7/NQbB/PmLdSYmff3rTBiii4cqXuCygJCHInZ09z0
 4fTSSqM6aHg7kfTQyOCp+DUQ+9vNVXWo8mxt9c6B8xA0GyCI3zhjQ4UIiSUWRpjs
 Be5ZyF0kNNuPxYrKFnGnBf8+51DURpCz3sDdYRuK4KNkj1+4ZvJo/KzGTMUUIE4B
 IDQG6wDP5Kb388eRDtKrG5X7IXg+L5F/kezin60j0QF5MwDgxirT217teN8H1lNn
 YgWMjRK8Tw0flUJsbCxa51/nl93UtByB+fIRIc88MSeLxcI6/ORW+TxBBEqkYm5Z
 6BLFtmHSuAqAXUuyZXSGLcW7XLJvIaDoHgvbDn6l4g7FMWHqPOIq6nJQY3L8ben2
 e+fQrGh4noI=
 =20Vc
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:

 - Platform PMU driver updates:

     - x86 Intel uncore driver updates for Skylake (SNR) and Icelake (ICX) servers
     - Fix RDPMC support
     - Fix [extended-]PEBS-via-PT support
     - Fix Sapphire Rapids event constraints
     - Fix :ppp support on Sapphire Rapids
     - Fix fixed counter sanity check on Alder Lake & X86_FEATURE_HYBRID_CPU
     - Other heterogenous-PMU fixes

 - Kprobes:

     - Remove the unused and misguided kprobe::fault_handler callbacks.
     - Warn about kprobes taking a page fault.
     - Fix the 'nmissed' stat counter.

 - Misc cleanups and fixes.

* tag 'perf-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix task context PMU for Hetero
  perf/x86/intel: Fix instructions:ppp support in Sapphire Rapids
  perf/x86/intel: Add more events requires FRONTEND MSR on Sapphire Rapids
  perf/x86/intel: Fix fixed counter check warning for some Alder Lake
  perf/x86/intel: Fix PEBS-via-PT reload base value for Extended PEBS
  perf/x86: Reset the dirty counter to prevent the leak for an RDPMC task
  kprobes: Do not increment probe miss count in the fault handler
  x86,kprobes: WARN if kprobes tries to handle a fault
  kprobes: Remove kprobe::fault_handler
  uprobes: Update uprobe_write_opcode() kernel-doc comment
  perf/hw_breakpoint: Fix DocBook warnings in perf hw_breakpoint
  perf/core: Fix DocBook warnings
  perf/core: Make local function perf_pmu_snapshot_aux() static
  perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX
  perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNR
  perf/x86/intel/uncore: Generalize I/O stacks to PMON mapping procedure
  perf/x86/intel/uncore: Drop unnecessary NULL checks after container_of()
2021-06-28 12:03:20 -07:00
Peter Zijlstra b03fbd4ff2 sched: Introduce task_is_running()
Replace a bunch of 'p->state == TASK_RUNNING' with a new helper:
task_is_running(p).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210611082838.222401495@infradead.org
2021-06-18 11:43:07 +02:00
Ingo Molnar b2c0931a07 Merge branch 'sched/urgent' into sched/core, to resolve conflicts
This commit in sched/urgent moved the cfs_rq_is_decayed() function:

  a7b359fc6a37: ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")

and this fresh commit in sched/core modified it in the old location:

  9e077b52d86a: ("sched/pelt: Check that *_avg are null when *_sum are")

Merge the two variants.

Conflicts:
	kernel/sched/fair.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-06-18 11:31:25 +02:00
Kefeng Wang ce3aca0465
riscv: Only initialize swiotlb when necessary
The SWIOTLB buffer is not needed unless the physical address space
is beyond the limit of dma, only initialize swiotlb when swiotlb_force
is true or not all system memory is DMA-able.

Also move the swiotlb_init() into mem_init().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11 13:42:26 -07:00
Vitaly Wool 5e63215c2f
riscv: xip: support runtime trap patching
RISCV_ERRATA_ALTERNATIVE patches text at runtime which is currently
not possible when the kernel is executed from the flash in XIP mode.
Since runtime patching concerns only traps at the moment, let's just
have all the traps reside in RAM anyway if RISCV_ERRATA_ALTERNATIVE
is set. Thus, these functions will be patch-able even when the .text
section is in flash.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-10 16:16:06 -07:00
Ingo Molnar a9e906b71f Merge branch 'sched/urgent' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-06-03 19:00:49 +02:00
Naveen N. Rao 2e38eb04c9 kprobes: Do not increment probe miss count in the fault handler
Kprobes has a counter 'nmissed', that is used to count the number of
times a probe handler was not called. This generally happens when we hit
a kprobe while handling another kprobe.

However, if one of the probe handlers causes a fault, we are currently
incrementing 'nmissed'. The comment in fault handler indicates that this
can be used to account faults taken by the probe handlers. But, this has
never been the intention as is evident from the comment above 'nmissed'
in 'struct kprobe':

	/*count the number of times this probe was temporarily disarmed */
	unsigned long nmissed;

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210601120150.672652-1-naveen.n.rao@linux.vnet.ibm.com
2021-06-03 15:47:26 +02:00
Wende Tan da2d48808f
RISC-V: Fix memblock_free() usages in init_resources()
`memblock_free()` takes a physical address as its first argument.
Fix the wrong usages in `init_resources()`.

Fixes: ffe0e52612 ("RISC-V: Improve init_resources()")
Fixes: 797f0375dd ("RISC-V: Do not allocate memblock while iterating reserved memblocks")
Signed-off-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-01 21:16:42 -07:00
Peter Zijlstra ec6aba3d2b kprobes: Remove kprobe::fault_handler
The reason for kprobe::fault_handler(), as given by their comment:

 * We come here because instructions in the pre/post
 * handler caused the page_fault, this could happen
 * if handler tries to access user space by
 * copy_from_user(), get_user() etc. Let the
 * user-specified handler try to fix it first.

Is just plain bad. Those other handlers are ran from non-preemptible
context and had better use _nofault() functions. Also, there is no
upstream usage of this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20210525073213.561116662@infradead.org
2021-06-01 16:00:08 +02:00
Jisheng Zhang 3df952ae2a
riscv: Add __init section marker to some functions again
These functions are not needed after booting, so mark them as __init
to move them to the __init section.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29 13:39:27 -07:00
Jisheng Zhang 8c9f4940c2
riscv: kprobes: Remove redundant kprobe_step_ctx
Inspired by commit ba090f9caf ("arm64: kprobes: Remove redundant
kprobe_step_ctx"), the ss_pending and match_addr of kprobe_step_ctx
are redundant because those can be replaced by KPROBE_HIT_SS and
&cur_kprobe->ainsn.api.insn[0] + GET_INSN_LENGTH(cur->opcode)
respectively.

Remove the kprobe_step_ctx to simplify the code.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29 11:17:08 -07:00
Jisheng Zhang 37a7a2a10e
riscv: Turn has_fpu into a static key if FPU=y
The has_fpu check sits at hot code path: switch_to(). Currently, has_fpu
is a bool variable if FPU=y, switch_to() checks it each time, we can
optimize out this check by turning the has_fpu into a static key.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:56:57 -07:00
Kefeng Wang f842f5ff6a
riscv: Move setup_bootmem into paging_init
Make setup_bootmem() static.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:50 -07:00
Jisheng Zhang bab0d47c0e
riscv: kexec: Fix W=1 build warnings
Fixes the following W=1 build warning(s):

In file included from include/linux/kexec.h:28,
                 from arch/riscv/kernel/machine_kexec.c:7:
arch/riscv/include/asm/kexec.h:45:1: warning: ‘extern’ is not at beginning of declaration [-Wold-style-declaration]
   45 | const extern unsigned char riscv_kexec_relocate[];
      | ^~~~~
arch/riscv/include/asm/kexec.h:46:1: warning: ‘extern’ is not at beginning of declaration [-Wold-style-declaration]
   46 | const extern unsigned int riscv_kexec_relocate_size;
      | ^~~~~
arch/riscv/kernel/machine_kexec.c:125:6: warning: no previous prototype for ‘machine_shutdown’ [-Wmissing-prototypes]
  125 | void machine_shutdown(void)
      |      ^~~~~~~~~~~~~~~~
arch/riscv/kernel/machine_kexec.c:147:1: warning: no previous prototype for ‘machine_crash_shutdown’ [-Wmissing-prototypes]
  147 | machine_crash_shutdown(struct pt_regs *regs)
      | ^~~~~~~~~~~~~~~~~~~~~~
arch/riscv/kernel/machine_kexec.c:23: warning: Function parameter or member 'image' not described in 'kexec_image_info'
arch/riscv/kernel/machine_kexec.c:53: warning: Function parameter or member 'image' not described in 'machine_kexec_prepare'
arch/riscv/kernel/machine_kexec.c:114: warning: Function parameter or member 'image' not described in 'machine_kexec_cleanup'
arch/riscv/kernel/machine_kexec.c:148: warning: Function parameter or member 'regs' not described in 'machine_crash_shutdown'
arch/riscv/kernel/machine_kexec.c:167: warning: Function parameter or member 'image' not described in 'machine_kexec'

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 22:05:30 -07:00
Jisheng Zhang 02ccdeed18
riscv: kprobes: Fix build error when MMU=n
lkp reported a randconfig failure:

arch/riscv/kernel/probes/kprobes.c:90:22: error: use of undeclared identifier 'PAGE_KERNEL_READ_EXEC'

We implemented the alloc_insn_page() to allocate PAGE_KERNEL_READ_EXEC
page for kprobes insn page for STRICT_MODULE_RWX. But if MMU=n, we
should fall back to the generic weak alloc_insn_page() by generic
kprobe subsystem.

Fixes: cdd1b2bd35 ("riscv: kprobes: Implement alloc_insn_page()")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 22:04:58 -07:00
Chen Huang eac2f3059e
riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled
As [1] and [2] said, the arch_stack_walk should not to trace itself, or it will
leave the trace unexpectedly when called. The example is when we do "cat
/sys/kernel/debug/page_owner", all pages' stack is the same.

arch_stack_walk+0x18/0x20
stack_trace_save+0x40/0x60
register_dummy_stack+0x24/0x5e
init_page_owner+0x2e

So we use __builtin_frame_address(1) as the first frame to be walked. And mark
the arch_stack_walk() noinline.

We found that pr_cont will affact pages' stack whose task state is RUNNING when
testing "echo t > /proc/sysrq-trigger". So move the place of pr_cont and mark
the function dump_backtrace() noinline.

Also we move the case when task == NULL into else branch, and test for it in
"echo c > /proc/sysrq-trigger".

[1] https://lore.kernel.org/lkml/20210319184106.5688-1-mark.rutland@arm.com/
[2] https://lore.kernel.org/lkml/20210317142050.57712-1-chenjun102@huawei.com/

Signed-off-by: Chen Huang <chenhuang5@huawei.com>
Fixes: 5d8544e2d0 ("RISC-V: Generic library routines and assembly")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 18:59:20 -07:00
Valentin Schneider f1a0a376ca sched/core: Initialize the idle task with preemption disabled
As pointed out by commit

  de9b8f5dcb ("sched: Fix crash trying to dequeue/enqueue the idle thread")

init_idle() can and will be invoked more than once on the same idle
task. At boot time, it is invoked for the boot CPU thread by
sched_init(). Then smp_init() creates the threads for all the secondary
CPUs and invokes init_idle() on them.

As the hotplug machinery brings the secondaries to life, it will issue
calls to idle_thread_get(), which itself invokes init_idle() yet again.
In this case it's invoked twice more per secondary: at _cpu_up(), and at
bringup_cpu().

Given smp_init() already initializes the idle tasks for all *possible*
CPUs, no further initialization should be required. Now, removing
init_idle() from idle_thread_get() exposes some interesting expectations
with regards to the idle task's preempt_count: the secondary startup always
issues a preempt_disable(), requiring some reset of the preempt count to 0
between hot-unplug and hotplug, which is currently served by
idle_thread_get() -> idle_init().

Given the idle task is supposed to have preemption disabled once and never
see it re-enabled, it seems that what we actually want is to initialize its
preempt_count to PREEMPT_DISABLED and leave it there. Do that, and remove
init_idle() from idle_thread_get().

Secondary startups were patched via coccinelle:

  @begone@
  @@

  -preempt_disable();
  ...
  cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210512094636.2958515-1-valentin.schneider@arm.com
2021-05-12 13:01:45 +02:00
Rouven Czerwinski beaf5ae15a
riscv: remove unused handle_exception symbol
Since commit 79b1feba54 ("RISC-V: Setup exception vector early")
exception vectors are setup early and the handle_exception symbol from
the asm files is no longer referenced in traps.c. Remove it.

Signed-off-by: Rouven Czerwinski <rouven@czerwinskis.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-06 09:40:16 -07:00
Geert Uytterhoeven 8d91b09733
riscv: Consistify protect_kernel_linear_mapping_text_rodata() use
The various uses of protect_kernel_linear_mapping_text_rodata() are
not consistent:
  - Its definition depends on "64BIT && !XIP_KERNEL",
  - Its forward declaration depends on MMU,
  - Its single caller depends on "STRICT_KERNEL_RWX && 64BIT && MMU &&
    !XIP_KERNEL".

Fix this by settling on the dependencies of the caller, which can be
simplified as STRICT_KERNEL_RWX depends on "MMU && !XIP_KERNEL".
Provide a dummy definition, as the caller is protected by
"IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)" instead of "#ifdef
CONFIG_STRICT_KERNEL_RWX".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-06 09:40:15 -07:00
Linus Torvalds 939b7cbc00 RISC-V Patches for the 5.13 Merge Window, Part 1
* Support for the memtest= kernel command-line argument.
 * Support for building the kernel with FORTIFY_SOURCE.
 * Support for generic clockevent broadcasts.
 * Support for the buildtar build target.
 * Some build system cleanups to pass more LLVM-friendly arguments.
 * Support for kprobes.
 * A rearranged kernel memory map, the first part of supporting sv48
   systems.
 * Improvements to kexec, along with support for kdump and crash kernels.
 * An alternatives-based errata framework, along with support for
   handling a pair of errata that manifest on some SiFive designs
   (including the HiFive Unmatched).
 * Support for XIP.
 * A device tree for the Microchip PolarFire ICICLE SoC and associated
   dev board.
 
 Along with a bunch of cleanups.  There are already a handful of fixes
 on the list so there will likely be a part 2.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmCS4lITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieZqEACSihfcOgZ/oyGWN3chca917/yCWimM
 DOu37Zlh81TNPgzzJwbT44IY5sg/lSecwktxs665TChiJjr3JlM4jmz+u64KOTA8
 mTWhqZNr5zT9kFj/m3x0V9yYOVr9g43QRmIlc14d+8JaQDw0N8WeH/yK85/CXDSS
 X5gQK/e9q/yPf/NPyPuPm67jDsFnJERINWaAHI8lhA5fvFyy/xRLmSkuexchysss
 XOGfyxxX590jGLK1vD+5wccX7ZwfwU4jriTaxyah/VBl8QUur/xSPVyspHIdWiMG
 jrNXI1dg6oI861BdjryUpZI0iYJaRe5FRWUx7uTIqHfIyL/MnvYI7USVYOOPb72M
 yZgN903R++5NeUUVTzfXwaigTwfXAPB6USFqZpEfRAf204pgNybmznJWThAVBdYG
 rUixp7GsEMU3aAT2tE/iHR33JQxQfnZq8Tg43/4gB7MoACrzQrYrGcPnj9xssMyV
 F1hnao3dr+5Xjo3MwfkW9JvLPwvDuE3mdrdj+a0XZ45gbTJeuBhYxo3VOsFeijhQ
 gf/VYuoNn5iae9fiMzx5rlmFT9NJDYKDhla+BpAel84/6nRryyfCZCaE5FvDynOO
 CNQynaeJMIMEygPBYR9FVVCwm+EtVsz3NVFKEuo5ilQpgX8ipctxiqy2+moZALLN
 OWlEH6BKEgXqkw==
 =PsA8
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the memtest= kernel command-line argument.

 - Support for building the kernel with FORTIFY_SOURCE.

 - Support for generic clockevent broadcasts.

 - Support for the buildtar build target.

 - Some build system cleanups to pass more LLVM-friendly arguments.

 - Support for kprobes.

 - A rearranged kernel memory map, the first part of supporting sv48
   systems.

 - Improvements to kexec, along with support for kdump and crash
   kernels.

 - An alternatives-based errata framework, along with support for
   handling a pair of errata that manifest on some SiFive designs
   (including the HiFive Unmatched).

 - Support for XIP.

 - A device tree for the Microchip PolarFire ICICLE SoC and associated
   dev board.

... along with a bunch of cleanups.  There are already a handful of fixes
on the list so there will likely be a part 2.

* tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (45 commits)
  RISC-V: Always define XIP_FIXUP
  riscv: Remove 32b kernel mapping from page table dump
  riscv: Fix 32b kernel build with CONFIG_DEBUG_VIRTUAL=y
  RISC-V: Fix error code returned by riscv_hartid_to_cpuid()
  RISC-V: Enable Microchip PolarFire ICICLE SoC
  RISC-V: Initial DTS for Microchip ICICLE board
  dt-bindings: riscv: microchip: Add YAML documentation for the PolarFire SoC
  RISC-V: Add Microchip PolarFire SoC kconfig option
  RISC-V: enable XIP
  RISC-V: Add crash kernel support
  RISC-V: Add kdump support
  RISC-V: Improve init_resources()
  RISC-V: Add kexec support
  RISC-V: Add EM_RISCV to kexec UAPI header
  riscv: vdso: fix and clean-up Makefile
  riscv/mm: Use BUG_ON instead of if condition followed by BUG.
  riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobe
  riscv: Set ARCH_HAS_STRICT_MODULE_RWX if MMU
  riscv: module: Create module allocations without exec permissions
  riscv: bpf: Avoid breaking W^X
  ...
2021-05-06 09:24:18 -07:00
Anup Patel 533b4f3a78
RISC-V: Fix error code returned by riscv_hartid_to_cpuid()
We should return a negative error code upon failure in
riscv_hartid_to_cpuid() instead of NR_CPUS. This is also
aligned with all uses of riscv_hartid_to_cpuid() which
expect negative error code upon failure.

Fixes: 6825c7a80f ("RISC-V: Add logical CPU indexing for RISC-V")
Fixes: f99fb607fb ("RISC-V: Use Linux logical CPU number instead of hartid")
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-01 08:53:19 -07:00
Vitaly Wool 44c9225729
RISC-V: enable XIP
Introduce XIP (eXecute In Place) support for RISC-V platforms.
It allows code to be executed directly from non-volatile storage
directly addressable by the CPU, such as QSPI NOR flash which can
be found on many RISC-V platforms. This makes way for significant
optimization of RAM footprint. The XIP kernel is not compressed
since it has to run directly from flash, so it will occupy more
space on the non-volatile storage. The physical flash address used
to link the kernel object files and for storing it has to be known
at compile time and is represented by a Kconfig option.

XIP on RISC-V will for the time being only work on MMU-enabled
kernels.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
[Alex: Rebase on top of "Move kernel mapping outside the linear mapping" ]
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
[Palmer: disable XIP for allyesconfig]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26 08:31:28 -07:00
Nick Kossifidis 5640975003
RISC-V: Add crash kernel support
This patch allows Linux to act as a crash kernel for use with
kdump. Userspace will let the crash kernel know about the
memory region it can use through linux,usable-memory property
on the /memory node (overriding its reg property), and about the
memory region where the elf core header of the previous kernel
is saved, through a reserved-memory node with a compatible string
of "linux,elfcorehdr". This approach is the least invasive and
re-uses functionality already present.

I tested this on riscv64 qemu and it works as expected, you
may test it by retrieving the dmesg of the previous kernel
through /proc/vmcore, using the vmcore-dmesg utility from
kexec-tools.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26 08:25:24 -07:00
Nick Kossifidis e53d28180d
RISC-V: Add kdump support
This patch adds support for kdump, the kernel will reserve a
region for the crash kernel and jump there on panic. In order
for userspace tools (kexec-tools) to prepare the crash kernel
kexec image, we also need to expose some information on
/proc/iomem for the memory regions used by the kernel and for
the region reserved for crash kernel. Note that on userspace
the device tree is used to determine the system's memory
layout so the "System RAM" on /proc/iomem is ignored.

I tested this on riscv64 qemu and works as expected, you may
test it by triggering a crash through /proc/sysrq_trigger:

echo c > /proc/sysrq_trigger

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26 08:25:23 -07:00