.. SPDX-License-Identifier: GPL-2.0

=====================================
Virtual Memory Layout on RISC-V Linux
=====================================

:Author: Alexandre Ghiti <alex@ghiti.fr>
:Date: 12 February 2021

This document describes the virtual memory layout used by the RISC-V Linux
Kernel.

RISC-V Linux Kernel 32bit
=========================

RISC-V Linux Kernel SV32
------------------------

TODO

RISC-V Linux Kernel 64bit
=========================

The RISC-V privileged architecture specification states that 64-bit virtual
addresses "must have bits 63–48 all equal to bit 47, or else a page-fault
exception will occur." This splits the virtual address space into two halves
separated by a very large hole of non-canonical addresses: the lower half is
where userspace resides and the upper half is where the RISC-V Linux Kernel
resides. The quoted rule is the SV48 one; SV39 and SV57 apply the same
requirement to bits 63–39 (equal to bit 38) and bits 63–57 (equal to bit 56)
respectively.

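
As a minimal sketch (not kernel code; all function names are hypothetical),
the canonical-address rule above can be expressed in C. It assumes that
``va_bits`` is 39, 48 or 57 for SV39, SV48 and SV57 respectively:

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Hypothetical helper, not a kernel API: returns true when bits
     * 63..(va_bits - 1) of va are all equal, i.e. when va is a
     * sign-extended va_bits-wide address.
     */
    static bool va_is_canonical(uint64_t va, unsigned int va_bits)
    {
        uint64_t upper = va >> (va_bits - 1);  /* bits 63..(va_bits - 1) */
        uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

        return upper == 0 || upper == all_ones;
    }

    /* Last address of the lower (userspace) half. */
    static uint64_t user_half_end(unsigned int va_bits)
    {
        return (UINT64_C(1) << (va_bits - 1)) - 1;
    }

    /* First address of the upper (kernel) half. */
    static uint64_t kernel_half_start(unsigned int va_bits)
    {
        return ~user_half_end(va_bits);
    }

With ``va_bits`` set to 39, ``user_half_end()`` evaluates to
``0x0000003fffffffff`` and ``kernel_half_start()`` to ``0xffffffc000000000``;
every address strictly between the two is non-canonical.
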

RISC-V Linux Kernel SV39
------------------------

::

  ========================================================================================================================
      Start addr    |   Offset   |     End addr     |  Size   | VM area description
  ========================================================================================================================
                    |            |                  |         |
   0000000000000000 |    0       | 0000003fffffffff |  256 GB | user-space virtual memory, different per mm
  __________________|____________|__________________|_________|___________________________________________________________
                    |            |                  |         |
   0000004000000000 | +256    GB | ffffffbfffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
                    |            |                  |         |     virtual memory addresses up to the -256 GB
                    |            |                  |         |     starting offset of kernel mappings.
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Kernel-space virtual memory, shared between all processes:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ffffffc6fee00000 | -228    GB | ffffffc6feffffff |    2 MB | fixmap
   ffffffc6ff000000 | -228    GB | ffffffc6ffffffff |   16 MB | PCI io
   ffffffc700000000 | -228    GB | ffffffc7ffffffff |    4 GB | vmemmap
   ffffffc800000000 | -224    GB | ffffffd7ffffffff |   64 GB | vmalloc/ioremap space
   ffffffd800000000 | -160    GB | fffffff6ffffffff |  124 GB | direct mapping of all physical memory
   fffffff700000000 |  -36    GB | fffffffeffffffff |   32 GB | kasan
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              |
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | modules, BPF
   ffffffff80000000 |   -2    GB | ffffffffffffffff |    2 GB | kernel
  __________________|____________|__________________|_________|___________________________________________________________

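
The Size and Offset columns of the SV39 table can be recomputed from the start
and end addresses. The following sketch is illustrative only:
``struct vm_region`` and both helpers are hypothetical names, and the rows are
copied from the table above rather than taken from kernel headers:

.. code-block:: c

    #include <stdint.h>

    /* One row of the layout table (illustrative type, not a kernel one). */
    struct vm_region {
        const char *name;
        uint64_t start;
        uint64_t end;   /* inclusive, as printed in the table */
    };

    /* Kernel-space rows copied from the SV39 table. */
    static const struct vm_region sv39_regions[] = {
        { "fixmap",                0xffffffc6fee00000ULL, 0xffffffc6feffffffULL },
        { "PCI io",                0xffffffc6ff000000ULL, 0xffffffc6ffffffffULL },
        { "vmemmap",               0xffffffc700000000ULL, 0xffffffc7ffffffffULL },
        { "vmalloc/ioremap space", 0xffffffc800000000ULL, 0xffffffd7ffffffffULL },
        { "direct mapping",        0xffffffd800000000ULL, 0xfffffff6ffffffffULL },
        { "kasan",                 0xfffffff700000000ULL, 0xfffffffeffffffffULL },
        { "modules, BPF",          0xffffffff00000000ULL, 0xffffffff7fffffffULL },
        { "kernel",                0xffffffff80000000ULL, 0xffffffffffffffffULL },
    };

    /* Size in bytes of the inclusive range [start, end]. */
    static uint64_t region_size(const struct vm_region *r)
    {
        return r->end - r->start + 1;
    }

    /*
     * Distance in bytes from start up to the top of the 64-bit address
     * space (2^64 - start, via unsigned wraparound). The table prints
     * this value as a negative offset.
     */
    static uint64_t region_offset_from_top(const struct vm_region *r)
    {
        return (uint64_t)0 - r->start;
    }

For the vmalloc/ioremap row, ``region_size()`` gives 64 GB and
``region_offset_from_top()`` gives 224 GB, which the table lists as -224 GB.
Offsets in the table are rounded to the unit shown, so rows that do not start
on a GB boundary (such as fixmap) only match approximately.
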

RISC-V Linux Kernel SV48
------------------------

::

  ========================================================================================================================
      Start addr    |   Offset   |     End addr     |  Size   | VM area description
  ========================================================================================================================
                    |            |                  |         |
   0000000000000000 |    0       | 00007fffffffffff |  128 TB | user-space virtual memory, different per mm
  __________________|____________|__________________|_________|___________________________________________________________
                    |            |                  |         |
   0000800000000000 | +128    TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
                    |            |                  |         |     virtual memory addresses up to the -128 TB
                    |            |                  |         |     starting offset of kernel mappings.
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Kernel-space virtual memory, shared between all processes:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ffff8d7ffee00000 |  -114.5 TB | ffff8d7ffeffffff |    2 MB | fixmap
   ffff8d7fff000000 |  -114.5 TB | ffff8d7fffffffff |   16 MB | PCI io
   ffff8d8000000000 |  -114.5 TB | ffff8f7fffffffff |    2 TB | vmemmap
   ffff8f8000000000 |  -112.5 TB | ffffaf7fffffffff |   32 TB | vmalloc/ioremap space
   ffffaf8000000000 |   -80.5 TB | ffffef7fffffffff |   64 TB | direct mapping of all physical memory
   ffffef8000000000 |   -16.5 TB | fffffffeffffffff | 16.5 TB | kasan
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Identical layout to the 39-bit one from here on:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | modules, BPF
   ffffffff80000000 |   -2    GB | ffffffffffffffff |    2 GB | kernel
  __________________|____________|__________________|_________|___________________________________________________________


RISC-V Linux Kernel SV57
------------------------

::

  ========================================================================================================================
      Start addr    |   Offset   |     End addr     |  Size   | VM area description
  ========================================================================================================================
                    |            |                  |         |
   0000000000000000 |    0       | 00ffffffffffffff |   64 PB | user-space virtual memory, different per mm
  __________________|____________|__________________|_________|___________________________________________________________
                    |            |                  |         |
   0100000000000000 |  +64    PB | feffffffffffffff | ~16K PB | ... huge, almost 64 bits wide hole of non-canonical
                    |            |                  |         |     virtual memory addresses up to the -64 PB
                    |            |                  |         |     starting offset of kernel mappings.
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Kernel-space virtual memory, shared between all processes:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ff1bfffffee00000 |  -57    PB | ff1bfffffeffffff |    2 MB | fixmap
   ff1bffffff000000 |  -57    PB | ff1bffffffffffff |   16 MB | PCI io
   ff1c000000000000 |  -57    PB | ff1fffffffffffff |    1 PB | vmemmap
   ff20000000000000 |  -56    PB | ff5fffffffffffff |   16 PB | vmalloc/ioremap space
   ff60000000000000 |  -40    PB | ffdeffffffffffff |   32 PB | direct mapping of all physical memory
   ffdf000000000000 |   -8    PB | fffffffeffffffff |    8 PB | kasan
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Identical layout to the 39-bit one from here on:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | modules, BPF
   ffffffff80000000 |   -2    GB | ffffffffffffffff |    2 GB | kernel
  __________________|____________|__________________|_________|___________________________________________________________