Граф коммитов

966602 Коммитов

Автор SHA1 Сообщение Дата
Christophe Leroy 899367ea50 powerpc/vdso: Remove runtime generated sigtramp offsets
Signal trampoline offsets are now generated at buildtime.

Runtime generated offsets are not used anymore, remove them.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7c192d35a437151837cf4c48aeccb42380d6daac.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:18 +11:00
Christophe Leroy 49bf59fd03 powerpc/vdso: Remove __kernel_datapage_offset
__kernel_datapage_offset is not used anymore, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ddb5c746bec4e1a026d7c85243213a1876ef844f.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:18 +11:00
Christophe Leroy b7fe9c15b5 powerpc/vdso: Remove vdso32_pages and vdso64_pages
vdso32_pages and vdso64_pages are not used anymore.

Remove them.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bce021f616cbaf39dfb5766cf7ef114adcb918d9.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:18 +11:00
Christophe Leroy 0fc980db9a powerpc/vdso: Merge __kernel_sync_dicache_p5() into __kernel_sync_dicache()
__kernel_sync_dicache_p5() is an alternative to
__kernel_sync_dicache() when cpu has CPU_FTR_COHERENT_ICACHE

Remove this alternative function and merge
__kernel_sync_dicache_p5() into __kernel_sync_dicache() using
standard CPU feature fixup.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4c7dcc6544882761b2b0249d7a8ec2c3a8088cb5.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy ed07f6353d powerpc/vdso: Use builtin symbols to locate fixup section
Add builtin symbols to locate fixup section and use them
instead of locating sections through elf headers at runtime.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/2954526981859ca1ccfcfc7a7c4263920e9ddfcb.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy 91bf695596 powerpc/vdso: Retrieve sigtramp offsets at buildtime
This is copied from arm64.

Instead of using runtime generated signal trampoline offsets,
get offsets at buildtime.

If the said trampoline doesn't exist, build will fail. So no
need to check whether the trampoline exists or not in the VDSO.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f8bfd6812c3e3678b1cdb4d55a52f9eb022b40d3.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy 550e6074c1 powerpc/vdso: Remove unused \tmp param in __get_datapage()
The \tmp param is not used anymore, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4b13f897dcccce8ae03c031a4598cf26b32e2f1c.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy 591857b635 powerpc/vdso: Simplify __get_datapage()
The VDSO datapage and the text pages are always located immediately
next to each other, so it can be hardcoded without an indirection
through __kernel_datapage_offset

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b08f5ef99d64cfc38f79b7ad5310d9b4d2479eeb.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy 511157ab64 powerpc/vdso: Move vdso datapage up front
Move the vdso datapage in front of the VDSO area,
before vdso test.

This will allow to remove the __kernel_datapage_offset symbol
and simplify __get_datapage() in following patches.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b68c99b6e8ee0b1d99bfa4c7e34c359fc1bc1000.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy c102f07667 powerpc/vdso: Replace vdso_base by vdso
All other architectures but s390 use a void pointer named 'vdso'
to reference the VDSO mapping.

In a following patch, the VDSO data page will be put in front of
text, vdso_base will then not anymore point to VDSO text.

To avoid confusion between vdso_base and VDSO text, rename vdso_base
into vdso and make it a void __user *.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8e6cefe474aa4ceba028abb729485cd46c140990.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Christophe Leroy 526a9c4a72 powerpc/vdso: Provide vdso_remap()
Provide vdso_remap() through _install_special_mapping() and
drop arch_remap().

This adds a test of the size and returns -EINVAL if the size
is not correct.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/373c66f768fa9cc8890f3b55462209a98c522326.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Christophe Leroy c1bab64360 powerpc/vdso: Move to _install_special_mapping() and remove arch_vma_name()
Copied from commit 2fea7f6c98 ("arm64: vdso: move to
_install_special_mapping and remove arch_vma_name").

Use the new _install_special_mapping() API added by
commit a62c34bd2a ("x86, mm: Improve _install_special_mapping
and fix x86 vdso naming") which obsolete install_special_mapping().

And remove arch_vma_name() as the name is handled by the new API.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: kernel test robot <lkp@intel.com>
[mpe: Squash fix to use PTR_ERR_OR_ZERO() from lkp]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e7e5dfe0f93234e31051f2a610b4b07f50b0082f.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Christophe Leroy b2df3f60b4 powerpc/vdso: Simplify arch_setup_additional_pages() exit
To simplify arch_setup_additional_pages() exit, rename
it __arch_setup_additional_pages() and create a caller
arch_setup_additional_pages() which does the locking.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/603c1d039d3f928ee95e547fcd2219fcf4c3b514.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Christophe Leroy 7461a4f79b powerpc/vdso: Use VDSO size in arch_setup_additional_pages()
In arch_setup_additional_pages(), instead of using number of VDSO
pages and recalculate VDSO size, directly use the VDSO size.

As vdso_ready is set, vdso_pages can't be 0 so just remove the test.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4edfa548c3885a430b765335dc720105716e273f.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Christophe Leroy 4fe0e3c172 powerpc/vdso: Remove unnecessary ifdefs in vdso_pagelist initialization
No need of all those #ifdefs around the pagelist initialisation,
use IS_ENABLED(), GCC will kick out unused static variables.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f9333432e329b1fcbbbf846cb1cd4a1c4127a60b.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Christophe Leroy 3cf6382541 powerpc/vdso: Refactor 32 bits and 64 bits pages setup
The setup of VDSO pages is identical for 32 bits VDSO and
64 bits VDSO.

Refactor that setup.

And use &vdsoXX_start which is synonym of vdsoXX_kbase.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/269ffb54c37fc1d46128f77d7a39f88ef4a9957d.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy 35c1c7c0bc powerpc/vdso: Remove NULL termination element in vdso_pagelist
No need of a NULL last element in pagelists, install_special_mapping()
knows how long the list is.

Remove that element.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e58d95ab859e3cbc9bae3c9ce2959e17d2864f5d.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy abcdbd039e powerpc/vdso: Remove get_page() in vdso_pagelist initialization
Partly copied from commit 16fb1a9bec ("arm64: vdso: clean up
vdso_pagelist initialization").

No need to get_page() the vdso text/data - these are part of the
kernel image.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9d14540bd10832b6c9519d74fb5728fdc4974b36.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy 1bb30b7a45 powerpc/vdso: Rename syscall_map_32/64 to simplify vdso_setup_syscall_map()
Today vdso_data structure has:
- syscall_map_32[] and syscall_map_64[] on PPC64
- syscall_map_32[] on PPC32

On PPC32, syscall_map_32[] is populated using sys_call_table[].

On PPC64, syscall_map_64[] is populated using sys_call_table[]
and syscal_map_32[] is populated using compat_sys_call_table[].

To simplify vdso_setup_syscall_map(),
- On PPC32 rename syscall_map_32[] into syscall_map[],
- On PPC64 rename syscall_map_64[] into syscall_map[],
- On PPC64 rename syscall_map_32[] into compat_syscall_map[].

That way, syscall_map[] gets populated using sys_call_table[] and
compat_syscall_map[] gets population using compat_sys_call_table[].

Also define an empty compat_syscall_map[] on PPC32 to avoid ifdefs.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/472734be0d9991eee320a06824219a5b2663736b.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy bc9d5bfc4d powerpc/vdso: Add missing includes and clean vdso_setup_syscall_map()
Instead of including extern references locally in
vdso_setup_syscall_map(), add the missing headers.

sys_ni_syscall() being a function, cast its address to
an unsigned long instead of declaring it as a fake
unsigned long object.

At the same time, remove a comment which paraphrases the
function name.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b4afedce748ed2858299ceab5ae29b52109263ef.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy 7fe2de246e powerpc/vdso: Stripped VDSO is not needed, don't build it
Since commit 24b659a138 ("powerpc: Use unstripped VDSO image for
more accurate profiling data"), only the unstripped VDSO image
has been used.

Partially revert commit 8150caad02 ("[POWERPC] powerpc vDSO: install
unstripped copies on disk") to avoid building the stripped version.

And the unstripped version in $(MODLIB)/vdso/ is not required
anymore as it is the one embedded in the kernel image.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5986ca25be44fe6e9790486304507f240077d8c4.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy ef75e73182 powerpc/signal32: Transform save_user_regs() and save_tm_user_regs() in 'unsafe' version
Change those two functions to be used within a user access block.

For that, change save_general_regs() to and unsafe_save_general_regs(),
then replace all user accesses by unsafe_ versions.

This series leads to a reduction from 2.55s to 1.73s of
the system CPU time with the following microbench app
on an mpc832x with KUAP (approx 32%)

Without KUAP, the difference is in the noise.

	void sigusr1(int sig) { }

	int main(int argc, char **argv)
	{
		int i = 100000;

		signal(SIGUSR1, sigusr1);
		for (;i--;)
		    raise(SIGUSR1);
		exit(0);
	}

An additional 0.10s reduction is achieved by removing
CONFIG_PPC_FPU, as the mpc832x has no FPU.

A bit less spectacular on an 8xx as KUAP is less heavy, prior to
the series (with KUAP) it ran in 8.10 ms. Once applies the removal
of FPU regs handling, we get 7.05s. With the full series, we get 6.9s.
If artificially re-activating FPU regs handling with the full series,
we get 7.6s.

So for the 8xx, the removal of the FPU regs copy is what makes the
difference, but the rework of handle_signal also have a benefit.

Same as above, without KUAP the difference is in the noise.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fixup typo in SPE handling]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c7b37b385ccf9666066452e58f018a86573f83e8.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:15 +11:00
Christophe Leroy 968c4fccd1 powerpc/signal32: Isolate non-copy actions in save_user_regs() and save_tm_user_regs()
Reorder actions in save_user_regs() and save_tm_user_regs() to
regroup copies together in order to switch to user_access_begin()
logic in a later patch.

Move non-copy actions into new functions called
prepare_save_user_regs() and prepare_save_tm_user_regs().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f6eac65781b4a57220477c8864bca2b57f29a5d5.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:14 +11:00
Christophe Leroy b3484a1d4d powerpc/signal: Create 'unsafe' versions of copy_[ck][fpr/vsx]_to_user()
For the non VSX version, that's trivial. Just use unsafe_copy_to_user()
instead of __copy_to_user().

For the VSX version, remove the intermediate step through a buffer and
use unsafe_put_user() directly. This generates a far smaller code which
is acceptable to inline, see below:

Standard VSX version:

0000000000000000 <.copy_fpr_to_user>:
   0:	7c 08 02 a6 	mflr    r0
   4:	fb e1 ff f8 	std     r31,-8(r1)
   8:	39 00 00 20 	li      r8,32
   c:	39 24 0b 80 	addi    r9,r4,2944
  10:	7d 09 03 a6 	mtctr   r8
  14:	f8 01 00 10 	std     r0,16(r1)
  18:	f8 21 fe 71 	stdu    r1,-400(r1)
  1c:	39 41 00 68 	addi    r10,r1,104
  20:	e9 09 00 00 	ld      r8,0(r9)
  24:	39 4a 00 08 	addi    r10,r10,8
  28:	39 29 00 10 	addi    r9,r9,16
  2c:	f9 0a 00 00 	std     r8,0(r10)
  30:	42 00 ff f0 	bdnz    20 <.copy_fpr_to_user+0x20>
  34:	e9 24 0d 80 	ld      r9,3456(r4)
  38:	3d 42 00 00 	addis   r10,r2,0
			3a: R_PPC64_TOC16_HA	.toc
  3c:	eb ea 00 00 	ld      r31,0(r10)
			3e: R_PPC64_TOC16_LO_DS	.toc
  40:	f9 21 01 70 	std     r9,368(r1)
  44:	e9 3f 00 00 	ld      r9,0(r31)
  48:	81 29 00 20 	lwz     r9,32(r9)
  4c:	2f 89 00 00 	cmpwi   cr7,r9,0
  50:	40 9c 00 18 	bge     cr7,68 <.copy_fpr_to_user+0x68>
  54:	4c 00 01 2c 	isync
  58:	3d 20 40 00 	lis     r9,16384
  5c:	79 29 07 c6 	rldicr  r9,r9,32,31
  60:	7d 3d 03 a6 	mtspr   29,r9
  64:	4c 00 01 2c 	isync
  68:	38 a0 01 08 	li      r5,264
  6c:	38 81 00 70 	addi    r4,r1,112
  70:	48 00 00 01 	bl      70 <.copy_fpr_to_user+0x70>
			70: R_PPC64_REL24	.__copy_tofrom_user
  74:	60 00 00 00 	nop
  78:	e9 3f 00 00 	ld      r9,0(r31)
  7c:	81 29 00 20 	lwz     r9,32(r9)
  80:	2f 89 00 00 	cmpwi   cr7,r9,0
  84:	40 9c 00 18 	bge     cr7,9c <.copy_fpr_to_user+0x9c>
  88:	4c 00 01 2c 	isync
  8c:	39 20 ff ff 	li      r9,-1
  90:	79 29 00 44 	rldicr  r9,r9,0,1
  94:	7d 3d 03 a6 	mtspr   29,r9
  98:	4c 00 01 2c 	isync
  9c:	38 21 01 90 	addi    r1,r1,400
  a0:	e8 01 00 10 	ld      r0,16(r1)
  a4:	eb e1 ff f8 	ld      r31,-8(r1)
  a8:	7c 08 03 a6 	mtlr    r0
  ac:	4e 80 00 20 	blr

'unsafe' simulated VSX version (The ... are only nops) using
unsafe_copy_fpr_to_user() macro:

unsigned long copy_fpr_to_user(void __user *to,
			       struct task_struct *task)
{
	unsafe_copy_fpr_to_user(to, task, failed);
	return 0;
failed:
	return 1;
}

0000000000000000 <.copy_fpr_to_user>:
   0:	39 00 00 20 	li      r8,32
   4:	39 44 0b 80 	addi    r10,r4,2944
   8:	7d 09 03 a6 	mtctr   r8
   c:	7c 69 1b 78 	mr      r9,r3
...
  20:	e9 0a 00 00 	ld      r8,0(r10)
  24:	f9 09 00 00 	std     r8,0(r9)
  28:	39 4a 00 10 	addi    r10,r10,16
  2c:	39 29 00 08 	addi    r9,r9,8
  30:	42 00 ff f0 	bdnz    20 <.copy_fpr_to_user+0x20>
  34:	e9 24 0d 80 	ld      r9,3456(r4)
  38:	f9 23 01 00 	std     r9,256(r3)
  3c:	38 60 00 00 	li      r3,0
  40:	4e 80 00 20 	blr
...
  50:	38 60 00 01 	li      r3,1
  54:	4e 80 00 20 	blr

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/29f6c4b8e7a5bbc61e6a8801b78bbf493f9f819e.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:14 +11:00
Christophe Leroy 31147d7d61 powerpc/signal32: Switch swap_context() to user_access_begin() logic
As this was the last user of put_sigset_t(), remove it as well.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c3ac4f2d134a3391bb51bdaa2d00e9a409aba9f8.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:14 +11:00
Christophe Leroy de781ebdf6 powerpc/signal32: Add and use unsafe_put_sigset_t()
put_sigset_t() calls copy_to_user() for copying two words.

This is terribly inefficient for copying two words.

By switching to unsafe_put_user(), we end up with something as
simple as:

 3cc:   81 3d 00 00     lwz     r9,0(r29)
 3d0:   91 26 00 b4     stw     r9,180(r6)
 3d4:   81 3d 00 04     lwz     r9,4(r29)
 3d8:   91 26 00 b8     stw     r9,184(r6)

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/06def97e87ac1c4ae8e3197e0982e1fab7b3c8ae.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:14 +11:00
Christophe Leroy 14026b94cc signal: Add unsafe_put_compat_sigset()
Implement 'unsafe' version of put_compat_sigset()

For the bigendian, use unsafe_put_user() directly
to avoid intermediate copy through the stack.

For the littleendian, use a straight unsafe_copy_to_user().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/537c7082ee309a0bb9c67a50c5d9dd929aedb82d.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:14 +11:00
Christophe Leroy f1cf4f93de powerpc/signal32: Remove ifdefery in middle of if/else
MSR_TM_ACTIVE() is always defined and returns always 0 when
CONFIG_PPC_TRANSACTIONAL_MEM is not selected, so the awful
ifdefery in the middle of an if/else can be removed.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f3c36d687e4228f58d5c207a4036aa9ddcc7420a.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:14 +11:00
Christophe Leroy 9504db3e90 powerpc/signal32: Switch handle_rt_signal32() to user_access_begin() logic
On the same way as handle_signal32(), replace all user
accesses with equivalent unsafe_ versions, and move the
trampoline code icache flush outside the user access block.

Functions that have no unsafe_ equivalent also remains outside
the access block.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/2974314226256f958e2984912b48883ef1754185.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:13 +11:00
Christophe Leroy ad65f4909f powerpc/signal32: Switch handle_signal32() to user_access_begin() logic
Replace the access_ok() by user_access_begin() and change all user
accesses to unsafe_ version.

Move flush_icache_range() outside the user access block.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a27797f781aa00da96f8284c898173d18e952361.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:13 +11:00
Christophe Leroy 8d33001dd6 powerpc/signal32: Move signal trampoline setup to handle_[rt_]signal32
Move signal trampoline setup into handle_signal32()
and handle_rt_signal32().

At the same time, remove the define which hides the mc_pad field
used for trampoline.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e439cc0fa35aa45da6776520777a61848b92fd4b.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:13 +11:00
Christophe Leroy 91b8ecd419 powerpc/signal32: Misc changes to make handle_[rt_]_signal32() more similar
Miscellaneous changes to clean and make handle_signal32() and
handle_rt_signal32() even more similar.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/df0bc8c3b8fa96390c46f611df79b2a94ac21844.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:13 +11:00
Christophe Leroy 8e91cf8501 powerpc/signal32: Rename local pointers in handle_rt_signal32()
Rename pointers in handle_rt_signal32() to make it more similar to
handle_signal32()

tm_frame becomes tm_mctx
frame becomes mctx
rt_sf becomes frame

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/be77477b0f05397876015b218e36548ee8f5e10b.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:13 +11:00
Christophe Leroy 3eea688be0 powerpc/signal32: Move handle_signal32() close to handle_rt_signal32()
Those two functions are similar and serving the same purpose.
To ease refactorisation, move them close to each other.

This is pure move, no code change, no cosmetic. Yes, checkpatch is
not happy, most will clear later.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/dbce67900bf566bcf40179467bf1eb500814c405.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:13 +11:00
Christophe Leroy debf122c77 powerpc/signal32: Simplify logging in handle_rt_signal32()
If something is bad in the frame, there is no point in
knowing which part of the frame exactly is wrong as it
got allocated as a single block.

Always print the root address of the frame in case of
failed user access, just like handle_signal32().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/691895bd31fee89a2d8370befd66ad4eff5b63f2.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy 7fe8f773ee powerpc/signal: Refactor bad frame logging
The logging of bad frame appears half a dozen of times
and is pretty similar.

Create signal_fault() fonction to perform that logging.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/fa094445c119fc00315e1c13783b493346306c6a.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy c180cb305c powerpc/signal: Call get_tm_stackpointer() from get_sigframe()
Instead of calling get_tm_stackpointer() from the caller, call it
directly from get_sigframe(). This avoids a double call and
allows get_tm_stackpointer() to become static and be inlined
into get_sigframe() by GCC.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/abfdc105b8b28c4eb3ab9a26297d17f302b600ea.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy 0ecbc6ad18 powerpc/signal: Remove get_clean_sp()
get_clean_sp() is only used once in kernel/signal.c .

GCC is smart enough to see that x & 0xffffffff is a nop
calculation on PPC32, no need of a special PPC32 trivial version.

Include the logic from the PPC64 version of get_clean_sp() directly
in get_sigframe().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/13ef6510ce30a4867e043157b93af5bb8c67fb3b.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy 454b1abb58 powerpc/signal: Move access_ok() out of get_sigframe()
This access_ok() will soon be performed by user_access_begin().
So move it out of get_sigframe().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/900b93744732ed0887f28f5b6a40730fb04a43fa.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy 3fcfb5d1bf powerpc/signal: Remove BUG_ON() in handler_signal functions
There is already the same BUG_ON() check in do_signal() which
is the only caller of handle_rt_signal64() handle_rt_signal32() and
handle_signal32().

Remove those three redundant BUG_ON().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3582e10a341d523c9c3f1ac925c3aaefc9d9293d.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy 7d68c89169 powerpc/32s: Allow deselecting CONFIG_PPC_FPU on mpc832x
The e300c2 core which is embedded in mpc832x CPU doesn't have
an FPU.

Make it possible to not select CONFIG_PPC_FPU when building a
kernel dedicated to that target.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/fcdc60d85baf80eaa0a7f3261d9d889282068216.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:12 +11:00
Christophe Leroy b6254ced4d powerpc/signal: Don't manage floating point regs when no FPU
There is no point in copying floating point regs when there
is no FPU and MATH_EMULATION is not selected.

Create a new CONFIG_PPC_FPU_REGS bool that is selected by
CONFIG_MATH_EMULATION and CONFIG_PPC_FPU, and use it to
opt out everything related to fp_state in thread_struct.

The asm const used only by fpu.S are opted out with CONFIG_PPC_FPU
as fpu.S build is conditionnal to CONFIG_PPC_FPU.

The following app spends approx 8.1 seconds system time on an 8xx
without the patch, and 7.0 seconds with the patch (13.5% reduction).

On an 832x, it spends approx 2.6 seconds system time without
the patch and 2.1 seconds with the patch (19% reduction).

	void sigusr1(int sig) { }

	int main(int argc, char **argv)
	{
		int i = 100000;

		signal(SIGUSR1, sigusr1);
		for (;i--;)
			raise(SIGUSR1);
		exit(0);
	}

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7569070083e6cd5b279bb5023da601aba3c06f3c.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:11 +11:00
Christophe Leroy 4d90eb97e2 powerpc/ptrace: Create ptrace_get_fpr() and ptrace_put_fpr()
On the same model as ptrace_get_reg() and ptrace_put_reg(),
create ptrace_get_fpr() and ptrace_put_fpr() to get/set
the floating points registers.

We move the boundary checkings in them.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/24a1baedea7f7ae7b6bf27be98bab6d01b5ca2c1.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:11 +11:00
Christophe Leroy e009fa4335 powerpc/ptrace: Consolidate reg index calculation
Today we have:

	#ifdef CONFIG_PPC32
		index = addr >> 2;
		if ((addr & 3) || child->thread.regs == NULL)
	#else
		index = addr >> 3;
		if ((addr & 7))
	#endif

sizeof(long) has value 4 for PPC32 and value 8 for PPC64.

Dividing by 4 is equivalent to >> 2 and dividing by 8 is equivalent
to >> 3.

And 3 and 7 are respectively (sizeof(long) - 1).

Use sizeof(long) to get rid of the #ifdef CONFIG_PPC32 and consolidate
the calculation and checking.

thread.regs have to be not NULL on both PPC32 and PPC64 so adding
that test on PPC64 is harmless.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3cd1e284e93c60db981659585e18d1f6bb73ed2f.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:11 +11:00
Christophe Leroy 67e364b329 powerpc/ptrace: Move declaration of ptrace_get_reg() and ptrace_set_reg()
ptrace_get_reg() and ptrace_set_reg() are only used internally by
ptrace.

Move them in arch/powerpc/kernel/ptrace/ptrace-decl.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/376c258267aeae54a4423bc4a2e107a9611f0039.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:11 +11:00
Christophe Leroy 95593e930d powerpc/signal: Move inline functions in signal.h
To really be inlined, the functions need to be defined in the
same C file as the caller, or in an included header.

Move functions defined inline from signal .c in signal.h

Fixes: 3dd4eb83a9 ("powerpc: move common register copy functions from signal_32.c to signal.c")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/35b1bd44a1a66f5bcf9b457a1c480ac8d5ef50b2.1597770847.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:11 +11:00
Christophe Leroy d0e3fc69d0 powerpc/vdso: Provide __kernel_clock_gettime64() on vdso32
Provides __kernel_clock_gettime64() on vdso32. This is the
64 bits version of __kernel_clock_gettime() which is
y2038 compliant.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126131006.2431205-9-mpe@ellerman.id.au
2020-12-04 01:01:11 +11:00
Christophe Leroy ab037dd87a powerpc/vdso: Switch VDSO to generic C implementation.
With the C VDSO, the performance is slightly lower, but it is worth
it as it will ease maintenance and evolution, and also brings clocks
that are not supported with the ASM VDSO.

On an 8xx at 132 MHz, vdsotest with the ASM VDSO:
  gettimeofday:    		  vdso:  828 nsec/call
  clock-getres-realtime-coarse:   vdso:  391 nsec/call
  clock-gettime-realtime-coarse:  vdso:  614 nsec/call
  clock-getres-realtime:    	  vdso:  460 nsec/call
  clock-gettime-realtime:    	  vdso:  876 nsec/call
  clock-getres-monotonic-coarse:  vdso:  399 nsec/call
  clock-gettime-monotonic-coarse: vdso:  691 nsec/call
  clock-getres-monotonic:    	  vdso:  460 nsec/call
  clock-gettime-monotonic:    	  vdso: 1026 nsec/call

On an 8xx at 132 MHz, vdsotest with the C VDSO:
  gettimeofday:    		  vdso:  955 nsec/call
  clock-getres-realtime-coarse:   vdso:  545 nsec/call
  clock-gettime-realtime-coarse:  vdso:  592 nsec/call
  clock-getres-realtime:          vdso:  545 nsec/call
  clock-gettime-realtime:    	  vdso:  941 nsec/call
  clock-getres-monotonic-coarse:  vdso:  545 nsec/call
  clock-gettime-monotonic-coarse: vdso:  591 nsec/call
  clock-getres-monotonic:         vdso:  545 nsec/call
  clock-gettime-monotonic:        vdso:  940 nsec/call

It is even better for gettime with monotonic clocks.

Unsupported clocks with ASM VDSO:
  clock-gettime-boottime:         vdso: 3851 nsec/call
  clock-gettime-tai:      	  vdso: 3852 nsec/call
  clock-gettime-monotonic-raw:    vdso: 3396 nsec/call

Same clocks with C VDSO:
  clock-gettime-tai:              vdso:  941 nsec/call
  clock-gettime-monotonic-raw:    vdso: 1001 nsec/call
  clock-gettime-monotonic-coarse: vdso:  591 nsec/call

On an 8321E at 333 MHz, vdsotest with the ASM VDSO:
  gettimeofday:     		  vdso: 220 nsec/call
  clock-getres-realtime-coarse:   vdso: 102 nsec/call
  clock-gettime-realtime-coarse:  vdso: 178 nsec/call
  clock-getres-realtime:          vdso: 129 nsec/call
  clock-gettime-realtime:    	  vdso: 235 nsec/call
  clock-getres-monotonic-coarse:  vdso: 105 nsec/call
  clock-gettime-monotonic-coarse: vdso: 208 nsec/call
  clock-getres-monotonic:         vdso: 129 nsec/call
  clock-gettime-monotonic:        vdso: 274 nsec/call

On an 8321E at 333 MHz, vdsotest with the C VDSO:
  gettimeofday:    		  vdso: 272 nsec/call
  clock-getres-realtime-coarse:   vdso: 160 nsec/call
  clock-gettime-realtime-coarse:  vdso: 184 nsec/call
  clock-getres-realtime:          vdso: 166 nsec/call
  clock-gettime-realtime:         vdso: 281 nsec/call
  clock-getres-monotonic-coarse:  vdso: 160 nsec/call
  clock-gettime-monotonic-coarse: vdso: 184 nsec/call
  clock-getres-monotonic:         vdso: 169 nsec/call
  clock-gettime-monotonic:        vdso: 275 nsec/call

On a Power9 Nimbus DD2.2 at 3.8GHz, with the ASM VDSO:
  clock-gettime-monotonic:    	  vdso:  35 nsec/call
  clock-getres-monotonic:    	  vdso:  16 nsec/call
  clock-gettime-monotonic-coarse: vdso:  18 nsec/call
  clock-getres-monotonic-coarse:  vdso: 522 nsec/call
  clock-gettime-monotonic-raw:    vdso: 598 nsec/call
  clock-getres-monotonic-raw:     vdso: 520 nsec/call
  clock-gettime-realtime:    	  vdso:  34 nsec/call
  clock-getres-realtime:    	  vdso:  16 nsec/call
  clock-gettime-realtime-coarse:  vdso:  18 nsec/call
  clock-getres-realtime-coarse:   vdso: 517 nsec/call
  getcpu:    			  vdso:   8 nsec/call
  gettimeofday:    		  vdso:  25 nsec/call

And with the C VDSO:
  clock-gettime-monotonic:    	  vdso:  37 nsec/call
  clock-getres-monotonic:    	  vdso:  20 nsec/call
  clock-gettime-monotonic-coarse: vdso:  21 nsec/call
  clock-getres-monotonic-coarse:  vdso:  19 nsec/call
  clock-gettime-monotonic-raw:    vdso:  38 nsec/call
  clock-getres-monotonic-raw:     vdso:  20 nsec/call
  clock-gettime-realtime:    	  vdso:  37 nsec/call
  clock-getres-realtime:    	  vdso:  20 nsec/call
  clock-gettime-realtime-coarse:  vdso:  20 nsec/call
  clock-getres-realtime-coarse:   vdso:  19 nsec/call
  getcpu:    			  vdso:   8 nsec/call
  gettimeofday:    		  vdso:  28 nsec/call

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126131006.2431205-8-mpe@ellerman.id.au
2020-12-04 01:01:10 +11:00
Christophe Leroy 7fec9f5d41 powerpc/vdso: Save and restore TOC pointer on PPC64
On PPC64, the TOC pointer needs to be saved and restored.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126131006.2431205-7-mpe@ellerman.id.au
2020-12-04 01:01:10 +11:00
Christophe Leroy ce7d8056e3 powerpc/vdso: Prepare for switching VDSO to generic C implementation.
Prepare for switching VDSO to generic C implementation in following
patch. Here, we:
- Prepare the helpers to call the C VDSO functions
- Prepare the required callbacks for the C VDSO functions
- Prepare the clocksource.h files to define VDSO_ARCH_CLOCKMODES
- Add the C trampolines to the generic C VDSO functions

powerpc is a bit special for VDSO as well as system calls in the
way that it requires setting CR SO bit which cannot be done in C.
Therefore, entry/exit needs to be performed in ASM.

Implementing __arch_get_vdso_data() would clobber the link register,
requiring the caller to save it. As the ASM calling function already
has to set a stack frame and saves the link register before calling
the C vdso function, retriving the vdso data pointer there is lighter.

Implement __arch_vdso_capable() and always return true.

Provide vdso_shift_ns(), as the generic x >> s gives the following
bad result:

  18:	35 25 ff e0 	addic.  r9,r5,-32
  1c:	41 80 00 10 	blt     2c <shift+0x14>
  20:	7c 64 4c 30 	srw     r4,r3,r9
  24:	38 60 00 00 	li      r3,0
  ...
  2c:	54 69 08 3c 	rlwinm  r9,r3,1,0,30
  30:	21 45 00 1f 	subfic  r10,r5,31
  34:	7c 84 2c 30 	srw     r4,r4,r5
  38:	7d 29 50 30 	slw     r9,r9,r10
  3c:	7c 63 2c 30 	srw     r3,r3,r5
  40:	7d 24 23 78 	or      r4,r9,r4

In our case the shift is always <= 32. In addition,  the upper 32 bits
of the result are likely nul. Lets GCC know it, it also optimises the
following calculations.

With the patch, we get:
   0:	21 25 00 20 	subfic  r9,r5,32
   4:	7c 69 48 30 	slw     r9,r3,r9
   8:	7c 84 2c 30 	srw     r4,r4,r5
   c:	7d 24 23 78 	or      r4,r9,r4
  10:	7c 63 2c 30 	srw     r3,r3,r5

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126131006.2431205-6-mpe@ellerman.id.au
2020-12-04 01:01:10 +11:00