Bug 1568450 - explicitly specify a cpu for LTO linking on Windows; r=dmajor

By default, the linker chooses a "generic" 32-bit CPU to optimize for,
and LLVM's "generic" 32-bit CPU model doesn't include some features that
are helpful for performance on microbenchmarks.  We explicitly specify a
CPU model to ensure the model we want is selected.

On x86-64, we explicitly force a generically good processor model, even
though the automatically selected one didn't seem to hurt benchmarks.

Differential Revision: https://phabricator.services.mozilla.com/D40479

--HG--
extra : moz-landing-system : lando
Nathan Froyd 2019-08-02 20:43:52 +00:00
Parent e592c76817
Commit 909e9c6f30
1 changed file: 25 additions and 0 deletions

@@ -207,6 +207,31 @@ def lto(value, pgo, profile_generate, c_compiler, ld64_known_good, target):
    # With clang-cl, -flto can only be used with -c or -fuse-ld=lld.
    # AC_TRY_LINKs during configure don't have -c, so pass -fuse-ld=lld.
    cflags.append("-fuse-ld=lld");

    # Explicitly set the CPU to optimize for so the linker doesn't
    # choose a poor default. Rust compilation by default uses the
    # pentium4 CPU on x86:
    #
    # https://github.com/rust-lang/rust/blob/master/src/librustc_target/spec/i686_pc_windows_msvc.rs#L5
    #
    # which specifically supports "long" (multi-byte) nops. See
    # https://bugzilla.mozilla.org/show_bug.cgi?id=1568450#c8 for details.
    #
    # The pentium4 seems like kind of a weird CPU to optimize for, but
    # it seems to have worked out OK thus far. LLVM does not seem to
    # specifically schedule code for the pentium4's deep pipeline, so
    # that probably contributes to it being an OK default for our
    # purposes.
    if target.cpu == 'x86':
        ldflags.append('-mllvm:-mcpu=pentium4')
    # This is also the CPU that Rust uses. The LLVM source code
    # recommends this as the "generic 64-bit specific x86 processor model":
    #
    # https://github.com/llvm/llvm-project/blob/e7694f34ab6a12b8bb480cbfcb396d0a64fe965f/llvm/lib/Target/X86/X86.td#L1165-L1187
    if target.cpu == 'x86_64':
        ldflags.append('-mllvm:-mcpu=x86-64')
    # We do not need special flags for arm64. Hooray for fixed-length
    # instruction sets.
else:
    num_cores = multiprocessing.cpu_count()
    cflags.append("-flto")
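
The CPU selection in the hunk above can be read as a small lookup from target CPU to linker flag. The following is only a sketch of that idea, not the code that landed: the helper name lto_cpu_ldflags and the mapping table are hypothetical, and only the two flag strings and the 'x86' / 'x86_64' values are taken from the diff.

```python
# Hypothetical helper illustrating the per-CPU flag choice from the diff.
# Only the flag strings and the 'x86' / 'x86_64' keys come from the commit;
# the function name and table are assumptions made for this sketch.
LTO_CPU_MODELS = {
    "x86": "pentium4",   # matches Rust's default on 32-bit Windows
    "x86_64": "x86-64",  # LLVM's generic 64-bit x86 model
}


def lto_cpu_ldflags(target_cpu):
    """Return the extra linker flags that pin the LTO CPU model.

    Targets without an entry (for example arm64) get no extra flags,
    mirroring the comment in the diff that none are needed there.
    """
    model = LTO_CPU_MODELS.get(target_cpu)
    if model is None:
        return []
    return ["-mllvm:-mcpu=" + model]


if __name__ == "__main__":
    for cpu in ("x86", "x86_64", "aarch64"):
        print(cpu, lto_cpu_ldflags(cpu))
```

A table keeps the x86 and x86-64 cases mutually exclusive by construction, whereas the diff uses two separate if statements; the result is the same because target.cpu holds a single value.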