If an interrupt is marked with the no balancing flag, we still allow
setting the affinity for such an interrupt from the kernel itself, but
for interrupts which move the affinity from interrupt context via
irq_move_mask_irq() this runs into a check for the no balancing flag,
which in turn ends up with an endless storm of stack dumps because the
move pending flag is not reset.
Allow the move for interrupts which have the no balancing flag set and
clear the move pending bit before checking for interrupts with the per
cpu flag set.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1506201002570.4107@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq == 0 is not a valid irq for an irqdomain MSI allocation, but the hpet
code checks only for negative return values.
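A minimal sketch of the difference between the two checks. The helper
names are hypothetical and this is not the hpet code itself; it only
illustrates that a "negative means error" test accepts irq == 0:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: treats the value returned by an irqdomain-style
 * allocator as usable only when it is strictly positive. Negative values
 * are errors, and 0 is not a valid irq either.
 */
bool irq_alloc_succeeded(int irq)
{
	return irq > 0;
}

/* The older style of check, which lets irq == 0 slip through. */
bool irq_alloc_succeeded_old(int irq)
{
	return irq >= 0;
}
```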
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/558447AF.30703@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cisco has developed a new PCI HBA interface called sNIC, which stands for
SCSI NIC. This is a new storage feature supported on specialized network
adapters. The new PCI function provides a uniform host interface and
abstracts backend storage.
[jejb: fix up checkpatch errors]
Signed-off-by: Narsimhulu Musini <nmusini@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
User visible:
- Replace CTRL+z with 'f' as hotkey for enable/disable events (Arnaldo Carvalho de Melo)
- Do not exit when 'f' is pressed in 'report' mode (Arnaldo Carvalho de Melo)
- Tell the user how to unfreeze events after pressing 'f' in 'perf top' (Arnaldo Carvalho de Melo)
- React to unassigned hotkey pressing in 'top/report' (Arnaldo Carvalho de Melo)
- Display total number of samples with --show-total-period in 'annotate' (Martin Liška)
- Add timeout to make procfs mmap processing more robust (Kan Liang)
- Fix sort__sym_cmp to also compare end of symbol (Yannick Brosseau)
Infrastructure:
- Ensure thread-stack is flushed (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Replace CTRL+z with 'f' as hotkey for enable/disable events (Arnaldo Carvalho de Melo)
- Do not exit when 'f' is pressed in 'report' mode (Arnaldo Carvalho de Melo)
- Tell the user how to unfreeze events after pressing 'f' in 'perf top' (Arnaldo Carvalho de Melo)
- React to unassigned hotkey pressing in 'top/report' (Arnaldo Carvalho de Melo)
- Display total number of samples with --show-total-period in 'annotate' (Martin Liška)
- Add timeout to make procfs mmap processing more robust (Kan Liang)
- Fix sort__sym_cmp to also compare end of symbol (Yannick Brosseau)
Infrastructure changes:
- Ensure thread-stack is flushed (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
User visible:
- Fix build breakage if prefix= is specified, this introduced
a regression for a build idiom used by the Debian "linux-tools"
package, that does:
MAKE_PERF := $(MAKE) prefix=/usr V=1 ARCH=$(KERNEL_ARCH_PERF) ...
Fix it. (Lukas Wunner)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fix from Arnaldo Carvalho de Melo:
- Fix build breakage if prefix= is specified, this introduced
a regression for a build idiom used by the Debian "linux-tools"
package, that does:
MAKE_PERF := $(MAKE) prefix=/usr V=1 ARCH=$(KERNEL_ARCH_PERF) ...
Fix it. (Lukas Wunner)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The timeout that limits the individual proc map processing was hard coded
to 500ms. This patch introduces a new option, --proc-map-timeout, to make
the time limit configurable.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
System wide sampling like 'perf top' or 'perf record -a' reads the
/proc/xxx/maps of all threads before sampling. If any thread generates
huge maps that keep growing, perf will loop forever during synthesizing
and nothing will be sampled.
This patch fixes the issue by adding a per-thread timeout that forcibly
stops this kind of endless proc map processing.
PERF_RECORD_MISC_PROC_MAP_PARSE_TIME_OUT is introduced to indicate that
the mmap records are truncated by the timeout. The user will get a warning
notification when truncated mmap records are detected.
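The idea of a per-thread deadline can be sketched as below. This is not
perf's implementation: the names, the result struct and the injectable
clock callback are all assumptions made so the example does not depend
on real time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock callback returning a monotonic time in nanoseconds. */
typedef uint64_t (*now_ns_fn)(void *ctx);

struct parse_result {
	size_t lines_done;	/* maps lines processed before stopping */
	bool timed_out;		/* set when the records were truncated */
};

/* Process up to nr_lines entries, giving up once timeout_ns has elapsed. */
struct parse_result parse_with_timeout(size_t nr_lines, uint64_t timeout_ns,
				       now_ns_fn now, void *ctx)
{
	struct parse_result res = { 0, false };
	uint64_t start = now(ctx);

	for (size_t i = 0; i < nr_lines; i++) {
		if (now(ctx) - start > timeout_ns) {
			res.timed_out = true;
			break;
		}
		/* A real parser would turn one maps line into an mmap record here. */
		res.lines_done++;
	}
	return res;
}

/* Fake clock for demonstration: advances by 100ns on every call. */
uint64_t fake_clock(void *ctx)
{
	uint64_t *t = ctx;

	*t += 100;
	return *t;
}
```

A caller that sees timed_out set would flag the resulting records as
truncated and warn the user, which is the role the new misc flag plays
in the description above.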
Reported-by: Ying Huang <ying.huang@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ying Huang <ying.huang@intel.com>
Link: http://lkml.kernel.org/r/1434549071-25611-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using a map file from a JIT, due to memory reuse, we can obtain
multiple symbols with the same start address but a different length.
symbols__find does check the end, so not doing the same in
sort__sym_cmp caused the hist_entry in the annotate part of a
report to match the wrong entry, causing a fatal error.
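A sketch of a comparison that also takes the end address into account.
The struct and function names are hypothetical stand-ins, not perf's own
types, and the sign convention is only an assumption:

```c
#include <stdint.h>

/* Simplified stand-in for a symbol; only the address range is modelled. */
struct sym_range {
	uint64_t start;
	uint64_t end;
};

/* Returns <0, 0 or >0, ordering first by start and then by end. */
int sym_range_cmp(const struct sym_range *l, const struct sym_range *r)
{
	if (l->start != r->start)
		return l->start < r->start ? -1 : 1;
	if (l->end != r->end)
		return l->end < r->end ? -1 : 1;
	return 0;
}
```

With this ordering, two JIT symbols that reuse the same start address
but have different lengths no longer compare as equal.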
Signed-off-by: Yannick Brosseau <scientist@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/1434584470-17771-1-git-send-email-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When that happens we were just ignoring the key press. Now this
message is presented in the bottom line (the help line):
"Press '?' for help on key bindings"
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-iyma2j5kj3q9i1stl4mfh90n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the user presses 'f' to disable events the visual cues are, well,
the percentages not changing and the number of events freezing.
Be more explicit by changing the help line at the bottom of the screen
to show the following messages when 'f' is pressed:
"Press 'f' again to re-enable the events"
And then, when 'f' is pressed again:
"Press 'f' to disable the events or 'h'
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-uhiswg9a9rxm5gxg7ptjskjn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hists_browser was replacing whatever helpline was provided by 'top' or
'report' with a static "Press '?' for help on key bindings"; fix it.
Now the message passed by top appears at the bottom of the screen:
"For a higher level overview, try: perf top --sort comm,dso"
The same applies to the message that will be added when the user presses
'f' to disable the events, something along the lines of "press f again to
re-enable...".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-dacaja70mbfz3a0yj1n180gx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'f' hotkey is only used in 'top' (dynamic mode) to enable/disable
events. It currently makes no sense in 'report' (static mode), where we
can't go from showing the histogram entries created from a perf.data
file to adding more events after recreating the evlist created from the
perf.data file. Albeit possible, this is not implemented right now.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-lholzf472pu98dkkijggwx2m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. 'freeze'/'unfreeze'. This is because CTRL+z has a well known
action, i.e. suspending the app, and perf needs to follow that
convention. That will be done in a separate patch, though.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-oedcl6ovohara4koig14ayip@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Invoking Makefile.perf with prefix= breaks the build since Makefile.perf
hands that variable down to Makefile.build where it overrides
prefix := $(subst ./,,$(OUTPUT)$(dir)/)
leading to errors like this:
No rule to make target '/usrabspath.o', needed by '/usrlibperf-in.o'
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Fixes: c819e2cf2e
Link: http://lkml.kernel.org/r/5582c48a.84a22b0a.a918.5285SMTPIN_ADDED_MISSING@mx.google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To better reflect the purpose of this struct, that is to hold
info about samples: its total number and its percentage.
Cc: Martin Liska <mliska@suse.cz>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-6bf8gwcl975uurl0ttpvtk69@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To compare two records on a per-instruction basis, with the
--show-total-period option provided, display the total number of samples
that belong to a line in assembly language.
A new hot key 't' is introduced for the 'perf annotate' TUI.
Signed-off-by: Martin Liska <mliska@suse.cz>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5583E26D.1040407@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The thread-stack represents a thread's current stack. When a thread
exits there can still be many functions on the stack e.g. exit() can be
called many levels deep, so all the callers will never return. To get
that information output, the thread-stack must be flushed.
Previously it was assumed the thread-stack would be flushed when the
struct thread was deleted. With thread ref-counting it is no longer
clear when that will be, if ever. So instead explicitly flush all the
thread-stacks at the end of a session.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1432906425-9911-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Very late clk regression fixes for the ARM-based AT91 platform. These went
unnoticed by me until recently, hence the late pull request.
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Michael Turquette:
"Very late clk regression fixes for the ARM-based AT91 platform.
These went unnoticed by me until recently, hence the late pull
request"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: at91: fix h32mx prototype inclusion in pmc header
clk: at91: trivial: typo in peripheral clock description
clk: at91: fix PERIPHERAL_MAX_SHIFT definition
clk: at91: pll: fix input range validity check
Nothing looks scary, just a few usual HD-audio regression fixes
and fixup, in addition to a minor Kconfig dependency fix for
the old MIPS drivers.
Merge tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Nothing looks scary, just a few usual HD-audio regression fixes and
fixup, in addition to a minor Kconfig dependency fix for the old MIPS
drivers"
* tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix unused label skip_i915
ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
ALSA: mips: let SND_SGI_O2 select SND_PCM
ALSA: hda - Fix audio crackles on Dell Latitude E7x40
ALSA: hda - adding a DAC/pin preference map for a HP Envy TS machine
n /= range->step_uV + 1; is equivalent to n /= (range->step_uV + 1);
which is wrong. Fix it.
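The precedence problem can be reproduced in isolation. The struct below
is hypothetical (only step_uV follows the expression quoted above), and
the "fixed" variant assumes the intent was to divide by the step first
and then add one:

```c
/* Hypothetical range description for illustration only. */
struct volt_range {
	int min_uV;
	int max_uV;
	int step_uV;
};

/* Buggy: "n /= step + 1" divides by (step + 1). */
int nr_steps_buggy(const struct volt_range *r)
{
	int n = r->max_uV - r->min_uV;

	n /= r->step_uV + 1;
	return n;
}

/* Assumed intent: divide by the step, then add one. */
int nr_steps_fixed(const struct volt_range *r)
{
	int n = r->max_uV - r->min_uV;

	n = n / r->step_uV + 1;
	return n;
}
```

For a span of 100 with a step of 10, the buggy form computes 100 / 11
while the assumed intent is 100 / 10 + 1.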
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
printk_ratelimit() shares its ratelimiting state with other callers, which
may lead to scenarios where, at the time we want to print debug
information, we are already limited, so nothing appears in dmesg. This
makes exception-trace quite a poor helper in debugging.
Additionally, there is an imbalance: some messages are limited by the
global ratelimit state and other messages by their private state
defined via pr_*_ratelimited().
To address this inconsistency, the show_unhandled_signals_ratelimited()
macro is introduced and call sites are converted to use it.
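The effect of shared versus private ratelimit state can be sketched with
a deliberately simplified, count-only limiter. The names are hypothetical
and the kernel's ratelimiting additionally uses a time interval, which is
left out here:

```c
#include <stdbool.h>

/* Simplified ratelimit state: allows at most "burst" messages. */
struct rl_state {
	int burst;
	int printed;
};

/* Returns true if the caller may print, consuming one slot of the state. */
bool rl_allow(struct rl_state *rs)
{
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}
```

When unrelated callers consume a shared state, a later caller is
suppressed even though it has not printed anything itself; a caller with
its own state is unaffected.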
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Report unhandled SP/PC alignment faults if the show_unhandled_signals
variable is set (via /proc/sys/debug/exception-trace).
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Make sure that we are skipping over large PTEs while walking
the page-table tree.
Cc: stable@kernel.org
Fixes: 5c34c403b7 ("iommu/amd: Fix memory leak in free_pagetable")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This will be used for private functions used by AMD- and Intel-specific
PMU implementations.
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Based on Intel's SDM, mapping a huge page whose 4k pages do not all have
a consistent memory cache type will cause undefined behavior.
To avoid this kind of undefined behavior, we force the use of
4k pages in this case.
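The decision can be sketched as follows. The per-page type array and the
function names are assumptions made for illustration; this is not how
KVM stores or looks up memory types:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if every 4k page in the candidate huge page has the same type. */
bool mem_type_consistent(const uint8_t *types, size_t nr_pages)
{
	for (size_t i = 1; i < nr_pages; i++)
		if (types[i] != types[0])
			return false;
	return true;
}

/* Use the huge page size only when the memory type is consistent. */
size_t pick_page_size(const uint8_t *types, size_t nr_pages, size_t huge_size)
{
	return mem_type_consistent(types, nr_pages) ? huge_size : 4096;
}
```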
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
mtrr_for_each_mem_type() is ready now; use it to simplify
kvm_mtrr_get_guest_memory_type().
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It walks all MTRRs and gets all the memory cache type settings for the
specified range. It also checks whether the range is fully covered by MTRRs.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Adjust for range_size->range_shift change. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Two functions are introduced:
- fixed_mtrr_addr_to_seg() translates the address to the fixed
MTRR segment
- fixed_mtrr_addr_seg_to_range_index() translates the address to
the index of kvm_mtrr.fixed_ranges[]
They will be used in a later patch.
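A sketch of the two translations, based on the architectural fixed MTRR
layout (8 ranges of 64K from 0x00000, 16 ranges of 16K from 0x80000, 64
ranges of 4K from 0xC0000). The table and the sketch_* names are
hypothetical and are not the functions added by this patch:

```c
#include <stdint.h>

struct fixed_seg {
	uint64_t start;
	uint64_t end;		/* exclusive */
	int range_shift;	/* log2 of the bytes covered by one range */
	int range_start;	/* index of the segment's first range */
};

static const struct fixed_seg fixed_segs[] = {
	{ 0x00000, 0x80000,  16, 0 },	/* 8 ranges of 64K */
	{ 0x80000, 0xc0000,  14, 8 },	/* 16 ranges of 16K */
	{ 0xc0000, 0x100000, 12, 24 },	/* 64 ranges of 4K */
};

/* Returns the segment covering addr, or -1 if addr is above 1MB. */
int sketch_addr_to_seg(uint64_t addr)
{
	int nr = sizeof(fixed_segs) / sizeof(fixed_segs[0]);

	for (int seg = 0; seg < nr; seg++)
		if (addr >= fixed_segs[seg].start && addr < fixed_segs[seg].end)
			return seg;
	return -1;
}

/* Returns the flat range index (0..87) of addr within segment seg. */
int sketch_addr_seg_to_range_index(uint64_t addr, int seg)
{
	const struct fixed_seg *s = &fixed_segs[seg];

	return s->range_start + (int)((addr - s->start) >> s->range_shift);
}
```

Storing a shift instead of a size keeps the index computation to a
subtraction and a shift, with no 64-bit division.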
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Adjust for range_size->range_shift change. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sort all valid variable MTRRs by base address; this helps us check
whether a range is fully contained in variable MTRRs.
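Why sorting helps can be sketched with plain address ranges. The types
and names below are hypothetical and use an array rather than the list
the patch maintains; once the ranges are ordered by start, a single pass
is enough to detect a gap:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct addr_range {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

/* Insertion sort by start address. */
void sort_ranges(struct addr_range *r, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		struct addr_range key = r[i];
		size_t j = i;

		while (j > 0 && r[j - 1].start > key.start) {
			r[j] = r[j - 1];
			j--;
		}
		r[j] = key;
	}
}

/* True if [start, end) is fully covered by the sorted ranges. */
bool range_fully_covered(const struct addr_range *r, size_t n,
			 uint64_t start, uint64_t end)
{
	uint64_t pos = start;

	for (size_t i = 0; i < n && pos < end; i++) {
		if (r[i].start > pos)
			break;	/* gap: no later range can cover pos */
		if (r[i].end > pos)
			pos = r[i].end;
	}
	return pos >= end;
}
```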
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Fix list insertion sort, simplify var_mtrr_range_is_valid to just
test the V bit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It gets the range for the specified variable MTRR
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Simplify boolean operations. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This table summarizes the information of the fixed MTRRs, and some APIs
are introduced to abstract its operation. This helps us clean up the code
and will be used in later patches.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Change range_size to range_shift, in order to avoid udivdi3 errors.
- Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- kvm_mtrr_get_guest_memory_type() only checks one page in MTRRs, so
it is unnecessary to check whether the range is partially
covered by an MTRR
- optimize the check of overlapping memory types and add some comments
to explain the precedence
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Variable MTRR MSRs are 64 bits wide and are accessed directly at full
length; there is no reason to split them into two 32-bit halves.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop kvm_mtrr->enable, omit the decode/encode workload and get rid of
all the hard-coded values.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only KVM_NR_VAR_MTRR variable MTRRs are available in KVM guest
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vMTRR does not depend on any host MTRR feature and fixed MTRRs have always
been implemented, so drop this field
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MSR_MTRRcap is an MTRR MSR, so move the handler to the common place. Also
add some comments to make the hard-coded values more readable.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MTRR code is located in x86.c and mmu.c; move it to a separate file to
make the organization clearer. It will be the place where we fully
implement vMTRR.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, CR0.CD is not checked when we virtualize memory cache type for
noncoherent_dma guests. This patch fixes it by:
- setting UC for all memory if CR0.CD = 1
- zapping all the last sptes in MMU if CR0.CD is changed
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If hardware doesn't support DecodeAssist (a feature that provides
more information about the intercept in the VMCB), KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The allocation size of the kvm_irq_routing_table depends on
the number of irq routing entries because they are all
allocated with one kzalloc call.
When the irq routing table gets bigger this requires high
order allocations which fail from time to time:
qemu-kvm: page allocation failure: order:4, mode:0xd0
This patch fixes this issue by breaking up the allocation of
the table and its entries into individual kzalloc calls.
These could all be satisfied with order-0 allocations, which
are less likely to fail.
The downside of this change is the lower performance, because
of more calls to kzalloc. But given how often kvm_set_irq_routing
is called in the lifetime of a guest, it doesn't really
matter much.
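The allocation pattern can be sketched in user space with calloc in
place of kzalloc. The struct layout and names are hypothetical, not the
kvm_irq_routing_table itself; in this sketch the pointer array still
grows with the number of entries, but only by one pointer per entry:

```c
#include <stdlib.h>

struct route_entry {
	unsigned int gsi;
	unsigned int type;
};

struct route_table {
	size_t nr;
	struct route_entry **entries;	/* one small allocation per entry */
};

void route_table_free(struct route_table *t)
{
	if (!t)
		return;
	if (t->entries)
		for (size_t i = 0; i < t->nr; i++)
			free(t->entries[i]);	/* free(NULL) is a no-op */
	free(t->entries);
	free(t);
}

struct route_table *route_table_alloc(size_t nr)
{
	struct route_table *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->entries = calloc(nr, sizeof(*t->entries));
	if (!t->entries && nr) {
		free(t);
		return NULL;
	}
	t->nr = nr;
	for (size_t i = 0; i < nr; i++) {
		t->entries[i] = calloc(1, sizeof(*t->entries[i]));
		if (!t->entries[i]) {
			route_table_free(t);	/* unwinds what was allocated */
			return NULL;
		}
	}
	return t;
}
```

Each entry is a small, independent request, so a large table no longer
needs one large contiguous block, at the cost of more allocator calls.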
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[Avoid sparse warning through rcu_access_pointer. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>