Граф коммитов

1450 Коммитов

Автор SHA1 Сообщение Дата
Julien Grall 5d9404e118 xen: Export xen_reboot
The helper xen_reboot will be called by the EFI code in a later patch.

Note that the ARM version does not yet exist and will be added in a
later patch too.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:50:06 +02:00
Boris Ostrovsky f31b969217 xen/x86: Call xen_smp_intr_init_pv() on BSP
Recent code rework that split handling ov PV, HVM and PVH guests into
separate files missed calling xen_smp_intr_init_pv() on CPU0.

Add this call.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:18:13 +02:00
Boris Ostrovsky 84d582d236 xen: Revert commits da72ff5bfc and 72a9b18629
Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741)
established that commit 72a9b18629 ("xen: Remove event channel
notification through Xen PCI platform device") (and thus commit
da72ff5bfc ("partially revert "xen: Remove event channel
notification through Xen PCI platform device"")) are unnecessary and,
in fact, prevent HVM guests from booting on Xen releases prior to 4.0

Therefore we revert both of those commits.

The summary of that discussion is below:

  Here is the brief summary of the current situation:

  Before the offending commit (72a9b18629):

  1) INTx does not work because of the reset_watches path.
  2) The reset_watches path is only taken if you have Xen > 4.0
  3) The Linux Kernel by default will use vector inject if the hypervisor
     support. So even INTx does not work no body running the kernel with
     Xen > 4.0 would notice. Unless he explicitly disabled this feature
     either in the kernel or in Xen (and this can only be disabled by
     modifying the code, not user-supported way to do it).

  After the offending commit (+ partial revert):

  1) INTx is no longer support for HVM (only for PV guests).
  2) Any HVM guest The kernel will not boot on Xen < 4.0 which does
     not have vector injection support. Since the only other mode
     supported is INTx which.

  So based on this summary, I think before commit (72a9b18629) we were
  in much better position from a user point of view.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:18:05 +02:00
Boris Ostrovsky 5f6a1614fa xen/pvh: Do not fill kernel's e820 map in init_pvh_bootparams()
e820 map is updated with information from the zeropage (i.e. pvh_bootparams)
by default_machine_specific_memory_setup(). With the way things are done
now,  we end up with a duplicated e820 map.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:17:39 +02:00
Juergen Gross 6807cf65f5 x86/xen: use capabilities instead of fake cpuid values for xsave
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the xsave feature availability is
indicated by special casing the related cpuid leaf.

Instead of delivering fake cpuid values set or clear the cpu
capability bits for xsave instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:14:17 +02:00
Juergen Gross e657fccb79 x86/xen: use capabilities instead of fake cpuid values for x2apic
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the x2apic feature is indicated as not
being present by special casing the related cpuid leaf.

Instead of delivering fake cpuid values clear the cpu capability bit
for x2apic instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:14:11 +02:00
Juergen Gross ea01598b4b x86/xen: use capabilities instead of fake cpuid values for mwait
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the mwait feature is indicated to be
present or not by special casing the related cpuid leaf.

Instead of delivering fake cpuid values use the cpu capability bit
for mwait instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:14:05 +02:00
Juergen Gross b778d6bf63 x86/xen: use capabilities instead of fake cpuid values for acpi
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the acpi feature is indicated as not
being present by special casing the related cpuid leaf in case we
are not the initial domain.

Instead of delivering fake cpuid values clear the cpu capability bit
for acpi instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:14:00 +02:00
Juergen Gross aa10715629 x86/xen: use capabilities instead of fake cpuid values for acc
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the acc feature (thermal monitoring)
is indicated as not being present by special casing the related
cpuid leaf.

Instead of delivering fake cpuid values clear the cpu capability bit
for acc instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:13:53 +02:00
Juergen Gross 88f3256f21 x86/xen: use capabilities instead of fake cpuid values for mtrr
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the mtrr feature is indicated as not
being present by special casing the related cpuid leaf.

Instead of delivering fake cpuid values clear the cpu capability bit
for mtrr instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:13:48 +02:00
Juergen Gross fd9145fd27 x86/xen: use capabilities instead of fake cpuid values for aperf
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the aperf/mperf feature is indicated
as not being present by special casing the related cpuid leaf.

Instead of delivering fake cpuid values clear the cpu capability bit
for aperf/mperf instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:13:42 +02:00
Juergen Gross 3ee99df333 x86/xen: don't indicate DCA support in pv domains
Xen doesn't support DCA (direct cache access) for pv domains. Clear
the corresponding capability indicator.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:13:36 +02:00
Juergen Gross 0808e80cb7 xen: set cpu capabilities from xen_start_kernel()
There is no need to set the same capabilities for each cpu
individually. This can easily be done for all cpus when starting the
kernel.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:13:29 +02:00
Juergen Gross 29985b0961 xen,kdump: handle pv domain in paddr_vmcoreinfo_note()
For kdump to work correctly it needs the physical address of
vmcoreinfo_note. When running as dom0 this means the virtual address
has to be translated to the related machine address.

paddr_vmcoreinfo_note() is meant to do the translation via
__pa_symbol() only, but being attributed "weak" it can be replaced
easily in Xen case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:13:17 +02:00
Juergen Gross ab1570a427 x86/xen: remove unused static function from smp_pv.c
xen_call_function_interrupt() isn't used in smp_pv.c. Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:10:34 +02:00
Vitaly Kuznetsov 8cb6de3900 x86/xen: rename some PV-only functions in smp_pv.c
After code split between PV and HVM some functions in xen_smp_ops have
xen_pv_ prefix and some only xen_ which makes them look like they're
common for both PV and HVM while they're not. Rename all the rest to
have xen_pv_ prefix.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:10:25 +02:00
Vitaly Kuznetsov 33af746985 x86/xen: enable PVHVM-only builds
Now everything is in place and we can move PV-only code under
CONFIG_XEN_PV. CONFIG_XEN_PV_SMP is created to support the change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:10:16 +02:00
Vitaly Kuznetsov c504b2f106 x86/xen: define startup_xen for XEN PV only
startup_xen references PV-only code, decorate it with #ifdef CONFIG_XEN_PV
to make PV-free builds possible.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:09:37 +02:00
Vitaly Kuznetsov 50a1062d61 x86/xen: put setup.c, pmu.c and apic.c under CONFIG_XEN_PV
xen_pmu_init/finish() functions are used in suspend.c and
enlighten.c, add stubs for now.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:09:28 +02:00
Vitaly Kuznetsov 9963236d7c x86/xen: split suspend.c for PV and PVHVM guests
Slit the code in suspend.c into suspend_pv.c and suspend_hvm.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:09:17 +02:00
Vitaly Kuznetsov 7e0563dea9 x86/xen: split off mmu_pv.c
Basically, mmu.c is renamed to mmu_pv.c and some code moved out to common
mmu.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:08:51 +02:00
Vitaly Kuznetsov feef87ebfd x86/xen: split off mmu_hvm.c
Move PVHVM related code to mmu_hvm.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:05:10 +02:00
Vitaly Kuznetsov 83b96794e0 x86/xen: split off smp_pv.c
Basically, smp.c is renamed to smp_pv.c and some code moved out to common
smp.c. struct xen_common_irq delcaration ended up in smp.h.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:05:00 +02:00
Vitaly Kuznetsov a52482d935 x86/xen: split off smp_hvm.c
Move PVHVM related code to smp_hvm.c. Drop 'static' qualifier from
xen_smp_send_reschedule(), xen_smp_send_call_function_ipi(),
xen_smp_send_call_function_single_ipi(), these functions will be moved to
common smp code when smp_pv.c is split.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:04:50 +02:00
Vitaly Kuznetsov aa1c84e8ca x86/xen: split xen_cpu_die()
Split xen_cpu_die() into xen_pv_cpu_die() and xen_hvm_cpu_die() to support
further splitting of smp.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:04:40 +02:00
Vitaly Kuznetsov a2d1078a35 x86/xen: split xen_smp_prepare_boot_cpu()
Split xen_smp_prepare_boot_cpu() into xen_pv_smp_prepare_boot_cpu() and
xen_hvm_smp_prepare_boot_cpu() to support further splitting of smp.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:04:31 +02:00
Vitaly Kuznetsov 04e95761fa x86/xen: split xen_smp_intr_init()/xen_smp_intr_free()
xen_smp_intr_init() and xen_smp_intr_free() have PV-specific code and as
a praparatory change to splitting smp.c we need to split these fucntions.
Create xen_smp_intr_init_pv()/xen_smp_intr_free_pv().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:04:18 +02:00
Vitaly Kuznetsov e1dab14cf6 x86/xen: split off enlighten_pv.c
Basically, enlighten.c is renamed to enlighten_pv.c and some code moved
out to common enlighten.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:04:05 +02:00
Vitaly Kuznetsov 98f2a47a00 x86/xen: split off enlighten_hvm.c
Move PVHVM related code to enlighten_hvm.c. Three functions:
xen_cpuhp_setup(), xen_reboot(), xen_emergency_restart() are shared, drop
static qualifier from them. These functions will go to common code once
it is split from enlighten.c.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:03:53 +02:00
Vitaly Kuznetsov 481d66325d x86/xen: split off enlighten_pvh.c
Create enlighten_pvh.c by splitting off PVH related code from enlighten.c,
put it under CONFIG_XEN_PVH.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 10:58:23 +02:00
Vitaly Kuznetsov 5e57f1d607 x86/xen: add CONFIG_XEN_PV to Kconfig
All code to support Xen PV will get under this new option. For the
beginning, check for it in the common code.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 10:50:19 +02:00
Vitaly Kuznetsov 52519f2af0 x86/xen: globalize have_vcpu_info_placement
have_vcpu_info_placement applies to both PV and HVM and as we're going
to split the code we need to make it global.

Rename to xen_have_vcpu_info_placement.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 10:50:07 +02:00
Vitaly Kuznetsov 0991d22d5e x86/xen: separate PV and HVM hypervisors
As a preparation to splitting the code we need to untangle it:

x86_hyper_xen -> x86_hyper_xen_hvm and x86_hyper_xen_pv
xen_platform() -> xen_platform_hvm() and xen_platform_pv()
xen_cpu_up_prepare() -> xen_cpu_up_prepare_pv() and xen_cpu_up_prepare_hvm()
xen_cpu_dead() -> xen_cpu_dead_pv() and xen_cpu_dead_pv_hvm()

Add two parameters to xen_cpuhp_setup() to pass proper cpu_up_prepare and
cpu_dead hooks. xen_set_cpu_features() is now PV-only so the redundant
xen_pv_domain() check can be dropped.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 10:49:44 +02:00
Linus Torvalds d3b5d35290 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
 "The main x86 MM changes in this cycle were:

   - continued native kernel PCID support preparation patches to the TLB
     flushing code (Andy Lutomirski)

   - various fixes related to 32-bit compat syscall returning address
     over 4Gb in applications, launched from 64-bit binaries - motivated
     by C/R frameworks such as Virtuozzo. (Dmitry Safonov)

   - continued Intel 5-level paging enablement: in particular the
     conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov)

   - x86/mpx ABI corner case fixes/enhancements (Joerg Roedel)

   - ... plus misc updates, fixes and cleanups"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
  mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash
  x86/mm: Fix flush_tlb_page() on Xen
  x86/mm: Make flush_tlb_mm_range() more predictable
  x86/mm: Remove flush_tlb() and flush_tlb_current_task()
  x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
  x86/mm/64: Fix crash in remove_pagetable()
  Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"
  x86/boot/e820: Remove a redundant self assignment
  x86/mm: Fix dump pagetables for 4 levels of page tables
  x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow
  x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
  Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()"
  x86/espfix: Add support for 5-level paging
  x86/kasan: Extend KASAN to support 5-level paging
  x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y
  x86/paravirt: Add 5-level support to the paravirt code
  x86/mm: Define virtual memory map for 5-level paging
  x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert
  x86/boot: Detect 5-level paging support
  x86/mm/numa: Remove numa_nodemask_from_meminfo()
  ...
2017-05-01 23:54:56 -07:00
Linus Torvalds a52bbaf4a3 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Ingo Molnar:
 "The biggest changes are an extension of the Intel RDT code to extend
  it with Intel Memory Bandwidth Allocation CPU support: MBA allows
  bandwidth allocation between cores, while CBM (already upstream)
  allows CPU cache partitioning.

  There's also misc smaller fixes and updates"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/intel_rdt: Return error for incorrect resource names in schemata
  x86/intel_rdt: Trim whitespace while parsing schemata input
  x86/intel_rdt: Fix padding when resource is enabled via mount
  x86/intel_rdt: Get rid of anon union
  x86/cpu: Keep model defines sorted by model number
  x86/intel_rdt/mba: Add schemata file support for MBA
  x86/intel_rdt: Make schemata file parsers resource specific
  x86/intel_rdt/mba: Add info directory files for Memory Bandwidth Allocation
  x86/intel_rdt: Make information files resource specific
  x86/intel_rdt/mba: Add primary support for Memory Bandwidth Allocation (MBA)
  x86/intel_rdt/mba: Memory bandwith allocation feature detect
  x86/intel_rdt: Add resource specific msr update function
  x86/intel_rdt: Move CBM specific data into a struct
  x86/intel_rdt: Cleanup namespace to support multiple resource types
  Documentation, x86: Intel Memory bandwidth allocation
  x86/intel_rdt: Organize code properly
  x86/intel_rdt: Init padding only if a device exists
  x86/intel_rdt: Add cpus_list rdtgroup file
  x86/intel_rdt: Cleanup kernel-doc
  x86/intel_rdt: Update schemata read to show data in tabular format
  ...
2017-05-01 21:15:50 -07:00
Linus Torvalds 16b76293c5 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "The biggest changes in this cycle were:

   - reworking of the e820 code: separate in-kernel and boot-ABI data
     structures and apply a whole range of cleanups to the kernel side.

     No change in functionality.

   - enable KASLR by default: it's used by all major distros and it's
     out of the experimental stage as well.

   - ... misc fixes and cleanups"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  x86/KASLR: Fix kexec kernel boot crash when KASLR randomization fails
  x86/reboot: Turn off KVM when halting a CPU
  x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup
  x86: Enable KASLR by default
  boot/param: Move next_arg() function to lib/cmdline.c for later reuse
  x86/boot: Fix Sparse warning by including required header file
  x86/boot/64: Rename start_cpu()
  x86/xen: Update e820 table handling to the new core x86 E820 code
  x86/boot: Fix pr_debug() API braindamage
  xen, x86/headers: Add <linux/device.h> dependency to <asm/xen/page.h>
  x86/boot/e820: Simplify e820__update_table()
  x86/boot/e820: Separate the E820 ABI structures from the in-kernel structures
  x86/boot/e820: Fix and clean up e820_type switch() statements
  x86/boot/e820: Rename the remaining E820 APIs to the e820__*() prefix
  x86/boot/e820: Remove unnecessary #include's
  x86/boot/e820: Rename e820_mark_nosave_regions() to e820__register_nosave_regions()
  x86/boot/e820: Rename e820_reserve_resources*() to e820__reserve_resources*()
  x86/boot/e820: Use bool in query APIs
  x86/boot/e820: Document e820__reserve_setup_data()
  x86/boot/e820: Clean up __e820__update_table() et al
  ...
2017-05-01 20:51:12 -07:00
Nicolai Stange 3d18d661aa x86/xen/time: Set ->min_delta_ticks and ->max_delta_ticks
In preparation for making the clockevents core NTP correction aware,
all clockevent device drivers must set ->min_delta_ticks and
->max_delta_ticks rather than ->min_delta_ns and ->max_delta_ns: a
clockevent device's rate is going to change dynamically and thus, the
ratio of ns to ticks ceases to stay invariant.

Make the x86 arch's xen clockevent driver initialize these fields properly.

This patch alone doesn't introduce any change in functionality as the
clockevents core still looks exclusively at the (untouched) ->min_delta_ns
and ->max_delta_ns. As soon as this has changed, a followup patch will
purge the initialization of ->min_delta_ns and ->max_delta_ns from this
driver.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: xen-devel@lists.xenproject.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-04-14 13:11:03 -07:00
Ingo Molnar e5185a76a2 Merge branch 'x86/boot' into x86/mm, to avoid conflict
There's a conflict between ongoing level-5 paging support and
the E820 rewrite. Since the E820 rewrite is essentially ready,
merge it into x86/mm to reduce tree conflicts.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11 08:56:05 +02:00
Ingo Molnar 4729277156 Merge branch 'WIP.x86/boot' into x86/boot, to pick up ready branch
The E820 rework in WIP.x86/boot has gone through a couple of weeks
of exposure in -tip, merge it in a wider fashion.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11 08:49:31 +02:00
Ingo Molnar 73fa1362a7 Merge branch 'x86/cpu' into x86/mm, before applying dependent patch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30 09:07:54 +02:00
Kirill A. Shutemov f2a6a70501 x86: Convert the rest of the code to support p4d_t
This patch converts x86 to use proper folding of a new (fifth) page table level
with <asm-generic/pgtable-nop4d.h>.

That's a bit of a kitchen sink patch, but I don't see how to split it further
without hurting bisectability.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170317185515.8636-7-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-27 08:56:58 +02:00
Xiong Zhang 907cd43902 x86/xen: Change __xen_pgd_walk() and xen_cleanmfnmap() to support p4d
Split these helpers into a couple of per-level functions and add support for
an additional page table level.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
[ Split off into separate patch ]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170317185515.8636-6-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-27 08:56:49 +02:00
Andy Lutomirski b23adb7d3f x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup
Xen imposes special requirements on the GDT.  Rather than using a
global variable for the pgprot, just use an explicit special case
for Xen -- this makes it clearer what's going on.  It also debloats
64-bit kernels very slightly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e9ea96abbfd6a8c87753849171bb5987ecfeb523.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-23 08:25:08 +01:00
Thomas Garnier 69218e4799 x86: Remap GDT tables in the fixmap section
Each processor holds a GDT in its per-cpu structure. The sgdt
instruction gives the base address of the current GDT. This address can
be used to bypass KASLR memory randomization. With another bug, an
attacker could target other per-cpu structures or deduce the base of
the main memory section (PAGE_OFFSET).

This patch relocates the GDT table for each processor inside the
fixmap section. The space is reserved based on number of supported
processors.

For consistency, the remapping is done by default on 32 and 64-bit.

Each processor switches to its remapped GDT at the end of
initialization. For hibernation, the main processor returns with the
original GDT and switches back to the remapping at completion.

This patch was tested on both architectures. Hibernation and KVM were
both tested specially for their usage of the GDT.

Thanks to Boris Ostrovsky <boris.ostrovsky@oracle.com> for testing and
recommending changes for Xen support.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Luis R . Rodriguez <mcgrof@kernel.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kasan-dev@googlegroups.com
Cc: kernel-hardening@lists.openwall.com
Cc: kvm@vger.kernel.org
Cc: lguest@lists.ozlabs.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-pm@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: zijun_hu <zijun_hu@htc.com>
Link: http://lkml.kernel.org/r/20170314170508.100882-2-thgarnie@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-16 09:06:35 +01:00
Mathias Krause 6415813bae x86/cpu: Drop wp_works_ok member of struct cpuinfo_x86
Remove the wp_works_ok member of struct cpuinfo_x86. It's an
optimization back from Linux v0.99 times where we had no fixup support
yet and did the CR0.WP test via special code in the page fault handler.
The < 0 test was an optimization to not do the special casing for each
NULL ptr access violation but just for the first one doing the WP test.
Today it serves no real purpose as the test no longer needs special code
in the page fault handler and the only call side -- mem_init() -- calls
it just once, anyway. However, Xen pre-initializes it to 1, to skip the
test.

Doing the test again for Xen should be no issue at all, as even the
commit introducing skipping the test (commit d560bc6157 ("x86, xen:
Suppress WP test on Xen")) mentioned it being ban aid only. And, in
fact, testing the patch on Xen showed nothing breaks.

The pre-fixup times are long gone and with the removal of the fallback
handling code in commit a5c2a893db ("x86, 386 removal: Remove
CONFIG_X86_WP_WORKS_OK") the kernel requires a working CR0.WP anyway.
So just get rid of the "optimization" and do the test unconditionally.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/1486933932-585-3-git-send-email-minipli@googlemail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-03-11 14:30:24 +01:00
Ingo Molnar 589ee62844 sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h>
Update code that relied on sched.h including various MM types for them.

This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:37 +01:00
Ingo Molnar 38b8d208a4 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h>
We are going to move softlockup APIs out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

<linux/nmi.h> already includes <linux/sched.h>.

Include the <linux/nmi.h> header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:30 +01:00
Ingo Molnar 687d77a5f7 x86/xen: Update e820 table handling to the new core x86 E820 code
Note that I restructured the Xen E820 logic a bit: instead of trying
to sort the boot parameters, only the kernel's E820 table is sorted.

This is how the x86 code does it and it reduces coupling between
the in-kernel E820 code and the (unchanged) boot parameters.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: <stefano.stabellini@eu.citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01 09:02:55 +01:00
Ingo Molnar 0871d5a66d Merge branch 'linus' into WIP.x86/boot, to fix up conflicts and to pick up updates
Conflicts:
	arch/x86/xen/setup.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01 09:02:26 +01:00
Linus Torvalds ac1820fb28 This is a tree wide change and has been kept separate for that reason.
Bart Van Assche noted that the ib DMA mapping code was significantly
 similar enough to the core DMA mapping code that with a few changes
 it was possible to remove the IB DMA mapping code entirely and
 switch the RDMA stack to use the core DMA mapping code.  This resulted
 in a nice set of cleanups, but touched the entire tree.  This branch
 will be submitted separately to Linus at the end of the merge window
 as per normal practice for tree wide changes like this.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
 2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
 CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
 Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
 xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
 0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
 1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
 ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
 IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
 ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
 XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
 3e7mgpcwC+3XfA/6vU3F
 =e0Si
 -----END PGP SIGNATURE-----

Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma DMA mapping updates from Doug Ledford:
 "Drop IB DMA mapping code and use core DMA code instead.

  Bart Van Assche noted that the ib DMA mapping code was significantly
  similar enough to the core DMA mapping code that with a few changes it
  was possible to remove the IB DMA mapping code entirely and switch the
  RDMA stack to use the core DMA mapping code.

  This resulted in a nice set of cleanups, but touched the entire tree
  and has been kept separate for that reason."

* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
  IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
  IB/core: Remove ib_device.dma_device
  nvme-rdma: Switch from dma_device to dev.parent
  RDS: net: Switch from dma_device to dev.parent
  IB/srpt: Modify a debug statement
  IB/srp: Switch from dma_device to dev.parent
  IB/iser: Switch from dma_device to dev.parent
  IB/IPoIB: Switch from dma_device to dev.parent
  IB/rxe: Switch from dma_device to dev.parent
  IB/vmw_pvrdma: Switch from dma_device to dev.parent
  IB/usnic: Switch from dma_device to dev.parent
  IB/qib: Switch from dma_device to dev.parent
  IB/qedr: Switch from dma_device to dev.parent
  IB/ocrdma: Switch from dma_device to dev.parent
  IB/nes: Remove a superfluous assignment statement
  IB/mthca: Switch from dma_device to dev.parent
  IB/mlx5: Switch from dma_device to dev.parent
  IB/mlx4: Switch from dma_device to dev.parent
  IB/i40iw: Remove a superfluous assignment statement
  IB/hns: Switch from dma_device to dev.parent
  ...
2017-02-25 13:45:43 -08:00
Linus Torvalds 252b95c0ed xen: features and fixes for 4.11-rc0
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJYrElpAAoJELDendYovxMvNFQH/RJU7lwDSf7rF7ZzFGvdcsfi
 T4DDuowkYoJm2+GypoRVzZZ0lxJlxr0mNKPvgGDvuTogMY7pvjAf6B7/xCvTFsNU
 UoO2I7ljgXxCXFRiXH50nAjS7PC2PFW3Qx+8XPIWeZmnUPeJi4Q43fiSloUt+a6l
 JgS/autOCflGasR5MihCZXkvdVF81K6GuEd3hCh9GKZ/8RiwNPaY50vHnMv/hfqq
 SJNRKOTSRXioYlTohLnjuPWHDMayRJEO48IXl3c7aNxDTkHjn78yaoPhJ7m+M0Bq
 s2GyaQA4tCwADhP+tilmI5H1vFpt6w9x7O0dgWiSm7TB91lwfZOen2WhTPPp6L8=
 =gktV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.11-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Xen features and fixes:

   - a series from Boris Ostrovsky adding support for booting Linux as
     Xen PVH guest

   - a series from Juergen Gross streamlining the xenbus driver

   - a series from Paul Durrant adding support for the new device model
     hypercall

   - several small corrections"

* tag 'for-linus-4.11-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/privcmd: add IOCTL_PRIVCMD_RESTRICT
  xen/privcmd: Add IOCTL_PRIVCMD_DM_OP
  xen/privcmd: return -ENOTTY for unimplemented IOCTLs
  xen: optimize xenbus driver for multiple concurrent xenstore accesses
  xen: modify xenstore watch event interface
  xen: clean up xenbus internal headers
  xenbus: Neaten xenbus_va_dev_error
  xen/pvh: Use Xen's emergency_restart op for PVH guests
  xen/pvh: Enable CPU hotplug
  xen/pvh: PVH guests always have PV devices
  xen/pvh: Initialize grant table for PVH guests
  xen/pvh: Make sure we don't use ACPI_IRQ_MODEL_PIC for SCI
  xen/pvh: Bootstrap PVH guest
  xen/pvh: Import PVH-related Xen public interfaces
  xen/x86: Remove PVH support
  x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C
  xen/manage: correct return value check on xenbus_scanf()
  x86/xen: Fix APIC id mismatch warning on Intel
  xen/netback: set default upper limit of tx/rx queues to 8
  xen/netfront: set default upper limit of tx/rx queues to 8
2017-02-21 13:53:41 -08:00
Boris Ostrovsky 7a1c44ebc5 xen/pvh: Use Xen's emergency_restart op for PVH guests
Using native_machine_emergency_restart (called during reboot) will
lead PVH guests to machine_real_restart()  where we try to use
real_mode_header which is not initialized.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
2017-02-07 08:07:01 -05:00
Boris Ostrovsky bcc57df281 xen/pvh: PVH guests always have PV devices
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-02-07 08:07:01 -05:00
Boris Ostrovsky 5adad168e5 xen/pvh: Make sure we don't use ACPI_IRQ_MODEL_PIC for SCI
Since we are not using PIC and (at least currently) don't have IOAPIC
we want to make sure that acpi_irq_model doesn't stay set to
ACPI_IRQ_MODEL_PIC (which is the default value). If we allowed it to
stay then acpi_os_install_interrupt_handler() would try (and fail) to
request_irq() for PIC.

Instead we set the model to ACPI_IRQ_MODEL_PLATFORM which will prevent
this from happening.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
2017-02-07 08:07:01 -05:00
Boris Ostrovsky 7243b93345 xen/pvh: Bootstrap PVH guest
Start PVH guest at XEN_ELFNOTE_PHYS32_ENTRY address. Setup hypercall
page, initialize boot_params, enable early page tables.

Since this stub is executed before kernel entry point we cannot use
variables in .bss which is cleared by kernel. We explicitly place
variables that are initialized here into .data.

While adjusting xen_hvm_init_shared_info() make it use cpuid_e?x()
instead of cpuid() (wherever possible).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
2017-02-07 08:07:01 -05:00
Boris Ostrovsky 063334f305 xen/x86: Remove PVH support
We are replacing existing PVH guests with new implementation.

We are keeping xen_pvh_domain() macro (for now set to zero) because
when we introduce new PVH implementation later in this series we will
reuse current PVH-specific code (xen_pvh_gnttab_setup()), and that
code is conditioned by 'if (xen_pvh_domain())'. (We will also need
a noop xen_pvh_domain() for !CONFIG_XEN_PVH).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-02-07 08:07:01 -05:00
Mohit Gambhir cc272163ea x86/xen: Fix APIC id mismatch warning on Intel
This patch fixes the following warning message seen when booting the
kernel as Dom0 with Xen on Intel machines.

[0.003000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: 0 APIC: 1]

The code generating the warning in validate_apic_and_package_id() matches
cpu_data(cpu).apicid (initialized in init_intel()->
detect_extended_topology() using cpuid) against the apicid returned from
xen_apic_read(). Now, xen_apic_read() makes a hypercall to retrieve apicid
for the boot  cpu but returns 0 otherwise. Hence the warning gets thrown
for all but the boot cpu.

The idea behind xen_apic_read() returning 0 for apicid is that the
guests (even Dom0) should not need to know what physical processor their
vcpus are running on. This is because we currently  do not have topology
information in Xen and also because xen allows more vcpus than physical
processors. However, boot cpu's apicid is required for loading
xen-acpi-processor driver on AMD machines. Look at following patch for
details:

commit 558daa289a ("xen/apic: Return the APIC ID (and version) for CPU
0.")

So to get rid of the warning, this patch modifies
xen_cpu_present_to_apicid() to return cpu_data(cpu).apicid instead of
calling xen_apic_read().

The warning is not seen on AMD machines because init_amd() populates
cpu_data(cpu).apicid by calling hard_smp_processor_id()->xen_apic_read()
as opposed to using apicid from cpuid as is done on Intel machines.

Signed-off-by: Mohit Gambhir <mohit.gambhir@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-01-29 18:59:16 -05:00
Ingo Molnar f9748fa045 x86/boot/e820: Simplify the e820__update_table() interface
The e820__update_table() parameters are pretty complex:

  arch/x86/include/asm/e820/api.h:extern int  e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map);

But 90% of the usage is trivial:

  arch/x86/kernel/e820.c:	if (e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries))
  arch/x86/kernel/e820.c:	e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/kernel/e820.c:	e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/kernel/e820.c:		if (e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries) < 0)
  arch/x86/kernel/e820.c:	e820__update_table(boot_params.e820_table, ARRAY_SIZE(boot_params.e820_table), &new_nr);
  arch/x86/kernel/early-quirks.c:	e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/kernel/setup.c:	e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/kernel/setup.c:		e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/platform/efi/efi.c:	e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/xen/setup.c:	e820__update_table(xen_e820_table.entries, ARRAY_SIZE(xen_e820_table.entries),
  arch/x86/xen/setup.c:	e820__update_table(e820_table->entries, ARRAY_SIZE(e820_table->entries), &e820_table->nr_entries);
  arch/x86/xen/setup.c:	e820__update_table(xen_e820_table.entries, ARRAY_SIZE(xen_e820_table.entries),

as it only uses an exiting struct e820_table's entries array, its size and
its current number of entries as input and output arguments.

Only one use is non-trivial:

  arch/x86/kernel/e820.c:	e820__update_table(boot_params.e820_table, ARRAY_SIZE(boot_params.e820_table), &new_nr);

... which call updates the E820 table in the zeropage in-situ, and the layout there does not
match that of 'struct e820_table' (in particular nr_entries is at a different offset,
hardcoded by the boot protocol).

Simplify all this by introducing a low level __e820__update_table() API that
the zeropage update call can use, and simplifying the main e820__update_table()
call signature down to:

	int e820__update_table(struct e820_table *table);

This visibly simplifies all the call sites:

  arch/x86/include/asm/e820/api.h:extern int  e820__update_table(struct e820_table *table);
  arch/x86/include/asm/e820/types.h: * call to e820__update_table() to remove duplicates.  The allowance
  arch/x86/kernel/e820.c: * The return value from e820__update_table() is zero if it
  arch/x86/kernel/e820.c:int __init e820__update_table(struct e820_table *table)
  arch/x86/kernel/e820.c:	if (e820__update_table(e820_table))
  arch/x86/kernel/e820.c:	e820__update_table(e820_table_firmware);
  arch/x86/kernel/e820.c:	e820__update_table(e820_table);
  arch/x86/kernel/e820.c:	e820__update_table(e820_table);
  arch/x86/kernel/e820.c:		if (e820__update_table(e820_table) < 0)
  arch/x86/kernel/early-quirks.c:	e820__update_table(e820_table);
  arch/x86/kernel/setup.c:	e820__update_table(e820_table);
  arch/x86/kernel/setup.c:		e820__update_table(e820_table);
  arch/x86/platform/efi/efi.c:	e820__update_table(e820_table);
  arch/x86/xen/setup.c:	e820__update_table(&xen_e820_table);
  arch/x86/xen/setup.c:	e820__update_table(e820_table);
  arch/x86/xen/setup.c:	e820__update_table(&xen_e820_table);

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 22:55:24 +01:00
Ingo Molnar e7dbf7ad41 xen, x86/boot/e820: Simplify Xen's xen_e820_table construct
The Xen guest memory setup code has:

	static struct e820_entry xen_e820_table[E820_MAX_ENTRIES] __initdata;
	static u32 xen_e820_table_entries __initdata;

... which is really a 'struct e820_table', open-coded.

Convert the Xen code over to use a single struct e820_table, as this
will allow the simplification of the e820__update_table() API.

No intended change in functionality, but not runtime tested.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 22:55:23 +01:00
Ingo Molnar 08b46d5dd8 x86/boot/e820: Clean up the E820 table size define names
We've got a number of defines related to the E820 table and its size:

	E820MAP
	E820NR
	E820_X_MAX
	E820MAX

The first two denote byte offsets into the zeropage (struct boot_params),
and can are not used in the kernel and can be removed.

The E820_*_MAX values have an inconsistent structure and it's unclear in any
case what they mean. 'X' presuably goes for extended - but it's not very
expressive altogether.

Change these over to:

	E820_MAX_ENTRIES_ZEROPAGE
	E820_MAX_ENTRIES

... which are self-explanatory names.

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 22:55:23 +01:00
Ingo Molnar 09821ff1d5 x86/boot/e820: Prefix the E820_* type names with "E820_TYPE_"
So there's a number of constants that start with "E820" but which
are not types - these create a confusing mixture when seen together
with 'enum e820_type' values:

	E820MAP
	E820NR
	E820_X_MAX
	E820MAX

To better differentiate the 'enum e820_type' values prefix them
with E820_TYPE_.

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 22:55:22 +01:00
Ingo Molnar ab6bc04cfd x86/boot/e820: Create coherent API function names for E820 range operations
We have these three related functions:

 extern void e820_add_region(u64 start, u64 size, int type);
 extern u64  e820_update_range(u64 start, u64 size, unsigned old_type, unsigned new_type);
 extern u64  e820_remove_range(u64 start, u64 size, unsigned old_type, int checktype);

But it's not clear from the naming that they are 3 operations based around the
same 'memory range' concept. Rename them to better signal this, and move
the prototypes next to each other:

 extern void e820__range_add   (u64 start, u64 size, int type);
 extern u64  e820__range_update(u64 start, u64 size, unsigned old_type, unsigned new_type);
 extern u64  e820__range_remove(u64 start, u64 size, unsigned old_type, int checktype);

Note that this improved organization of the functions shows another problem that was easy
to miss before: sometimes the E820 entry type is 'int', sometimes 'unsigned int' - but this
will be fixed in a separate patch.

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 14:42:32 +01:00
Ingo Molnar f52355a99f x86/boot/e820: Rename sanitize_e820_table() to e820__update_table()
sanitize_e820_table() is a minor misnomer in that it suggests that
the E820 table requires sanitizing - which implies that it will only
do anything if the E820 table is irregular (not sane).

That is wrong, because sanitize_e820_table() also does a very regular
sorting of the E820 table, which is a necessity in the basic
append-only flow of E820 updates the kernel is allowed to perform to
it.

So rename it to e820__update_table() to include that purpose as well.

This also lines up all the table-update functions into a coherent
naming family:

  int  e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map);

  void e820__update_table_print(void);
  void e820__update_table_firmware(void);

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 14:42:31 +01:00
Ingo Molnar bf495573fa x86/boot/e820: Harmonize the 'struct e820_table' fields
So the e820_table->map and e820_table->nr_map names are a bit
confusing, because it's not clear what a 'map' really means
(it could be a bitmap, or some other data structure), nor is
it clear what nr_map means (is it a current index, or some
other count).

Rename the fields from:

 e820_table->map        =>     e820_table->entries
 e820_table->nr_map     =>     e820_table->nr_entries

which makes it abundantly clear that these are entries
of the table, and that the size of the table is ->nr_entries.

Propagate the changes to all affected files. Where necessary,
adjust local variable names to better reflect the new field names.

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 09:33:16 +01:00
Ingo Molnar 61a5010163 x86/boot/e820: Rename everything to e820_table
No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 09:33:16 +01:00
Ingo Molnar acd4c04872 x86/boot/e820: Rename 'e820_map' variables to 'e820_array'
In line with the rename to 'struct e820_array', harmonize the naming of common e820
table variable names as well:

 e820          =>  e820_array
 e820_saved    =>  e820_array_saved
 e820_map      =>  e820_array
 initial_e820  =>  e820_array_init

This makes the variable names more consistent  and easier to grep for.

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 09:33:15 +01:00
Ingo Molnar 8ec67d97bf x86/boot/e820: Rename the basic e820 data types to 'struct e820_entry' and 'struct e820_array'
The 'e820entry' and 'e820map' names have various annoyances:

 - the missing underscore departs from the usual kernel style
   and makes the code look weird,

 - in the past I kept confusing the 'map' with the 'entry', because
   a 'map' is ambiguous in that regard,

 - it's not really clear from the 'e820map' that this is a regular
   C array.

Rename them to 'struct e820_entry' and 'struct e820_array' accordingly.

( Leave the legacy UAPI header alone but do the rename in the bootparam.h
  and e820/types.h file - outside tools relying on these defines should
  either adjust their code, or should use the legacy header, or should
  create their private copies for the definitions. )

No change in functionality.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 09:33:14 +01:00
Ingo Molnar 66441bd3cf x86/boot/e820: Move asm/e820.h to asm/e820/api.h
In line with asm/e820/types.h, move the e820 API declarations to
asm/e820/api.h and update all usage sites.

This is just a mechanical, obviously correct move & replace patch,
there will be subsequent changes to clean up the code and to make
better use of the new header organization.

Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28 09:31:13 +01:00
Bart Van Assche 5299709d0a treewide: Constify most dma_map_ops structures
Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This patch
has been generated as follows:

git grep -l 'struct dma_map_ops' |
  xargs -d\\n sed -i \
    -e 's/struct dma_map_ops/const struct dma_map_ops/g' \
    -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
    -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
    -e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops intel_dma_ops');
sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
       -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
       -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
    drivers/pci/host/*.c
sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Waiman Long aef591cd3d locking/spinlocks/x86, paravirt: Remove paravirt_ticketlocks_enabled
This is a follow-up of commit:

  cfd8983f03 ("x86, locking/spinlocks: Remove ticket (spin)lock implementation")

The static_key structure 'paravirt_ticketlocks_enabled' is now removed as it is
no longer used.

As a result, the init functions kvm_spinlock_init_jump() and
xen_init_spinlocks_jump() are also removed.

A simple build and boot test was done to verify it.

Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1484252878-1962-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 09:33:46 +01:00
Linus Torvalds 2fd8774c79 Merge branch 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fixes from Konrad Rzeszutek Wilk:
 "This has one fix to make i915 work when using Xen SWIOTLB, and a
  feature from Geert to aid in debugging of devices that can't do DMA
  outside the 32-bit address space.

  The feature from Geert is on top of v4.10 merge window commit
  (specifically you pulling my previous branch), as his changes were
  dependent on the Documentation/ movement patches.

  I figured it would just easier than me trying than to cherry-pick the
  Documentation patches to satisfy git.

  The patches have been soaking since 12/20, albeit I updated the last
  patch due to linux-next catching an compiler error and adding an
  Tested-and-Reported-by tag"

* 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: Export swiotlb_max_segment to users
  swiotlb: Add swiotlb=noforce debug option
  swiotlb: Convert swiotlb_force from int to enum
  x86, swiotlb: Simplify pci_swiotlb_detect_override()
2017-01-06 10:53:21 -08:00
Linus Torvalds 383378d115 xen: features and fixes for 4.10 rc2
- small fixes for xenbus driver
 - one fix for xen dom0 boot on huge system
 - small cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJYbgWzAAoJELDendYovxMvzqQH/iO+SKCrT39q6fCP+fyov7Hi
 J67XrHVT/AAPUXizWzKdtBE5EdI+WZXBkdsCEh3+3XPCeCRL/t9dRYEytle0Ioy9
 hXC5otiJQ1hhm2N5dQKT5c0IMVh9mAjbeIqcG2dV1lSVaw0CYcJS4xh9eALxj7UY
 eXGpNMdNyeiEG2p5OgnDE5GqHavxPh+6ChNxmr8341T8E+C9U1BNtJeUiIQshKmC
 YAlt7YWoPzEJeLAYEiwrROYNyrLNd17IlYOeKXSwZUdkVtZahW+/jO+YYmhbx1C/
 Yvt93r7ewUFKslRgpZQjjl8y9eynKg+j2BWx8WjAwpdHfCa1DFEOxiAOraLp7Cc=
 =ro0H
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes and cleanups from Juergen Gross:

 - small fixes for xenbus driver

 - one fix for xen dom0 boot on huge system

 - small cleanups

* tag 'for-linus-4.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  Xen: ARM: Zero reserved fields of xatp before making hypervisor call
  xen: events: Replace BUG() with BUG_ON()
  xen: remove stale xs_input_avail() from header
  xen: return xenstore command failures via response instead of rc
  xen: xenbus driver must not accept invalid transaction ids
  xen/evtchn: use rb_entry()
  xen/setup: Don't relocate p2m over existing one
2017-01-05 10:29:40 -08:00
Linus Torvalds 3ddc76dfc7 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer type cleanups from Thomas Gleixner:
 "This series does a tree wide cleanup of types related to
  timers/timekeeping.

   - Get rid of cycles_t and use a plain u64. The type is not really
     helpful and caused more confusion than clarity

   - Get rid of the ktime union. The union has become useless as we use
     the scalar nanoseconds storage unconditionally now. The 32bit
     timespec alike storage got removed due to the Y2038 limitations
     some time ago.

     That leaves the odd union access around for no reason. Clean it up.

  Both changes have been done with coccinelle and a small amount of
  manual mopping up"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ktime: Get rid of ktime_equal()
  ktime: Cleanup ktime_set() usage
  ktime: Get rid of the union
  clocksource: Use a plain u64 instead of cycle_t
2016-12-25 14:30:04 -08:00
Linus Torvalds b272f732f8 Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP hotplug notifier removal from Thomas Gleixner:
 "This is the final cleanup of the hotplug notifier infrastructure. The
  series has been reintgrated in the last two days because there came a
  new driver using the old infrastructure via the SCSI tree.

  Summary:

   - convert the last leftover drivers utilizing notifiers

   - fixup for a completely broken hotplug user

   - prevent setup of already used states

   - removal of the notifiers

   - treewide cleanup of hotplug state names

   - consolidation of state space

  There is a sphinx based documentation pending, but that needs review
  from the documentation folks"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/armada-xp: Consolidate hotplug state space
  irqchip/gic: Consolidate hotplug state space
  coresight/etm3/4x: Consolidate hotplug state space
  cpu/hotplug: Cleanup state names
  cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
  staging/lustre/libcfs: Convert to hotplug state machine
  scsi/bnx2i: Convert to hotplug state machine
  scsi/bnx2fc: Convert to hotplug state machine
  cpu/hotplug: Prevent overwriting of callbacks
  x86/msr: Remove bogus cleanup from the error path
  bus: arm-ccn: Prevent hotplug callback leak
  perf/x86/intel/cstate: Prevent hotplug callback leak
  ARM/imx/mmcd: Fix broken cpu hotplug handling
  scsi: qedi: Convert to hotplug state machine
2016-12-25 14:05:56 -08:00
Thomas Gleixner a5a1d1c291 clocksource: Use a plain u64 instead of cycle_t
There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
2016-12-25 11:04:12 +01:00
Thomas Gleixner 73c1b41e63 cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-25 10:47:44 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Ross Lagerwall 7ecec8503a xen/setup: Don't relocate p2m over existing one
When relocating the p2m, take special care not to relocate it so
that is overlaps with the current location of the p2m/initrd. This is
needed since the full extent of the current location is not marked as a
reserved region in the e820.

This was seen to happen to a dom0 with a large initial p2m and a small
reserved region in the middle of the initial p2m.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-22 10:04:02 +01:00
Geert Uytterhoeven ae7871be18 swiotlb: Convert swiotlb_force from int to enum
Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.

Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-12-19 09:05:20 -05:00
Linus Torvalds 1bbb05f520 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes and cleanups from Thomas Gleixner:
 "This set of updates contains:

   - Robustification for the logical package managment. Cures the AMD
     and virtualization issues.

   - Put the correct start_cpu() return address on the stack of the idle
     task.

   - Fixups for the fallout of the nodeid <-> cpuid persistent mapping
     modifciations

   - Move the x86/MPX specific mm_struct member to the arch specific
     mm_context where it belongs

   - Cleanups for C89 struct initializers and useless function
     arguments"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/floppy: Use designated initializers
  x86/mpx: Move bd_addr to mm_context_t
  x86/mm: Drop unused argument 'removed' from sync_global_pgds()
  ACPI/NUMA: Do not map pxm to node when NUMA is turned off
  x86/acpi: Use proper macro for invalid node
  x86/smpboot: Prevent false positive out of bounds cpumask access warning
  x86/boot/64: Push correct start_cpu() return address
  x86/boot/64: Use 'push' instead of 'call' in start_cpu()
  x86/smpboot: Make logical package management more robust
2016-12-18 11:12:53 -08:00
Linus Torvalds aa3ecf388a xen: features and fixes for 4.10 rc0
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJYT5HMAAoJELDendYovxMvhNQH/1g/3ahM4JKN8Z0SbjKBEdQm
 yj2xOj6cE3l6wMSUblKjZD2DLLhpmcHT/E97Xro/lZQEfQJoMXXWWDFowMU/P1LA
 mJxb7Fzq5Wr+6eGSAlIQB270MrpNi/luf+CWHMwVA3V7R3KRXwonOdGQSkISIzCd
 tgIydEA3a9r2+HgeIBpZFZ4GcSrJQU75krMyl2tjD1C+jeYVd+zdoj2OnDsZQDZQ
 hDWApMpNbpSBAn7JtSSdXWSTBsGH0lUECebeYPhPQ2sX2P6Y8+UCGwA7i6FFdbTa
 agXfVSdRz8dCe3k19VcKDAw6nK9BTTMnEeEHmkmygIh6wuHPP44CzigTXIbJoXI=
 =zjfm
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Xen features and fixes for 4.10

  These are some fixes, a move of some arm related headers to share them
  between arm and arm64 and a series introducing a helper to make code
  more readable.

  The most notable change is David stepping down as maintainer of the
  Xen hypervisor interface. This results in me sending you the pull
  requests for Xen related code from now on"

* tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (29 commits)
  xen/balloon: Only mark a page as managed when it is released
  xenbus: fix deadlock on writes to /proc/xen/xenbus
  xen/scsifront: don't request a slot on the ring until request is ready
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
  x86: Make E820_X_MAX unconditionally larger than E820MAX
  xen/pci: Bubble up error and fix description.
  xen: xenbus: set error code on failure
  xen: set error code on failures
  arm/xen: Use alloc_percpu rather than __alloc_percpu
  arm/arm64: xen: Move shared architecture headers to include/xen/arm
  xen/events: use xen_vcpu_id mapping for EVTCHNOP_status
  xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing
  xen-scsifront: Add a missing call to kfree
  MAINTAINERS: update XEN HYPERVISOR INTERFACE
  xenfs: Use proc_create_mount_point() to create /proc/xen
  xen-platform: use builtin_pci_driver
  xen-netback: fix error handling output
  xen: make use of xenbus_read_unsigned() in xenbus
  xen: make use of xenbus_read_unsigned() in xen-pciback
  xen: make use of xenbus_read_unsigned() in xen-fbfront
  ...
2016-12-13 16:07:55 -08:00
Linus Torvalds b5cab0da75 Merge branch 'stable/for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb updates from Konrad Rzeszutek Wilk:

 - minor fixes (rate limiting), remove certain functions

 - support for DMA_ATTR_SKIP_CPU_SYNC which is an optimization
   in the DMA API

* 'stable/for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: Minor fix-ups for DMA_ATTR_SKIP_CPU_SYNC support
  swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
  swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function
  swiotlb: Drop unused functions swiotlb_map_sg and swiotlb_unmap_sg
  swiotlb: Rate-limit printing when running out of SW-IOMMU space
2016-12-13 15:52:23 -08:00
Thomas Gleixner 9d85eb9119 x86/smpboot: Make logical package management more robust
The logical package management has several issues:

 - The APIC ids provided by ACPI are not required to be the same as the
   initial APIC id which can be retrieved by CPUID. The APIC ids provided
   by ACPI are those which are written by the BIOS into the APIC. The
   initial id is set by hardware and can not be changed. The hardware
   provided ids contain the real hardware package information.

   Especially AMD sets the effective APIC id different from the hardware id
   as they need to reserve space for the IOAPIC ids starting at id 0.

   As a consequence those machines trigger the currently active firmware
   bug printouts in dmesg, These are obviously wrong.

 - Virtual machines have their own interesting of enumerating APICs and
   packages which are not reliably covered by the current implementation.

The sizing of the mapping array has been tweaked to be generously large to
handle systems which provide a wrong core count when HT is disabled so the
whole magic which checks for space in the physical hotplug case is not
needed anymore.

Simplify the whole machinery and do the mapping when the CPU starts and the
CPUID derived physical package information is available. This solves the
observed problems on AMD machines and works for the virtualization issues
as well.

Remove the extra call from XEN cpu bringup code as it is not longer
required.

Fixes: d49597fd3b ("x86/cpu: Deal with broken firmware (VMWare/XEN)")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: M. Vefa Bicakci <m.v.b@runbox.com>
Cc: xen-devel <xen-devel@lists.xen.org>
Cc: Charles (Chas) Williams <ciwillia@brocade.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Alok Kataria <akataria@vmware.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-13 10:22:39 +01:00
Linus Torvalds 518bacf5a5 Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FPU updates from Ingo Molnar:
 "The main changes in this cycle were:

   - do a large round of simplifications after all CPUs do 'eager' FPU
     context switching in v4.9: remove CR0 twiddling, remove leftover
     eager/lazy bts, etc (Andy Lutomirski)

   - more FPU code simplifications: remove struct fpu::counter, clarify
     nomenclature, remove unnecessary arguments/functions and better
     structure the code (Rik van Riel)"

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Remove clts()
  x86/fpu: Remove stts()
  x86/fpu: Handle #NM without FPU emulation as an error
  x86/fpu, lguest: Remove CR0.TS support
  x86/fpu, kvm: Remove host CR0.TS manipulation
  x86/fpu: Remove irq_ts_save() and irq_ts_restore()
  x86/fpu: Stop saving and restoring CR0.TS in fpu__init_check_bugs()
  x86/fpu: Get rid of two redundant clts() calls
  x86/fpu: Finish excising 'eagerfpu'
  x86/fpu: Split old_fpu & new_fpu handling into separate functions
  x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state()
  x86/fpu: Split old & new FPU code paths
  x86/fpu: Remove __fpregs_(de)activate()
  x86/fpu: Rename lazy restore functions to "register state valid"
  x86/fpu, kvm: Remove KVM vcpu->fpu_counter
  x86/fpu: Remove struct fpu::counter
  x86/fpu: Remove use_eager_fpu()
  x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction
  x86/fpu: Hard-disable lazy FPU mode
  x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code
2016-12-12 14:27:49 -08:00
Alex Thorlton 738662c35c xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
On systems with sufficiently large e820 tables, and several IOAPICs, it
is possible for the XENMEM_machine_memory_map callback (and its
counterpart, XENMEM_memory_map) to attempt to return an e820 table with
more than 128 entries.  This callback adds entries to the BIOS-provided
e820 table to account for IOAPIC registers, which, on sufficiently large
systems, can result in an e820 table that is too large to copy back into
xen_e820_map.

This change simply increases the size of xen_e820_map to E820_X_MAX to
ensure that there is enough room to store the entire e820 map returned
from this callback.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-09 10:59:08 +01:00
Peter Zijlstra 3cded41794 x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()
Avoid the pointless function call to pv_lock_ops.vcpu_is_preempted()
when a paravirt spinlock enabled kernel is ran on native hardware.

Do this by patching out the CALL instruction with "XOR %RAX,%RAX"
which has the same effect (0 return value).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:11 +01:00
Juergen Gross de7689cf8f x86/xen: Support the vCPU preemption check
Support the vcpu_is_preempted() functionality under Xen. This will
enhance lock performance on overcommitted hosts (more runnable vCPUs
than physical CPUs in the system) as doing busy waits for preempted
vCPUs will hurt system performance far worse than early yielding.

A quick test (4 vCPUs on 1 physical CPU doing a parallel build job
with "make -j 8") reduced system time by about 5% with this patch.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-11-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:09 +01:00
Alexander Duyck 7641842164 swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function
The mapping function should always return DMA_ERROR_CODE when a mapping has
failed as this is what the DMA API expects when a DMA error has occurred.
The current function for mapping a page in Xen was returning either
DMA_ERROR_CODE or 0 depending on where it failed.

On x86 DMA_ERROR_CODE is 0, but on other architectures such as ARM it is
~0. We need to make sure we return the same error value if either the
mapping failed or the device is not capable of accessing the mapping.

If we are returning DMA_ERROR_CODE as our error value we can drop the
function for checking the error code as the default is to compare the
return value against DMA_ERROR_CODE if no function is defined.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
2016-11-07 15:06:32 -05:00
Andy Lutomirski af25ed59b5 x86/fpu: Remove clts()
The kernel doesn't use clts() any more.  Remove it and all of its
paravirt infrastructure.

A careful reader may notice that xen_clts() appears to have been
buggy -- it didn't update xen_cr0_value.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm list <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/3d3c8ca62f17579b9849a013d71e59a4d5d1b079.1477951965.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-01 07:47:55 +01:00
Linus Torvalds aa34e07e45 xen: fixes for 4.9-rc2
- Advertise control feature flags in xenstore.
 - Fix x86 build when XEN_PVHVM is disabled.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJYDjVtAAoJEFxbo/MsZsTRv2UH/0YR95ajlgJnN/ldeG4KhBdV
 Oe6piyw1cbHDPvFrFFl7HgYgAiiuaMxOFk+j/XKVJ7naAOD06kWHoVzZNkpNFF4i
 2m81jGfvW3msbXd77aR+IHulWxRxQ9TE4HV2s94DiQiSJa2f02PqVCdqyJws736m
 mjDdDRzd90xb2rDI3XrcRNnjgNaFtfMLGhtwtgXI5U+Ic+uVW1VBwLefZXCI2SKw
 yUSVBwsYENgfGUJ+NmYrl53WmlSnAatrs1wClLVqm/0fD7+J2XLHRAonISTwoKtp
 z+XJthe7uWq0Fb/DMiWhvTrTn852chy9BEC6QsRBmGM6RRZG9n7x8k97NgTiqiw=
 =lM7p
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from David Vrabel:

 - advertise control feature flags in xenstore

 - fix x86 build when XEN_PVHVM is disabled

* tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xenbus: check return value of xenbus_scanf()
  xenbus: prefer list_for_each()
  x86: xen: move cpu_up functions out of ifdef
  xenbus: advertise control feature flags
2016-10-24 19:52:24 -07:00
Arnd Bergmann cb5f7e7c1d x86: xen: move cpu_up functions out of ifdef
Three newly introduced functions are not defined when CONFIG_XEN_PVHVM is
disabled, but are still being used:

arch/x86/xen/enlighten.c:141:12: warning: ‘xen_cpu_up_prepare’ used but never defined
arch/x86/xen/enlighten.c:142:12: warning: ‘xen_cpu_up_online’ used but never defined
arch/x86/xen/enlighten.c:143:12: warning: ‘xen_cpu_dead’ used but never defined

Fixes: 4d737042d6 ("xen/x86: Convert to hotplug state machine")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-24 15:49:07 +01:00
Linus Torvalds 541efb7632 xen: features and fixes for 4.9-rc0
- Switch to new CPU hotplug mechanism.
 - Support driver_override in pciback.
 - Require vector callback for HVM guests (the alternate mechanism via
   the platform device has been broken for ages).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJX9lkNAAoJEFxbo/MsZsTRnd8IAKnCH9pd2c1GgAPse2s8yBUL
 jh/nTQh+niVvpFA9elpfz+TrAIu4P0KLcnx6jhZ0Uv+Cmeaz5Ps+IaqyXBmqmeCm
 hjrnDo6wEVB/1LMtzibNk0hQcIN73MUEIfUESjl1iiIw3lPDPMIihMbpCAzVzaRf
 M8sInTTwcx0A9njUijEwT1wKV45hM7bpnAufChkxk3V3G2+JxBDYAQJCfW0u1DjR
 WFpbGKyNetXSVSf6QVZhW+lTnqTAUk0a5IqOg6UbzzbsHM7KgzwxB0FXYMRsL8jV
 3VNiRJovNy+0F3T1VewPXWFlWs+QFK1GH0Hbncc5kUATNBm/VOjNt8H0dwUlfLM=
 =n1rz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
 "xen features and fixes for 4.9:

   - switch to new CPU hotplug mechanism

   - support driver_override in pciback

   - require vector callback for HVM guests (the alternate mechanism via
     the platform device has been broken for ages)"

* tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/x86: Update topology map for PV VCPUs
  xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier
  xen/pciback: support driver_override
  xen/pciback: avoid multiple entries in slot list
  xen/pciback: simplify pcistub device handling
  xen: Remove event channel notification through Xen PCI platform device
  xen/events: Convert to hotplug state machine
  xen/x86: Convert to hotplug state machine
  x86/xen: add missing \n at end of printk warning message
  xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
  xen: Make VPMU init message look less scary
  xen: rename xen_pmu_init() in sys-hypervisor.c
  hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again)
  xen/x86: Move irq allocation from Xen smp_op.cpu_up()
2016-10-06 11:19:10 -07:00
Boris Ostrovsky a6a198bc60 xen/x86: Update topology map for PV VCPUs
Early during boot topology_update_package_map() computes
logical_pkg_ids for all present processors.

Later, when processors are brought up, identify_cpu() updates
these values based on phys_pkg_id which is a function of
initial_apicid. On PV guests the latter may point to a
non-existing node, causing logical_pkg_ids to be set to -1.

Intel's RAPL uses logical_pkg_id (as topology_logical_package_id())
to index its arrays and therefore in this case will point to index
65535 (since logical_pkg_id is a u16). This could lead to either a
crash or may actually access random memory location.

As a workaround, we recompute topology during CPU bringup to reset
logical_pkg_id to a valid value.

(The reason for initial_apicid being bogus is because it is
initial_apicid of the processor from which the guest is launched.
This value is CPUID(1).EBX[31:24])

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-06 11:47:05 +01:00
Boris Ostrovsky 565fdc6a2a xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier
xen_cpuhp_setup() calls mutex_lock() which, when CONFIG_DEBUG_MUTEXES
is defined, ends up calling xen_save_fl(). That routine expects
per_cpu(xen_vcpu, 0) to be already initialized.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-05 10:32:22 +01:00
Linus Torvalds 3ef0a61a46 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "The changes in this cycle were:

   - Save e820 table RAM footprint on larger kernel configurations.
     (Denys Vlasenko)

   - pmem related fixes (Dan Williams)

   - theoretical e820 boundary condition fix (Wei Yang)"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation
  x86/e820: Use much less memory for e820/e820_saved, save up to 120k
  x86/e820: Prepare e280 code for switch to dynamic storage
  x86/e820: Mark some static functions __init
  x86/e820: Fix very large 'size' handling boundary condition
2016-10-03 16:46:53 -07:00
Linus Torvalds 1a4a2bc460 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull low-level x86 updates from Ingo Molnar:
 "In this cycle this topic tree has become one of those 'super topics'
  that accumulated a lot of changes:

   - Add CONFIG_VMAP_STACK=y support to the core kernel and enable it on
     x86 - preceded by an array of changes. v4.8 saw preparatory changes
     in this area already - this is the rest of the work. Includes the
     thread stack caching performance optimization. (Andy Lutomirski)

   - switch_to() cleanups and all around enhancements. (Brian Gerst)

   - A large number of dumpstack infrastructure enhancements and an
     unwinder abstraction. The secret long term plan is safe(r) live
     patching plus maybe another attempt at debuginfo based unwinding -
     but all these current bits are standalone enhancements in a frame
     pointer based debug environment as well. (Josh Poimboeuf)

   - More __ro_after_init and const annotations. (Kees Cook)

   - Enable KASLR for the vmemmap memory region. (Thomas Garnier)"

[ The virtually mapped stack changes are pretty fundamental, and not
  x86-specific per se, even if they are only used on x86 right now. ]

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  x86/asm: Get rid of __read_cr4_safe()
  thread_info: Use unsigned long for flags
  x86/alternatives: Add stack frame dependency to alternative_call_2()
  x86/dumpstack: Fix show_stack() task pointer regression
  x86/dumpstack: Remove dump_trace() and related callbacks
  x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder
  oprofile/x86: Convert x86_backtrace() to use the new unwinder
  x86/stacktrace: Convert save_stack_trace_*() to use the new unwinder
  perf/x86: Convert perf_callchain_kernel() to use the new unwinder
  x86/unwind: Add new unwind interface and implementations
  x86/dumpstack: Remove NULL task pointer convention
  fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y
  sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK
  lib/syscall: Pin the task stack in collect_syscall()
  x86/process: Pin the target stack in get_wchan()
  x86/dumpstack: Pin the target stack when dumping it
  kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function
  sched/core: Add try_get_task_stack() and put_task_stack()
  x86/entry/64: Fix a minor comment rebase error
  iommu/amd: Don't put completion-wait semaphore on stack
  ...
2016-10-03 16:13:28 -07:00
Linus Torvalds 00bcf5cdd6 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - rwsem micro-optimizations (Davidlohr Bueso)

   - Improve the implementation and optimize the performance of
     percpu-rwsems. (Peter Zijlstra.)

   - Convert all lglock users to better facilities such as percpu-rwsems
     or percpu-spinlocks and remove lglocks. (Peter Zijlstra)

   - Remove the ticket (spin)lock implementation. (Peter Zijlstra)

   - Korean translation of memory-barriers.txt and related fixes to the
     English document. (SeongJae Park)

   - misc fixes and cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/cmpxchg, locking/atomics: Remove superfluous definitions
  x86, locking/spinlocks: Remove ticket (spin)lock implementation
  locking/lglock: Remove lglock implementation
  stop_machine: Remove stop_cpus_lock and lg_double_lock/unlock()
  fs/locks: Use percpu_down_read_preempt_disable()
  locking/percpu-rwsem: Add down_read_preempt_disable()
  fs/locks: Replace lg_local with a per-cpu spinlock
  fs/locks: Replace lg_global with a percpu-rwsem
  locking/percpu-rwsem: Add DEFINE_STATIC_PERCPU_RWSEMand percpu_rwsem_assert_held()
  locking/pv-qspinlock: Use cmpxchg_release() in __pv_queued_spin_unlock()
  locking/rwsem, x86: Drop a bogus cc clobber
  futex: Add some more function commentry
  locking/hung_task: Show all locks
  locking/rwsem: Scan the wait_list for readers only once
  locking/rwsem: Remove a few useless comments
  locking/rwsem: Return void in __rwsem_mark_wake()
  locking, rcu, cgroup: Avoid synchronize_sched() in __cgroup_procs_write()
  locking/Documentation: Add Korean translation
  locking/Documentation: Fix a typo of example result
  locking/Documentation: Fix wrong section reference
  ...
2016-10-03 12:15:00 -07:00
KarimAllah Ahmed 72a9b18629 xen: Remove event channel notification through Xen PCI platform device
Ever since commit 254d1a3f02 ("xen/pv-on-hvm kexec: shutdown watches
from old kernel") using the INTx interrupt from Xen PCI platform
device for event channel notification would just lockup the guest
during bootup.  postcore_initcall now calls xs_reset_watches which
will eventually try to read a value from XenStore and will get stuck
on read_reply at XenBus forever since the platform driver is not
probed yet and its INTx interrupt handler is not registered yet. That
means that the guest can not be notified at this moment of any pending
event channels and none of the per-event handlers will ever be invoked
(including the XenStore one) and the reply will never be picked up by
the kernel.

The exact stack where things get stuck during xenbus_init:

-xenbus_init
 -xs_init
  -xs_reset_watches
   -xenbus_scanf
    -xenbus_read
     -xs_single
      -xs_single
       -xs_talkv

Vector callbacks have always been the favourite event notification
mechanism since their introduction in commit 38e20b07ef ("x86/xen:
event channels delivery on HVM.") and the vector callback feature has
always been advertised for quite some time by Xen that's why INTx was
broken for several years now without impacting anyone.

Luckily this also means that event channel notification through INTx
is basically dead-code which can be safely removed without impacting
anybody since it has been effectively disabled for more than 4 years
with nobody complaining about it (at least as far as I'm aware of).

This commit removes event channel notification through Xen PCI
platform device.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30 11:44:34 +01:00
Andy Lutomirski 1ef55be16e x86/asm: Get rid of __read_cr4_safe()
We use __read_cr4() vs __read_cr4_safe() inconsistently.  On
CR4-less CPUs, all CR4 bits are effectively clear, so we can make
the code simpler and more robust by making __read_cr4() always fix
up faults on 32-bit kernels.

This may fix some bugs on old 486-like CPUs, but I don't have any
easy way to test that.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: david@saggiorato.net
Link: http://lkml.kernel.org/r/ea647033d357d9ce2ad2bbde5a631045f5052fb6.1475178370.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-30 12:40:12 +02:00
Boris Ostrovsky 4d737042d6 xen/x86: Convert to hotplug state machine
Switch to new CPU hotplug infrastructure.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-30 11:22:59 +01:00