WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Xiao Guangrong	ca7d58f375	KVM: x86: fix broken read emulation spans a page boundary If the range spans a page boundary, the mmio access can be broke, fix it as write emulation. And we already get the guest physical address, so use it to read guest data directly to avoid walking guest page table again Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-09-25 19:17:17 +03:00
Avi Kivity	9be3be1f15	KVM: x86 emulator: fix Src2CL decode Src2CL decode (used for double width shifts) erronously decodes only bit 3 of %rcx, instead of bits 7:0. Fix by decoding %cl in its entirety. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-09-25 19:14:58 +03:00
Zhao Jin	41bc3186b3	KVM: MMU: fix incorrect return of spte __update_clear_spte_slow should return original spte while the current code returns low half of original spte combined with high half of new spte. Signed-off-by: Zhao Jin <cronozhj@gmail.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-09-25 19:13:25 +03:00
Konrad Rzeszutek Wilk	0f4b49eaf2	xen/p2m: Use SetPagePrivate and its friends for M2P overrides. We use the page->private field and hence should use the proper macros and set proper bits. Also WARN_ON in case somebody tries to overwrite our data. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-23 22:22:33 -04:00
Konrad Rzeszutek Wilk	a867db10e8	xen/p2m: Make debug/xen/mmu/p2m visible again. We dropped a lot of the MMU debugfs in favour of using tracing API - but there is one which just provides mostly static information that was made invisible by this change. Bring it back. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-23 22:22:32 -04:00
Jan Beulich	55e901fc1f	xen/pci: support multi-segment systems Now that the hypercall interface changes are in -unstable, make the kernel side code not ignore the segment (aka domain) number anymore (which results in pretty odd behavior on such systems). Rather, if only the old interfaces are available, don't call them for devices on non-zero segments at all. Signed-off-by: Jan Beulich <jbeulich@suse.com> [v1: Edited git description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-22 16:23:46 -04:00
H Hartley Sweeten	4a4cc2b6bf	crypto: aes-x86 - quiet sparse noise about symbol not declared Include <asm/aes.h> to pick up the declarations for crypto_aes_encrypt_x86 and crypto_aes_decrypt_x86 to quiet the sparse noise: warning: symbol 'crypto_aes_encrypt_x86' was not declared. Should it be static? warning: symbol 'crypto_aes_decrypt_x86' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Acked-by: Mandeep Singh Baines <msb@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-09-22 21:33:01 +10:00
Jussi Kivilinna	64b94ceae8	crypto: blowfish - add x86_64 assembly implementation Patch adds x86_64 assembly implementation of blowfish. Two set of assembler functions are provided. First set is regular 'one-block at time' encrypt/decrypt functions. Second is 'four-block at time' functions that gain performance increase on out-of-order CPUs. Performance of 4-way functions should be equal to 1-way functions with in-order CPUs. Summary of the tcrypt benchmarks: Blowfish assembler vs blowfish C (256bit 8kb block ECB) encrypt: 2.2x speed decrypt: 2.3x speed Blowfish assembler vs blowfish C (256bit 8kb block CBC) encrypt: 1.12x speed decrypt: 2.5x speed Blowfish assembler vs blowfish C (256bit 8kb block CTR) encrypt: 2.5x speed Full output: http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt Tests were run on: vendor_id : AuthenticAMD cpu family : 16 model : 10 model name : AMD Phenom(tm) II X6 1055T Processor stepping : 0 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-09-22 21:25:26 +10:00
Matt Fleming	47997d756a	x86/rtc: Don't recursively acquire rtc_lock A deadlock was introduced on x86 in commit `ef68c8f87e` ("x86: Serialize EFI time accesses on rtc_lock") because efi_get_time() and friends can be called with rtc_lock already held by read_persistent_time(), e.g.: timekeeping_init() read_persistent_clock() <-- acquire rtc_lock efi_get_time() phys_efi_get_time() <-- acquire rtc_lock <DEADLOCK> To fix this let's push the locking down into the get_wallclock() and set_wallclock() implementations. Only the clock implementations that access the x86 RTC directly need to acquire rtc_lock, so it makes sense to push the locking down into the rtc, vrtc and efi code. The virtualization implementations don't require rtc_lock to be held because they provide their own serialization. Signed-off-by: Matt Fleming <matt.fleming@intel.com> Acked-by: Jan Beulich <jbeulich@novell.com> Acked-by: Avi Kivity <avi@redhat.com> [for the virtualization aspect] Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Josh Boyer <jwboyer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 16:16:09 +02:00
Jack Steiner	6a469e4665	x86: uv2: Workaround for UV2 Hub bug (system global address format) This is a workaround for a UV2 hub bug that affects the format of system global addresses. The GRU API for UV2 was inadvertently broken by a hardware change. The format of the physical address used for TLB dropins and for addresses used with instructions running in unmapped mode has changed. This change was not documented and became apparent only when diags failed running on system simulators. For UV1, TLB and GRU instruction physical addresses are identical to socket physical addresses (although high NASID bits must be OR'ed into the address). For UV2, socket physical addresses need to be converted. The NODE portion of the physical address needs to be shifted so that the low bit is in bit 39 or bit 40, depending on an MMR value. It is not yet clear if this bug will be fixed in a silicon respin. If it is fixed, the hub revision will be incremented & the workaround disabled. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-21 11:23:15 +02:00
Ed Wildgoose	d4f3e35017	x86: geode: New PCEngines Alix system driver This new driver replaces the old PCEngines Alix 2/3 LED driver with a new driver that controls the LEDs through the leds-gpio driver. The old driver accessed GPIOs directly, which created a conflict and prevented also loading the cs5535-gpio driver to read other GPIOs on the Alix board. With this new driver, we hook into leds-gpio which in turn uses GPIO to control the LEDs and therefore it's possible to control both the LEDs and access onboard GPIOs Driver is moved to platform/geode as requested by Grant and any other geode initialisation modules should move here also This driver is inspired by leds-net5501.c by Alessandro Zummo. Ideally, leds-net5501.c should also be moved to platform/geode. Additionally the driver relies on parts of the patch: `7f131cf3ed` ("leds: leds-alix2c - take port address from MSR) by Daniel Mack to perform detection of the Alix board. [akpm@linux-foundation.org: include module.h] Signed-off-by: Ed Wildgoose <kernel@wildgooses.com> Cc: git@wildgooses.com Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Daniel Mack <daniel@caiaq.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Reviewed-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-21 11:19:42 +02:00
Suresh Siddha	c020570138	x86, ioapic: Consolidate the explicit EOI code Consolidate the io-apic EOI code in clear_IO_APIC_pin() and eoi_ioapic_irq(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Rafael Wysocki <rjw@novell.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: lchiquitto@novell.com Cc: jbeulich@novell.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110825190657.259696697@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:26:28 +02:00
Suresh Siddha	e57253a81d	x86, ioapic: Restore the mask bit correctly in eoi_ioapic_irq() For older IO-APIC's, we were clearing the remote-IRR by changing the RTE trigger mode to edge and then back to level. We wanted to mask the RTE during this process, so we were essentially doing mask+edge and then to unmask+level. As part of the commit `ca64c47cec`, we moved this EOI process earlier where the IO-APIC RTE is masked. So we were wrongly unmasking it in the eoi_ioapic_irq(). So change the remote-IRR clear sequence in eoi_ioapic_irq() to mask + edge and then restore the previous RTE entry which will restore the mask status as well as the level trigger. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Thomas Renninger <trenn@suse.de> Cc: Rafael Wysocki <rjw@novell.com> Cc: lchiquitto@novell.com Cc: jbeulich@novell.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110825190657.210286410@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:26:26 +02:00
Suresh Siddha	1e75b31d63	x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC In the kdump scenario mentioned below, we can have a case where the device using level triggered interrupt will not generate any interrupts in the kdump kernel. 1. IO-APIC sends a level triggered interrupt to the CPU's local APIC. 2. Kernel crashed before the CPU services this interrupt, leaving the remote-IRR in the IO-APIC set. 3. kdump kernel boot sequence does clear_IO_APIC() as part of IO-APIC initialization. But this fails to reset remote-IRR bit of the IO-APIC RTE as the remote-IRR bit is read-only. 4. Device using that level triggered entry can't generate any more interrupts because of the remote-IRR bit. In clear_IO_APIC_pin(), check if the remote-IRR bit is set and if so do an explicit attempt to clear it (by doing EOI write on modern io-apic's and changing trigger mode to edge/level on older io-apic's). Also before doing the explicit EOI to the io-apic, ensure that the trigger mode is indeed set to level. This will enable the explicit EOI to the io-apic to reset the remote-IRR bit. Tested-by: Leonardo Chiquitto <lchiquitto@novell.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Fixes: https://bugzilla.novell.com/show_bug.cgi?id=701686 Cc: Rafael Wysocki <rjw@novell.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Thomas Renninger <trenn@suse.de> Cc: jbeulich@novell.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110825190657.157502602@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:26:25 +02:00
Suresh Siddha	d3f138106b	iommu: Rename the DMAR and INTR_REMAP config options Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent with the other IOMMU options. Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the irq subsystem name. And define the CONFIG_DMAR_TABLE for the common ACPI DMAR routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: youquan.song@intel.com Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:22:03 +02:00
Suresh Siddha	c39d77ffa2	x86, ioapic: Define irq_remap_modify_chip_defaults() Define irq_remap_modify_chip_defaults() and remove the duplicate code, cleanup the unnecessary ifdefs. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: youquan.song@intel.com Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.499225692@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:22:01 +02:00
Suresh Siddha	13ea20f7a2	x86, msi, intr-remap: Use the ioapic set affinity routine IRQ set affinity routine is same for the IO-APIC IRQ's aswell as the MSI IRQ's in the presence of interrupt-remapping. This is because we modify the interrupt-remapping table entry and doesn't touch the IO-APIC RTE or the MSI entry. So remove the ir_msi_set_affinity() and re-use the ir_ioapic_set_affinity() Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: youquan.song@intel.com Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.452760446@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:21:59 +02:00
Suresh Siddha	41750d31fc	x86, x2apic: Enable the bios request for x2apic optout On the platforms which are x2apic and interrupt-remapping capable, Linux kernel is enabling x2apic even if the BIOS doesn't. This is to take advantage of the features that x2apic brings in. Some of the OEM platforms are running into issues because of this, as their bios is not x2apic aware. For example, this was resulting in interrupt migration issues on one of the platforms. Also if the BIOS SMI handling uses APIC interface to send SMI's, then the BIOS need to be aware of x2apic mode that OS has enabled. On some of these platforms, BIOS doesn't have a HW mechanism to turnoff the x2apic feature to prevent OS from enabling it. To resolve this mess, recent changes to the VT-d2 specification: http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf includes a mechanism that provides BIOS a way to request system software to opt out of enabling x2apic mode. Look at the x2apic optout flag in the DMAR tables before enabling the x2apic mode in the platform. Also print a warning that we have disabled x2apic based on the BIOS request. Kernel boot parameter "intremap=no_x2apic_optout" can be used to override the BIOS x2apic optout request. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.171766616@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:21:50 +02:00
Linus Torvalds	abbe0d3c26	Merge branch 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen * 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen: xen/i386: follow-up to "replace order-based range checking of M2P table by linear one" xen/irq: Alter the locking to use a mutex instead of a spinlock. xen/e820: if there is no dom0_mem=, don't tweak extra_pages. xen: disable PV spinlocks on HVM	2011-09-16 11:28:11 -07:00
Linus Torvalds	a7f934d4f1	asm alternatives: remove incorrect alignment notes On x86-64, they were just wasteful: with the explicitly added (now unnecessary) padding, the size of the alternatives structure was 16 bytes, and an alignment of 8 bytes didn't hurt much. However, it was still silly, since the natural size and alignment for the structure is actually just 12 bytes, 4-byte aligned since commit `59e97e4d6f` ("x86: Make alternative instruction pointers relative"). So removing the padding, and removing the extra alignment is just a good idea. On x86-32, the alignment of 4 bytes was correct, but was incorrectly hardcoded as 8 bytes in <asm/alternative-asm.h>. That header file had used to be an x86-64 only header file, but various unification efforts have made it be used for x86-32 too (ie the unification of rwlock and rwsem). That in turn caused x86-32 boot failures, because the extra alignment would result in random zero-filled words in the altinstructions section, causing oopses early at boot when doing alternative instruction replacement. So just remove all the alignment noise entirely. It's wrong, and it's unnecessary. The section itself is already properly aligned by the linker scripts, and all additions to the section had better be of the proper 12-byte format, keeping it aligned. So if the align directive were to ever make a difference, that would be an indication of a serious bug to begin with. Reported-by: Werner Landgraf <w.landgraf@ru.r> Acked-by: Andrew Lutomirski <luto@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-15 13:28:33 -07:00
Jiri Kosina	e060c38434	Merge branch 'master' into for-next Fast-forward merge with Linus to be able to merge patches based on more recent version of the tree.	2011-09-15 15:08:18 +02:00
Jesper Juhl	469bfb4a50	Remove unneeded version.h include from arch/x86/ It was pointed out by 'make versioncheck' that the include of linux/version.h is not needed in arch/x86/mm/mmio-mod.c . This patch removes it. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-15 14:57:06 +02:00
Kamalesh Babulal	ea70ef3d9d	sched: x86_32 Fix typo in switch_to() description This patch fixes the typo in parameters passed to x86_32 switch_to() description. Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-15 14:13:48 +02:00
Jan Beulich	61cca2fab7	xen/i386: follow-up to "replace order-based range checking of M2P table by linear one" The numbers obtained from the hypervisor really can't ever lead to an overflow here, only the original calculation going through the order of the range could have. This avoids the (as Jeremy points outs) somewhat ugly NULL-based calculation here. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-15 04:39:46 -04:00
Hidetoshi Seto	9aaef96f61	x86, mce: Do not call del_timer_sync() in IRQ context del_timer_sync() can cause a deadlock when called in interrupt context. It is used with on_each_cpu() in some parts for sysfs files like bank*, check_interval, cmci_disabled and ignore_ce. However, use of on_each_cpu() results in calling the function passed as the argument in interrupt context. This causes a flood of nested warnings from del_timer_sync() (it runs on each CPU) caused even by a simple file access like: $ echo 300 > /sys/devices/system/machinecheck/machinecheck0/check_interval Fortunately, these MCE-specific files are rarely used and AFAIK only few MCE geeks experience this warning. To remove the warning, move timer deletion outside of the interrupt context. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-09-14 15:50:15 +02:00
David Vrabel	e3b73c4a25	xen/e820: if there is no dom0_mem=, don't tweak extra_pages. The patch "xen: use maximum reservation to limit amount of usable RAM" (`d312ae878b`) breaks machines that do not use 'dom0_mem=' argument with: reserve RAM buffer: 000000133f2e2000 - 000000133fffffff (XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e (XEN) domain_crash_sync called from entry.S (XEN) Domain 0 (vcpu#0) crashed on cpu#0: ... The reason being that the last E820 entry is created using the 'extra_pages' (which is based on how many pages have been freed). The mentioned git commit sets the initial value of 'extra_pages' using a hypercall which returns the number of pages (if dom0_mem has been used) or -1 otherwise. If the later we return with MAX_DOMAIN_PAGES as basis for calculation: return min(max_pages, MAX_DOMAIN_PAGES); and use it: extra_limit = xen_get_max_pages(); if (extra_limit >= max_pfn) extra_pages = extra_limit - max_pfn; else extra_pages = 0; which means we end up with extra_pages = 128GB in PFNs (33554432) - 8GB in PFNs (2097152, on this specific box, can be larger or smaller), and then we add that value to the E820 making it: Xen: 00000000ff000000 - 0000000100000000 (reserved) Xen: 0000000100000000 - 000000133f2e2000 (usable) which is clearly wrong. It should look as so: Xen: 00000000ff000000 - 0000000100000000 (reserved) Xen: 0000000100000000 - 000000027fbda000 (usable) Naturally this problem does not present itself if dom0_mem=max:X is used. CC: stable@kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-13 10:17:32 -04:00
Thomas Gleixner	59d958d2c7	locking, x86: mce: Annotate cmci_discover_lock as raw The cmci_discover_lock can be taken in atomic context (cpu bring up sequence) and therefore cannot be preempted on -rt. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:12:09 +02:00
Thomas Gleixner	2d21a29fb6	locking, oprofile: Annotate oprofilefs lock as raw The oprofilefs_lock can be taken in atomic context (in profiling interrupts) and therefore cannot cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:12:05 +02:00
Linus Torvalds	d9543314ee	Merge branch 'upstream/bugfix' of git://github.com/jsgf/linux-xen * 'upstream/bugfix' of git://github.com/jsgf/linux-xen: xen: use non-tracing preempt in xen_clocksource_read()	2011-09-12 17:22:31 -07:00
Frank Arnold	77e75fc764	x86: cache_info: Update calculation of AMD L3 cache indices L3 subcaches 0 and 1 of AMD Family 15h CPUs can have a size of 2MB. Update the calculation routine for the number of L3 indices to reflect that. Signed-off-by: Frank Arnold <frank.arnold@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Rosenfeld Hans <Hans.Rosenfeld@amd.com> Cc: Herrmann3 Andreas <Andreas.Herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Cc: Frank Arnold <Frank.Arnold@amd.com> Link: http://lkml.kernel.org/r/20110726170449.GB32536@aftab Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:29:03 +02:00
Thomas Gleixner	d2946041ff	x86: cache_info: Kill the atomic allocation in amd_init_l3_cache() It's not a good reason to allocate memory in the smp function call just because someone thought it's the most conveniant place. The AMD L3 data is coupled to the northbridge info by a pointer to the corresponding north bridge data. So allocating it with the northbridge data and referencing the northbridge in the cache_info code instead uses less memory and gets rid of that atomic allocation hack in the smp function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20110723212626.688229918@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:28:37 +02:00
Thomas Gleixner	b7d11a768b	x86: cache_info: Kill the moronic shadow struct Commit `f9b90566c` ("x86: reduce stack usage in init_intel_cacheinfo") introduced a shadow structure to reduce the stack usage on large machines instead of making the smaller structure embedded into the large one. That's definitely a candidate for the bad taste award. Move the small struct into the large one and get rid of the ugly type casts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20110723212626.625651773@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:28:33 +02:00
Thomas Gleixner	05b217b021	x86: cache_info: Remove bogus free of amd_l3_cache data free_cache_attributes() kfree's: per_cpu(ici_cpuid4_info, cpu)->l3 which is a pointer to memory which was allocated as a block in amd_init_l3_cache(). l3 of a particular cpu points to a part of this memory blob. The part and the rest of the blob are still referenced by other cpus. As far as I can tell from the git history this is a leftover from the conversion from per cpu to node data with commit ba06edb63(x86, cacheinfo: Make L3 cache info per node) and the following commit f658bcfb2(x86, cacheinfo: Cleanup L3 cache index disable support) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20110723212626.550539989@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:28:29 +02:00
Shyam Iyer	5307f6d5fb	Fix pointer dereference before call to pcie_bus_configure_settings Commit `b03e7495a8` ("PCI: Set PCI-E Max Payload Size on fabric") introduced a potential NULL pointer dereference in calls to pcie_bus_configure_settings due to attempts to access pci_bus self variables when the self pointer is NULL. To correct this, verify that the self pointer in pci_bus is non-NULL before dereferencing it. Reported-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: Jon Mason <mason@myri.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-09 19:49:58 -07:00
Stefano Stabellini	f10cd522c5	xen: disable PV spinlocks on HVM PV spinlocks cannot possibly work with the current code because they are enabled after pvops patching has already been done, and because PV spinlocks use a different data structure than native spinlocks so we cannot switch between them dynamically. A spinlock that has been taken once by the native code (__ticket_spin_lock) cannot be taken by __xen_spin_lock even after it has been released. Reported-and-Tested-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-08 13:59:06 -04:00
Martin Schwidefsky	d1748302f7	clockevents: Make minimum delay adjustments configurable The automatic increase of the min_delta_ns of a clockevents device should be done in the clockevents code as the minimum delay is an attribute of the clockevents device. In addition not all architectures want the automatic adjustment, on a massively virtualized system it can happen that the programming of a clock event fails several times in a row because the virtual cpu has been rescheduled quickly enough. In that case the minimum delay will erroneously be increased with no way back. The new config symbol GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic adjustment. The config option is selected only for x86. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:56 +02:00
K. Y. Srinivasan	6f4151c89b	x86: Hyper-V: Integrate the clocksource with Hyper-V detection code The Hyper-V clocksource driver is best integrated with Hyper-V detection code since: (a) Linux guests running on Hyper-V require it (b) Integration into that code significanly reduces code size Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: gregkh@suse.de Cc: devel@linuxdriverproject.org Cc: virtualization@lists.osdl.org Link: http://lkml.kernel.org/r/1315434310-4827-1-git-send-email-kys@microsoft.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 10:33:59 +02:00
Linus Torvalds	b0fb422281	Merge branch 'perf-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip * 'perf-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: x86, perf: Check that current->mm is alive before getting user callchain perf_event: Fix broken calc_timer_values() perf events: Fix slow and broken cgroup context switch code	2011-09-07 13:00:11 -07:00
Linus Torvalds	1154526753	Merge branch 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen * 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen: xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead. xen: x86_32: do not enable iterrupts when returning from exception in interrupt context xen: use maximum reservation to limit amount of usable RAM	2011-09-07 07:46:48 -07:00
Linus Torvalds	768b56f598	Merge branch 'kvm-updates/3.1' of git://github.com/avikivity/kvm * 'kvm-updates/3.1' of git://github.com/avikivity/kvm: KVM: Fix instruction size issue in pvclock scaling	2011-09-07 07:45:43 -07:00
Konrad Rzeszutek Wilk	ed467e69f1	xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead. We have hit a couple of customer bugs where they would like to use those parameters to run an UP kernel - but both of those options turn of important sources of interrupt information so we end up not being able to boot. The correct way is to pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and the kernel will patch itself to be a UP kernel. Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308 CC: stable@kernel.org Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-01 12:54:49 -04:00
Igor Mammedov	d198d49914	xen: x86_32: do not enable iterrupts when returning from exception in interrupt context If vmalloc page_fault happens inside of interrupt handler with interrupts disabled then on exit path from exception handler when there is no pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112): cmpw $0x0001, XEN_vcpu_info_pending(%eax) sete XEN_vcpu_info_mask(%eax) will enable interrupts even if they has been previously disabled according to eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99) testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp) setz XEN_vcpu_info_mask(%eax) Solution is in setting XEN_vcpu_info_mask only when it should be set according to cmpw $0x0001, XEN_vcpu_info_pending(%eax) but not clearing it if there isn't any pending events. Reproducer for bug is attached to RHBZ 707552 CC: stable@kernel.org Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-01 12:54:42 -04:00
David Vrabel	d312ae878b	xen: use maximum reservation to limit amount of usable RAM Use the domain's maximum reservation to limit the amount of extra RAM for the memory balloon. This reduces the size of the pages tables and the amount of reserved low memory (which defaults to about 1/32 of the total RAM). On a system with 8 GiB of RAM with the domain limited to 1 GiB the kernel reports: Before: Memory: 627792k/4472000k available After: Memory: 549740k/11132224k available A increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved low memory is also reduced from 253 MiB to 32 MiB. The total additional usable RAM is 329 MiB. For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit the number of pages for dom0') (c/s 23790) CC: stable@kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-01 09:41:40 -04:00
Andrey Vagin	20afc60f89	x86, perf: Check that current->mm is alive before getting user callchain An event may occur when an mm is already released. I added an event in dequeue_entity() and caught a panic with the following backtrace: [ 434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 [ 434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120 ... [ 434.421258] Call Trace: [ 434.421258] [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0 [ 434.421258] [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90 [ 434.421258] [<ffffffff8101b048>] perf_callchain_user+0x128/0x170 [ 434.421258] [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100 [ 434.421258] [<ffffffff81116690>] perf_prepare_sample+0x200/0x280 [ 434.421258] [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290 [ 434.421258] [<ffffffff81065240>] ? tg_shares_up+0x0/0x670 [ 434.421258] [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0 [ 434.421258] [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0 [ 434.421258] [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250 [ 434.421258] [<ffffffff81119204>] perf_tp_event+0x44/0x70 [ 434.421258] [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110 [ 434.421258] [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0 [ 434.421258] [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60 [ 434.421258] [<ffffffff8105818a>] dequeue_task+0x9a/0xb0 [ 434.421258] [<ffffffff810581e2>] deactivate_task+0x42/0xe0 [ 434.421258] [<ffffffff814bc019>] thread_return+0x191/0x808 [ 434.421258] [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60 [ 434.421258] [<ffffffff8106f4c4>] do_exit+0x464/0x910 [ 434.421258] [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0 [ 434.421258] [<ffffffff8106fa57>] sys_exit_group+0x17/0x20 [ 434.421258] [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-31 15:56:31 +02:00
Duncan Sands	3b217116ed	KVM: Fix instruction size issue in pvclock scaling Commit `de2d1a524e` ("KVM: Fix register corruption in pvclock_scale_delta") introduced a mul instruction that may have only a memory operand; the assembler therefore cannot select the correct size: pvclock.s:229: Error: no instruction mnemonic suffix given and no register operands; can't size instruction In this example the assembler is: #APP mul -48(%rbp) ; shrd $32, %rdx, %rax #NO_APP A simple solution is to use mulq. Signed-off-by: Duncan Sands <baldrick@free.fr> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-08-30 14:42:30 +03:00
Jeremy Fitzhardinge	61e2cd0acc	x86, cmpxchg: Use __compiletime_error() to make usage messages a bit nicer Use __compiletime_error() to produce a compile-time error rather than link-time, where available. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 17:20:40 -07:00
Jeremy Fitzhardinge	229855d6f3	x86, ticketlock: Make __ticket_spin_trylock common Make trylock code common regardless of ticket size. (Also, rename arch_spinlock.slock to head_tail.) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:46:34 -07:00
Jeremy Fitzhardinge	2994488fe5	x86, ticketlock: Convert __ticket_spin_lock to use xadd() Convert the two variants of __ticket_spin_lock() to use xadd(), which has the effect of making them identical, so remove the duplicate function. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:46:07 -07:00
Jeremy Fitzhardinge	c576a3ea90	x86, ticketlock: Convert spin loop to C The inner loop of __ticket_spin_lock isn't doing anything very special, so reimplement it in C. For the 8 bit ticket lock variant, we use a register union to get direct access to the lower and upper bytes in the tickets, but unfortunately gcc won't generate a direct comparison between the two halves of the register, so the generated asm isn't quite as pretty as the hand-coded version. However benchmarking shows that this is actually a small improvement in runtime performance on some benchmarks, and never a slowdown. We also need to make sure there's a barrier at the end of the lock loop to make sure that the compiler doesn't move any instructions from within the locked region into the region where we don't yet own the lock. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:45:43 -07:00
Jeremy Fitzhardinge	84eb950db1	x86, ticketlock: Clean up types and accessors A few cleanups to the way spinlocks are defined and accessed: - define __ticket_t which is the size of a spinlock ticket (ie, enough bits to hold all the cpus) - Define struct arch_spinlock as a union containing plain slock and the head and tail tickets - Use head and tail to implement some of the spinlock predicates. - Make all ticket variables unsigned. - Use TICKET_SHIFT to form constants Most of this will be used in later patches. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:45:19 -07:00
Jeremy Fitzhardinge	8b8bc2f731	x86: Use xadd helper more widely This covers the trivial cases from open-coded xadd to the xadd macros. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:44:12 -07:00
Jeremy Fitzhardinge	433b352061	x86: Add xadd helper macro Add a common xadd implementation. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:42:20 -07:00
Jeremy Fitzhardinge	e9826380d8	x86, cmpxchg: Unify cmpxchg into cmpxchg.h Everything that's actually common between 32 and 64-bit is moved into cmpxchg.h. xchg/cmpxchg will fail with a link error if they're passed an unsupported size (which includes 64-bit args on 32-bit systems). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:42:10 -07:00
Jeremy Fitzhardinge	00a41546e8	x86, cmpxchg: Move 64-bit set64_bit() to match 32-bit Reduce arbitrary differences between 32 and 64 bits. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:41:40 -07:00
Jeremy Fitzhardinge	416185bd5a	x86, cmpxchg: Move 32-bit __cmpxchg_wrong_size to match 64 bit. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:41:17 -07:00
Jeremy Fitzhardinge	4009338d62	x86, cmpxchg: <linux/alternative.h> has LOCK_PREFIX Not <linux/bitops.h>. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-29 13:40:25 -07:00
Greg Kroah-Hartman	6eafa4604c	Merge 3.1-rc4 into staging-next This resolves a conflict with: drivers/staging/brcm80211/brcmsmac/types.h Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-08-29 08:47:46 -07:00
NeilBrown	f5b9409973	All Arch: remove linkage for sys_nfsservctl system call The nfsservctl system call is now gone, so we should remove all linkage for it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-26 15:09:58 -07:00
Feng Tang	efe3ed9837	x86/mrst: Add platform data for Max3110 devices Those info will be used when spi controller driver setup max3110 as a slave device Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-08-26 11:01:14 -07:00
Kirill A. Shutemov	a94cc4e6c0	sfi: table irq 0xFF means 'no interrupt' According to the SFI specification irq number 0xFF means device has no interrupt or interrupt attached via GPIO. Currently, we don't handle this special case and set irq field in *_board_info structs to 255. It leads to confusion in some drivers. Accelerometer driver tries to register interrupt 255, fails and prints "Cannot get IRQ" to dmesg. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-26 09:03:29 -07:00
K. Y. Srinivasan	5289d3d160	Staging: hv: vmbus: Retry vmbus_post_msg() before giving up The function hv_post_msg() can fail because of transient resource conditions. It may be useful to retry the operation. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-08-25 15:23:19 -07:00
Andy Lutomirski	b4ca46e4e8	x86-32: Fix boot with CONFIG_X86_INVD_BUG entry_32.S contained a hardcoded alternative instruction entry, and the format changed in commit `59e97e4d6f` ("x86: Make alternative instruction pointers relative"). Replace the hardcoded entry with the altinstruction_entry macro. This fixes the 32-bit boot with CONFIG_X86_INVD_BUG=y. Reported-and-tested-by: Arnaud Lacombe <lacombar@gmail.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-25 13:27:14 -07:00
Tejun Heo	cbbfa38fcb	mtrr: fix UP breakage caused during switch to stop_machine While removing custom rendezvous code and switching to stop_machine, commit `192d885742` ("x86, mtrr: use stop_machine APIs for doing MTRR rendezvous") completely dropped mtrr setting code on !CONFIG_SMP breaking MTRR settting on UP. Fix it by removing the incorrect CONFIG_SMP. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Anders Eriksson <aeriksson@fastmail.fm> Tested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-25 11:02:29 -07:00
Andy Lutomirski	7e49b1c8c6	x86-64, unistd: Remove bogus __IGNORE_getcpu The change: commit `fce8dc0642` Author: Andy Lutomirski <luto@mit.edu> Date: Wed Aug 10 11:15:31 2011 -0400 x86-64: Wire up getcpu syscall added getcpu as a real syscall, so we shouldn't ignore it any more. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/b4cb60ef45db3a675a0e2b9d51bcb022b0a9ab9c.1314195481.git.luto@mit.edu Reported-by: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-24 10:23:51 -07:00
Jeremy Fitzhardinge	f1c39625d6	xen: use non-tracing preempt in xen_clocksource_read() The tracing code used sched_clock() to get tracing timestamps, which ends up calling xen_clocksource_read(). xen_clocksource_read() must disable preemption, but if preemption tracing is enabled, this results in infinite recursion. I've only noticed this when boot-time tracing tests are enabled, but it seems like a generic bug. It looks like it would also affect kvm_clocksource_read(). Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com>	2011-08-24 09:54:24 -07:00
Linus Torvalds	14c62e78dc	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-32, vdso: On system call restart after SYSENTER, use int $0x80 x86, UV: Remove UV delay in starting slave cpus x86, olpc: Wait for last byte of EC command to be accepted	2011-08-23 18:09:08 -07:00
H. Peter Anvin	7ca0758cdb	x86-32, vdso: On system call restart after SYSENTER, use int $0x80 When we enter a 32-bit system call via SYSENTER or SYSCALL, we shuffle the arguments to match the int $0x80 calling convention. This was probably a design mistake, but it's what it is now. This causes errors if the system call as to be restarted. For SYSENTER, we have to invoke the instruction from the vdso as the return address is hardcoded. Accordingly, we can simply replace the jump in the vdso with an int $0x80 instruction and use the slower entry point for a post-restart. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/CA%2B55aFztZ=r5wa0x26KJQxvZOaQq8s2v3u50wCyJcA-Sc4g8gQ@mail.gmail.com Cc: <stable@kernel.org>	2011-08-23 16:20:10 -07:00
Zhao Jin	c812d8f7ad	x86, mm, trivial: Remove unnecessary get_order() in free_thread_info() Because THREAD_SIZE is defined as PAGE_SIZE << THREAD_ORDER on x86, the call of get_order(THREAD_SIZE) can be replaced with THREAD_ORDER. Signed-off-by: Zhao Jin <cronozhj@gmail.com> Link: http://lkml.kernel.org/r/4E4FB5A9.700@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-23 21:49:35 +02:00
Jesper Juhl	7a4e8beabc	x86, cleanup: Remove unneeded version.h include from arch/x86/ It was pointed out by 'make versioncheck' that the include of linux/version.h is not needed in arch/x86/mm/mmio-mod.c . This patch removes it. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1108012305570.31999@swampdragon.chaosbits.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-22 15:35:43 -07:00
Arun Thomas	be604e695f	x86, cpu: Add cpufeature flag for PCIDs This patch add a flag for Process-Context Identifiers (PCIDs) aka Address Space Identifiers (ASIDs) aka Tagged TLB support. Signed-off-by: Arun Thomas <arun.thomas@gmail.com> Link: http://lkml.kernel.org/r/1313782943-3898-1-git-send-email-arun.thomas@gmail.com Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-22 13:52:32 -07:00
Linus Torvalds	4762e252f4	Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/tracing: Fix tracing config option properly xen: Do not enable PV IPIs when vector callback not present xen/x86: replace order-based range checking of M2P table by linear one xen: xen-selfballoon.c needs more header files	2011-08-22 11:25:44 -07:00
Jeremy Fitzhardinge	60c5f08e15	xen/tracing: Fix tracing config option properly Steven Rostedt says we should use CONFIG_EVENT_TRACING. Cc:Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-22 11:28:33 -04:00
Stefano Stabellini	3c05c4bed4	xen: Do not enable PV IPIs when vector callback not present Fix regression for HVM case on older (<4.1.1) hypervisors caused by commit `99bbb3a84a` Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Date: Thu Dec 2 17:55:10 2010 +0000 xen: PV on HVM: support PV spinlocks and IPIs This change replaced the SMP operations with event based handlers without taking into account that this only works when the hypervisor supports callback vectors. This causes unexplainable hangs early on boot for HVM guests with more than one CPU. BugLink: http://bugs.launchpad.net/bugs/791850 CC: stable@kernel.org Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-and-Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-22 11:28:09 -04:00
Kevin Winchester	2f3aa7a06f	x86: jump_label: arch_jump_label_text_poke_early: add missing __init arch_jump_label_text_poke_early calls text_poke_early, which is an __init function. Thus arch_jump_label_text_poke_early should be the same. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: jbaron@redhat.com Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1313539478-30303-1-git-send-email-kjwinchester@gmail.com [ Use __init_or_module instead of __init ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:04 -04:00
Linus Torvalds	0c3bef6128	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: OF: Don't crash when bridge parent is NULL. PCI: export pcie_bus_configure_settings symbol PCI: code and comments cleanup PCI: make cardbus-bridge resources optional PCI: make SRIOV resources optional PCI : ability to relocate assigned pci-resources PCI: honor child buses add_size in hot plug configuration PCI: Set PCI-E Max Payload Size on fabric	2011-08-19 10:02:37 -07:00
Ingo Molnar	8bc84f8731	Merge branch 'perf/urgent' into perf/core Merge reason: add the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-18 21:56:47 +02:00
Jan Beulich	ccbcdf7cf1	xen/x86: replace order-based range checking of M2P table by linear one The order-based approach is not only less efficient (requiring a shift and a compare, typical generated code looking like this mov eax, [machine_to_phys_order] mov ecx, eax shr ebx, cl test ebx, ebx jnz ... whereas a direct check requires just a compare, like in cmp ebx, [machine_to_phys_nr] jae ... ), but also slightly dangerous in the 32-on-64 case - the element address calculation can wrap if the next power of two boundary is sufficiently far away from the actual upper limit of the table, and hence can result in user space addresses being accessed (with it being unknown what may actually be mapped there). Additionally, the elimination of the mistaken use of fls() here (should have been __fls()) fixes a latent issue on x86-64 that would trigger if the code was run on a system with memory extending beyond the 44-bit boundary. CC: stable@kernel.org Signed-off-by: Jan Beulich <jbeulich@novell.com> [v1: Based on Jeremy's feedback] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-17 10:26:48 -04:00
Robert Richter	298557db42	oprofile, x86: Fix overflow and warning (commit `1d12d35`) Following fixes for: `1d12d35` oprofile, x86: Convert memory allocation to static array Fix potential buffer overflow. Fix the following warning: arch/x86/oprofile/op_model_ppro.c: In function ‘ppro_check_ctrs’: arch/x86/oprofile/op_model_ppro.c:143: warning: label ‘out’ defined but not used Cc: Maarten Lankhorst <m.b.lankhorst@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Robert Richter <robert.richter@amd.com>	2011-08-16 23:51:00 +02:00
Linus Torvalds	ab7e2dbf9b	Merge branch 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: uses TASKSTATS, depends on NET KVM: fix TASK_DELAY_ACCT kconfig warning	2011-08-16 11:14:44 -07:00
Randy Dunlap	df3d8ae1f8	KVM: uses TASKSTATS, depends on NET CONFIG_TASKSTATS just had a change to use netlink, including a change to "depends on NET". Since "select" does not follow dependencies, KVM also needs to depend on NET to prevent build errors when CONFIG_NET is not enabled. Sample of the reported "undefined reference" build errors: taskstats.c:(.text+0x8f686): undefined reference to `nla_put' taskstats.c:(.text+0x8f721): undefined reference to `nla_reserve' taskstats.c:(.text+0x8f8fb): undefined reference to `init_net' taskstats.c:(.text+0x8f905): undefined reference to `netlink_unicast' taskstats.c:(.text+0x8f934): undefined reference to `kfree_skb' taskstats.c:(.text+0x8f9e9): undefined reference to `skb_clone' taskstats.c:(.text+0x90060): undefined reference to `__alloc_skb' taskstats.c:(.text+0x901e9): undefined reference to `skb_put' taskstats.c:(.init.text+0x4665): undefined reference to `genl_register_family' taskstats.c:(.init.text+0x4699): undefined reference to `genl_register_ops' taskstats.c:(.init.text+0x4710): undefined reference to `genl_unregister_ops' taskstats.c:(.init.text+0x471c): undefined reference to `genl_unregister_family' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-08-16 19:00:41 +03:00
H. Peter Anvin	fab1167c46	x86, vsyscall: Add missing <asm/fixmap.h> to arch/x86/mm/fault.c arch/x86/mm/fault.c now depend on having the symbol VSYSCALL_START defined, which is best handled by including <asm/fixmap.h> (it isn't unreasonable we may want other fixed addresses in this file in the future, and so it is cleaner than including <asm/vsyscall.h> directly.) This addresses an x86-64 allnoconfig build failure. On other configurations it was masked by an indirect path: <asm/smp.h> -> <asm/apic.h> -> <asm/fixmap.h> -> <asm/vsyscall.h> ... however, the first such include is conditional on CONFIG_X86_LOCAL_APIC. Originally-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/CA%2B55aFxsOMc9=p02r8-QhJ=h=Mqwckk4_Pnx9LQt5%2BfqMp_exQ@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-16 08:04:02 -07:00
Randy Dunlap	cedf03bd9a	x86: fix mm/fault.c build arch/x86/mm/fault.c needs to include asm/vsyscall.h to fix a build error: arch/x86/mm/fault.c: In function '__bad_area_nosemaphore': arch/x86/mm/fault.c:728: error: 'VSYSCALL_START' undeclared (first use in this function) Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-15 19:10:50 -07:00
Peter Zijlstra	7fdba1ca10	perf, x86: Avoid kfree() in CPU_STARTING On -rt kfree() can schedule, but CPU_STARTING is before the CPU is fully up and running. These are contradictory, so avoid it. Instead push the kfree() to CPU_ONLINE where we're free to schedule. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-kwd4j6ayld5thrscvaxgjquv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-14 11:53:02 +02:00
Linus Torvalds	06e727d2a5	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip: x86-64: Rework vsyscall emulation and add vsyscall= parameter x86-64: Wire up getcpu syscall x86: Remove unnecessary compile flag tweaks for vsyscall code x86-64: Add vsyscall:emulate_vsyscall trace event x86-64: Add user_64bit_mode paravirt op x86-64, xen: Enable the vvar mapping x86-64: Work around gold bug 13023 x86-64: Move the "user" vsyscall segment out of the data segment. x86-64: Pad vDSO to a page boundary	2011-08-12 20:46:24 -07:00
Linus Torvalds	1d229d54db	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf symbols: Check '/tmp/perf-' symbol file ownership perf sched: Usage leftover from trace -> script rename perf sched: Do not delete session object prematurely perf tools: Check $HOME/.perfconfig ownership perf, x86: Add model 45 SandyBridge support perf tools: Add support to install perf python extension perf tools: do not look at ./config for configuration perf tools: Make clean leaves some files perf lock: Dropping unsupported ':r' modifier perf probe: Fix coredump introduced by probe module option jump label: Reduce the cycle count by changing the link order perf report: Use ui__warning in some more places perf python: Add PERF_RECORD_{LOST,READ,SAMPLE} routine tables perf evlist: Introduce 'disable' method trace events: Update version number reference to new 3.x scheme for EVENT_POWER_TRACING_DEPRECATED perf buildid-cache: Zero out buffer of filenames when adding/removing buildid	2011-08-11 09:03:48 -07:00
Andy Lutomirski	3ae36655b9	x86-64: Rework vsyscall emulation and add vsyscall= parameter There are three choices: vsyscall=native: Vsyscalls are native code that issues the corresponding syscalls. vsyscall=emulate (default): Vsyscalls are emulated by instruction fault traps, tested in the bad_area path. The actual contents of the vsyscall page is the same as the vsyscall=native case except that it's marked NX. This way programs that make assumptions about what the code in the page does will not be confused when they read that code. vsyscall=none: Trying to execute a vsyscall will segfault. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/8449fb3abf89851fd6b2260972666a6f82542284.1312988155.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-10 19:26:46 -05:00
Andy Lutomirski	fce8dc0642	x86-64: Wire up getcpu syscall getcpu is available as a vdso entry and an emulated vsyscall. Programs that for some reason don't want to use the vdso should still be able to call getcpu without relying on the slow emulated vsyscall. It costs almost nothing to expose it as a real syscall. We also need this for the following patch in vsyscall=native mode. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/6b19f55bdb06a0c32c2fa6dba9b6f222e1fde999.1312988155.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-10 19:26:46 -05:00
Andy Lutomirski	f3fb5b7bb7	x86: Remove unnecessary compile flag tweaks for vsyscall code As of commit `98d0ac38ca` Author: Andy Lutomirski <luto@mit.edu> Date: Thu Jul 14 06:47:22 2011 -0400 x86-64: Move vread_tsc and vread_hpet into the vDSO user code no longer directly calls into code in arch/x86/kernel/, so we don't need compile flag hacks to make it safe. All vdso code is in the vdso directory now. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/835cd05a4c7740544d09723d6ba48f4406f9826c.1312988155.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-10 18:55:29 -05:00
Mathias Krause	66be895158	crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 This is an assembler implementation of the SHA1 algorithm using the Supplemental SSE3 (SSSE3) instructions or, when available, the Advanced Vector Extensions (AVX). Testing with the tcrypt module shows the raw hash performance is up to 2.3 times faster than the C implementation, using 8k data blocks on a Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25% faster. Since this implementation uses SSE/YMM registers it cannot safely be used in every situation, e.g. while an IRQ interrupts a kernel thread. The implementation falls back to the generic SHA1 variant, if using the SSE/YMM registers is not possible. With this algorithm I was able to increase the throughput of a single IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using the SSSE3 variant -- a speedup of +34.8%. Saving and restoring SSE/YMM state might make the actual throughput fluctuate when there are FPU intensive userland applications running. For example, meassuring the performance using iperf2 directly on the machine under test gives wobbling numbers because iperf2 uses the FPU for each packet to check if the reporting interval has expired (in the above test I got min/max/avg: 402/484/464 MBit/s). Using this algorithm on a IPsec gateway gives much more reasonable and stable numbers, albeit not as high as in the directly connected case. Here is the result from an RFC 2544 test run with a EXFO Packet Blazer FTB-8510: frame size sha1-generic sha1-ssse3 delta 64 byte 37.5 MBit/s 37.5 MBit/s 0.0% 128 byte 56.3 MBit/s 62.5 MBit/s +11.0% 256 byte 87.5 MBit/s 100.0 MBit/s +14.3% 512 byte 131.3 MBit/s 150.0 MBit/s +14.2% 1024 byte 162.5 MBit/s 193.8 MBit/s +19.3% 1280 byte 175.0 MBit/s 212.5 MBit/s +21.4% 1420 byte 175.0 MBit/s 218.7 MBit/s +25.0% 1518 byte 150.0 MBit/s 181.2 MBit/s +20.8% The throughput for the largest frame size is lower than for the previous size because the IP packets need to be fragmented in this case to make there way through the IPsec tunnel. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-08-10 19:00:29 +08:00
Stephen Rothwell	5cdd174fea	x86, amd: Include elf.h explicitly, prepare the code for the module.h split When the moduleu.h splitting tree is merged to the latest tip:x86/cpu tree, the x86_64 allmodconfig build fails like this: arch/x86/kernel/cpu/amd.c: In function 'bsp_init_amd': arch/x86/kernel/cpu/amd.c:437:3: error: 'va_align' undeclared (first use in this function) arch/x86/kernel/cpu/amd.c:438:23: error: 'ALIGN_VA_32' undeclared (first use in this function) arch/x86/kernel/cpu/amd.c:438:37: error: 'ALIGN_VA_64' undeclared (first use in this function) This is caused by the module.h split up intreacting with commit `dfb09f9b7a` ("x86, amd: Avoid cache aliasing penalties on AMD family 15h") from the tip:x86/cpu tree. I have added the following patch for today (this, or something similar, could be applied to the tip tree directly - the export.h include below was added by the module.h splitup). So include elf.h to use va_align and remove this implicit dependency on module.h doing it for us. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/20110810114956.238d66772883636e3040d29f@canb.auug.org.au Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-10 10:14:09 +02:00
Konrad Rzeszutek Wilk	10fe570fc1	Revert "xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set." We don' use it anymore and there are more false positives. This reverts commit `fc25151d9a`. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-09 13:04:08 -04:00
Youquan Song	a34668f6be	perf, x86: Add model 45 SandyBridge support Add support to Romely-EP SandyBridge. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Anhua Xu <anhua.xu@intel.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1312264895-2010-1-git-send-email-youquan.song@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-09 11:58:06 +02:00
Linus Torvalds	45a05f9488	Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set xen: Fix misleading WARN message at xen_release_chunk xen: Fix printk() format in xen/setup.c xen/tracing: it looks like we wanted CONFIG_FTRACE xen/self-balloon: Add dependency on tmem. xen/balloon: Fix compile errors - missing header files. xen/grant: Fix compile warning. xen/pciback: remove duplicated #include	2011-08-06 12:22:30 -07:00
Borislav Petkov	9387f774d6	x86-32, amd: Move va_align definition to unbreak 32-bit build hpa reported that `dfb09f9b7a` breaks 32-bit builds with the following error message: /home/hpa/kernel/linux-tip.cpu/arch/x86/kernel/cpu/amd.c:437: undefined reference to `va_align' /home/hpa/kernel/linux-tip.cpu/arch/x86/kernel/cpu/amd.c:436: undefined reference to `va_align' This is due to the fact that va_align is a global in a 64-bit only compilation unit. Move it to mmap.c where it is visible to both subarches. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1312633899-1131-1-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-08-06 11:44:57 -07:00
Jack Steiner	05e33fc20e	x86, UV: Remove UV delay in starting slave cpus Delete the 10 msec delay between the INIT and SIPI when starting slave cpus. I can find no requirement for this delay. BIOS also has similar code sequences without the delay. Removing the delay reduces boot time by 40 sec. Every bit helps. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/20110805140900.GA6774@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-05 23:48:34 +02:00
Paul Fox	a3ea14df0e	x86, olpc: Wait for last byte of EC command to be accepted When executing EC commands, only waiting when there are still more bytes to write is usually fine. However, if the system suspends very quickly after a call to olpc_ec_cmd(), the last data byte may not yet be transferred to the EC, and the command will not complete. This solves a bug where the SCI wakeup mask was not correctly written when going into suspend. It means that sometimes, on XO-1.5 (but not XO-1), the devices that were marked as wakeup sources can't wake up the system. e.g. you ask for wifi wakeups, suspend, but then incoming wifi frames don't wake up the system as they should. Signed-off-by: Paul Fox <pgf@laptop.org> Signed-off-by: Daniel Drake <dsd@laptop.org> Acked-by: Andres Salomon <dilinger@queued.net> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-05 23:47:55 +02:00
Borislav Petkov	8fa8b03508	x86, amd: Move BSP code to cpu_dev helper Move code which is run once on the BSP during boot into the cpu_dev helper. [ hpa: removed bogus cpu_has -> static_cpu_has conversion ] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/20110805180409.GC26217@aftab Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-05 12:32:33 -07:00
Borislav Petkov	a110b5ec73	x86: Add a BSP cpu_dev helper Add a function ptr to struct cpu_dev which is destined to be run only once on the BSP during boot. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/20110805180116.GB26217@aftab Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-05 12:26:49 -07:00
Borislav Petkov	dfb09f9b7a	x86, amd: Avoid cache aliasing penalties on AMD family 15h This patch provides performance tuning for the "Bulldozer" CPU. With its shared instruction cache there is a chance of generating an excessive number of cache cross-invalidates when running specific workloads on the cores of a compute module. This excessive amount of cross-invalidations can be observed if cache lines backed by shared physical memory alias in bits [14:12] of their virtual addresses, as those bits are used for the index generation. This patch addresses the issue by clearing all the bits in the [14:12] slice of the file mapping's virtual address at generation time, thus forcing those bits the same for all mappings of a single shared library across processes and, in doing so, avoids instruction cache aliases. It also adds the command line option "align_va_addr=(32\|64\|on\|off)" with which virtual address alignment can be enabled for 32-bit or 64-bit x86 individually, or both, or be completely disabled. This change leaves virtual region address allocation on other families and/or vendors unaffected. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1312550110-24160-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-05 12:26:44 -07:00
H. Peter Anvin	13f9a3737c	Merge commit 'v3.0' into x86/cpu	2011-08-05 12:25:56 -07:00
Konrad Rzeszutek Wilk	c00c8aa2d9	xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set with CONFIG_XEN and CONFIG_FTRACE set we get this: arch/x86/xen/trace.c:22: error: ‘__HYPERVISOR_console_io’ undeclared here (not in a function) arch/x86/xen/trace.c:22: error: array index in initializer not of integer type arch/x86/xen/trace.c:22: error: (near initialization for ‘xen_hypercall_names’) arch/x86/xen/trace.c:23: error: ‘__HYPERVISOR_physdev_op_compat’ undeclared here (not in a function) Issue was that the definitions of __HYPERVISOR were not pulled if CONFIG_XEN_PRIVILEGED_GUEST was not set. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-05 09:43:02 -04:00
Andy Lutomirski	c149a665ac	x86-64: Add vsyscall:emulate_vsyscall trace event Vsyscall emulation is slow, so make it easy to track down. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/cdaad7da946a80b200df16647c1700db3e1171e9.1312378163.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:53 -07:00
Andy Lutomirski	318f5a2a67	x86-64: Add user_64bit_mode paravirt op Three places in the kernel assume that the only long mode CPL 3 selector is __USER_CS. This is not true on Xen -- Xen's sysretq changes cs to the magic value 0xe033. Two of the places are corner cases, but as of "x86-64: Improve vsyscall emulation CS and RIP handling" (`c9712944b2`), vsyscalls will segfault if called with Xen's extra CS selector. This causes a panic when older init builds die. It seems impossible to make Xen use __USER_CS reliably without taking a performance hit on every system call, so this fixes the tests instead with a new paravirt op. It's a little ugly because ptrace.h can't include paravirt.h. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/f4fcb3947340d9e96ce1054a432f183f9da9db83.1312378163.git.luto@mit.edu Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:49 -07:00
Andy Lutomirski	5d5791af4c	x86-64, xen: Enable the vvar mapping Xen needs to handle VVAR_PAGE, introduced in git commit: `9fd67b4ed0` x86-64: Give vvars their own page Otherwise we die during bootup with a message like: (XEN) mm.c:940:d10 Error getting mfn 1888 (pfn 1e3e48) from L1 entry 8000000001888465 for l1e_owner=10, pg_owner=10 (XEN) mm.c:5049:d10 ptwr_emulate: could not get_page_from_l1e() [ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null) [ 0.000000] IP: [<ffffffff8103a930>] xen_set_pte+0x20/0xe0 Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/4659478ed2f3480938f96491c2ecbe2b2e113a23.1312378163.git.luto@mit.edu Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:47 -07:00
Andy Lutomirski	f670bb760e	x86-64: Work around gold bug 13023 Gold has trouble assigning numbers to the location counter inside of an output section description. The bug was triggered by `9fd67b4ed0`, which consolidated all of the vsyscall sections into a single section. The workaround is IMO still nicer than the old way of doing it. This produces an apparently valid kernel image and passes my vdso tests on both GNU ld version 2.21.51.0.6-2.fc15 20110118 and GNU gold (version 2.21.51.0.6-2.fc15 20110118) 1.10 as distributed by Fedora 15. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/0b260cb806f1f9a25c00ce8377a5f035d57f557a.1312378163.git.luto@mit.edu Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:38 -07:00
Andy Lutomirski	9c40818da5	x86-64: Move the "user" vsyscall segment out of the data segment. The kernel's loader doesn't seem to care, but gold complains. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/f0716870c297242a841b949953d80c0d87bf3d3f.1312378163.git.luto@mit.edu Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:35 -07:00
Andy Lutomirski	1bdfac19b3	x86-64: Pad vDSO to a page boundary This avoids an information leak to userspace. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/a63380a3c58a0506a2f5a18ba1b12dbde1f25e58.1312378163.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:34 -07:00
H. Peter Anvin	17b0436077	Merge commit 'v3.0' into x86/vdso	2011-08-04 16:13:20 -07:00
Igor Mammedov	98f531da84	xen: Fix misleading WARN message at xen_release_chunk WARN message should not complain "Failed to release memory %lx-%lx err=%d\n" ^^^^^^^ about range when it fails to release just one page, instead it should say what pfn is not freed. In addition line: printk(KERN_INFO "xen_release_chunk: looking at area pfn %lx-%lx: " ... printk(KERN_CONT "%lu pages freed\n", len); will be broken if WARN in between this line is fired. So fix it by using a single printk for this. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-04 15:31:29 -04:00
Igor Mammedov	8f3c5883d8	xen: Fix printk() format in xen/setup.c Use correct format specifier for unsigned long. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-04 15:31:28 -04:00
Jeremy Fitzhardinge	1e9ea2656b	xen/tracing: it looks like we wanted CONFIG_FTRACE Apparently we wanted CONFIG_FTRACE rather the CONFIG_FUNCTION_TRACER. Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-08-04 15:31:27 -04:00
Linus Torvalds	33f35f2a4e	x86: don't include xen/xen.h in <asm/io.h> unless XEN is enabled Dmitry Kasatkin reports: "kernel-devel package with kernel headers have no <include/xen> directory if XEN is disabled. Modules which inclide asm/io.h won't compile. XEN related content is behind the CONFIG_XEN flag in the io.h. And <xen/xen.h> should be also behind CONFIG_XEN flag." So move the include of <xen/xen.h> down into the section that is conditional on CONFIG_XEN. Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-03 22:00:38 -10:00
Linus Torvalds	35e51fe82d	Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: cpuidle: stop depending on pm_idle x86 idle: move mwait_idle_with_hints() to where it is used cpuidle: replace xen access to x86 pm_idle and default_idle cpuidle: create bootparam "cpuidle.off=1" mrst_pmu: driver for Intel Moorestown Power Management Unit	2011-08-03 21:54:15 -10:00
Len Brown	a0bfa13738	cpuidle: stop depending on pm_idle cpuidle users should call cpuidle_call_idle() directly rather than via (pm_idle)() function pointer. Architecture may choose to continue using (pm_idle)(), but cpuidle need not depend on it: my_arch_cpu_idle() ... if(cpuidle_call_idle()) pm_idle(); cc: Kevin Hilman <khilman@deeprootsystems.com> cc: Paul Mundt <lethal@linux-sh.org> cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 19:06:37 -04:00
Len Brown	4bfc8288bc	x86 idle: move mwait_idle_with_hints() to where it is used ...and make it static no functional change cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 19:06:36 -04:00
Len Brown	d91ee5863b	cpuidle: replace xen access to x86 pm_idle and default_idle When a Xen Dom0 kernel boots on a hypervisor, it gets access to the raw-hardware ACPI tables. While it parses the idle tables for the hypervisor's beneift, it uses HLT for its own idle. Rather than have xen scribble on pm_idle and access default_idle, have it simply disable_cpuidle() so acpi_idle will not load and architecture default HLT will be used. cc: xen-devel@lists.xensource.com Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 19:06:36 -04:00
Len Brown	6dccf9c508	mrst_pmu: driver for Intel Moorestown Power Management Unit The Moorestown (MRST) Power Management Unit (PMU) driver directs the SOC power states in the "Langwell" south complex (SCU). It hooks pci_platform_pm_ops[] and thus observes all PCI ".set_state" requests. For devices in the SC, the pmu driver translates those PCI requests into the appropriate commands for the SCU. The PMU driver helps implement S0i3, a deep system idle power idle state. Entry into S0i3 is via cpuidle, just like regular processor c-states. S0i3 depends on pre-conditions including uni-processor, graphics off, and certain IO devices in the SC must be off. If those pre-conditions are met, then the PMU allows cpuidle to enter S0i3, otherwise such requests are demoted, either to Atom C4 or Atom C6. This driver is based on prototype work by Bruce Flemming, Illyas Mansoor, Rajeev D. Muralidhar, Vishwesh M. Rudramuni, Hari Seshadri and Sujith Thomas. The current driver also includes contributions from H. Peter Anvin, Arjan van de Ven, Kristen Accardi, and Yong Wang. Thanks for additional review feedback from Alan Cox and Randy Dunlap. Acked-by: Alan Cox <alan@linux.intel.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 19:06:12 -04:00
Len Brown	d0e323b470	Merge branch 'apei' into apei-release Some trivial conflicts due to other various merges adding to the end of common lists sooner than this one. arch/ia64/Kconfig arch/powerpc/Kconfig arch/x86/Kconfig lib/Kconfig lib/Makefile Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 11:30:42 -04:00
Huang Ying	df013ffb81	Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHG cmpxchg() is widely used by lockless code, including NMI-safe lockless code. But on some architectures, the cmpxchg() implementation is not NMI-safe, on these architectures the lockless code may need a spin_trylock_irqsave() based implementation. This patch adds a Kconfig option: ARCH_HAVE_NMI_SAFE_CMPXCHG, so that NMI-safe lockless code can depend on it or provide different implementation according to it. On many architectures, cmpxchg is only NMI-safe for several specific operand sizes. So, ARCH_HAVE_NMI_SAFE_CMPXCHG define in this patch only guarantees cmpxchg is NMI-safe for sizeof(unsigned long). Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Richard Henderson <rth@twiddle.net> CC: Mikael Starvik <starvik@axis.com> Acked-by: David Howells <dhowells@redhat.com> CC: Yoshinori Sato <ysato@users.sourceforge.jp> CC: Tony Luck <tony.luck@intel.com> CC: Hirokazu Takata <takata@linux-m32r.org> CC: Geert Uytterhoeven <geert@linux-m68k.org> CC: Michal Simek <monstr@monstr.eu> Acked-by: Ralf Baechle <ralf@linux-mips.org> CC: Kyle McMartin <kyle@mcmartin.ca> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Chen Liqin <liqin.chen@sunplusct.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ingo Molnar <mingo@redhat.com> CC: Chris Zankel <chris@zankel.net> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 11:12:37 -04:00
Maarten Lankhorst	1d12d35284	oprofile, x86: Convert memory allocation to static array On -rt, allocators don't work from atomic context any more, and the maximum size of the array is known at compile time. Call trace on a -rt kernel: oprofile: using NMI interrupt. BUG: sleeping function called from invalid context at kernel/rtmutex.c:645 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:1 Pid: 0, comm: kworker/0:1 Tainted: G C 3.0.0-rt3-patser+ #39 Call Trace: <IRQ> [<ffffffff8103fc0a>] __might_sleep+0xca/0xf0 [<ffffffff8160d424>] rt_spin_lock+0x24/0x40 [<ffffffff811476c7>] __kmalloc+0xc7/0x370 [<ffffffffa0275c85>] ? ppro_setup_ctrs+0x215/0x260 [oprofile] [<ffffffffa0273de0>] ? oprofile_cpu_notifier+0x60/0x60 [oprofile] [<ffffffffa0275c85>] ppro_setup_ctrs+0x215/0x260 [oprofile] [<ffffffffa0273de0>] ? oprofile_cpu_notifier+0x60/0x60 [oprofile] [<ffffffffa0273de0>] ? oprofile_cpu_notifier+0x60/0x60 [oprofile] [<ffffffffa0273ea4>] nmi_cpu_setup+0xc4/0x110 [oprofile] [<ffffffff81094455>] generic_smp_call_function_interrupt+0x95/0x190 [<ffffffff8101df77>] smp_call_function_interrupt+0x27/0x40 [<ffffffff81615093>] call_function_interrupt+0x13/0x20 <EOI> [<ffffffff8131d0c4>] ? plist_check_head+0x54/0xc0 [<ffffffff81371fe8>] ? intel_idle+0xc8/0x120 [<ffffffff81371fc7>] ? intel_idle+0xa7/0x120 [<ffffffff814a57b0>] cpuidle_idle_call+0xb0/0x230 [<ffffffff810011db>] cpu_idle+0x8b/0xe0 [<ffffffff815fc82f>] start_secondary+0x1d3/0x1d8 Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com> Signed-off-by: Robert Richter <robert.richter@amd.com>	2011-08-01 22:54:19 +02:00
Jon Mason	b03e7495a8	PCI: Set PCI-E Max Payload Size on fabric On a given PCI-E fabric, each device, bridge, and root port can have a different PCI-E maximum payload size. There is a sizable performance boost for having the largest possible maximum payload size on each PCI-E device. However, if improperly configured, fatal bus errors can occur. Thus, it is important to ensure that PCI-E payloads sends by a device are never larger than the MPS setting of all devices on the way to the destination. This can be achieved two ways: - A conservative approach is to use the smallest common denominator of the entire tree below a root complex for every device on that fabric. This means for example that having a 128 bytes MPS USB controller on one leg of a switch will dramatically reduce performances of a video card or 10GE adapter on another leg of that same switch. It also means that any hierarchy supporting hotplug slots (including expresscard or thunderbolt I suppose, dbl check that) will have to be entirely clamped to 128 bytes since we cannot predict what will be plugged into those slots, and we cannot change the MPS on a "live" system. - A more optimal way is possible, if it falls within a couple of constraints: * The top-level host bridge will never generate packets larger than the smallest TLP (or if it can be controlled independently from its MPS at least) * The device will never generate packets larger than MPS (which can be configured via MRRS) * No support of direct PCI-E <-> PCI-E transfers between devices without some additional code to specifically deal with that case Then we can use an approach that basically ignores downstream requests and focuses exclusively on upstream requests. In that case, all we need to care about is that a device MPS is no larger than its parent MPS, which allows us to keep all switches/bridges to the max MPS supported by their parent and eventually the PHB. In this case, your USB controller would no longer "starve" your 10GE Ethernet and your hotplug slots won't affect your global MPS. Additionally, the hotplugged devices themselves can be configured to a larger MPS up to the value configured in the hotplug bridge. To choose between the two available options, two PCI kernel boot args have been added to the PCI calls. "pcie_bus_safe" will provide the former behavior, while "pcie_bus_perf" will perform the latter behavior. By default, the latter behavior is used. NOTE: due to the location of the enablement, each arch will need to add calls to this function. This patch only enables x86. This patch includes a number of changes recommended by Benjamin Herrenschmidt. Tested-by: Jordan_Hargrave@dell.com Signed-off-by: Jon Mason <mason@myri.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-08-01 11:49:16 -07:00
H. Peter Anvin	49d859d78c	x86, random: Verify RDRAND functionality and allow it to be disabled If the CPU declares that RDRAND is available, go through a guranteed reseed sequence, and make sure that it is actually working (producing data.) If it does not, disable the CPU feature flag. Allow RDRAND to be disabled on the command line (as opposed to at compile time) for a user who has special requirements with regards to random numbers. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Theodore Ts'o" <tytso@mit.edu>	2011-07-31 14:02:19 -07:00
H. Peter Anvin	628c6246d4	x86, random: Architectural inlines to get random integers with RDRAND Architectural inlines to get random ints and longs using the RDRAND instruction. Intel has introduced a new RDRAND instruction, a Digital Random Number Generator (DRNG), which is functionally an high bandwidth entropy source, cryptographic whitener, and integrity monitor all built into hardware. This enables RDRAND to be used directly, bypassing the kernel random number pool. For technical documentation, see: http://software.intel.com/en-us/articles/download-the-latest-bull-mountain-software-implementation-guide/ In this patch, this is only used for the nonblocking random number pool. RDRAND is a nonblocking source, similar to our /dev/urandom, and is therefore not a direct replacement for /dev/random. The architectural hooks presented in the previous patch only feed the kernel internal users, which only use the nonblocking pool, and so this is not a problem. Since this instruction is available in userspace, there is no reason to have a /dev/hw_rng device driver for the purpose of feeding rngd. This is especially so since RDRAND is a nonblocking source, and needs additional whitening and reduction (see the above technical documentation for details) in order to be of "pure entropy source" quality. The CONFIG_EXPERT compile-time option can be used to disable this use of RDRAND. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Originally-by: Fenghua Yu <fenghua.yu@intel.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Theodore Ts'o" <tytso@mit.edu>	2011-07-31 13:59:29 -07:00
Linus Torvalds	f85f19de90	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: remove printks about disabled bridge windows PCI: fold pci_calc_resource_flags() into decode_bar() PCI: treat mem BAR type "11" (reserved) as 32-bit, not 64-bit, BAR PCI: correct pcie_set_readrq write size PCI: pciehp: change wait time for valid configuration access x86/PCI: Preserve existing pci=bfsort whitelist for Dell systems PCI: ARI is a PCIe v2 feature x86/PCI: quirks: Use pci_dev->revision PCI: Make the struct pci_dev * argument of pci_fixup_irqs const. PCI hotplug: cpqphp: use pci_dev->vendor PCI hotplug: cpqphp: use pci_dev->subsystem_{vendor\|device} x86/PCI: config space accessor functions should not ignore the segment argument PCI: Assign values to 'pci_obff_signal_type' enumeration constants x86/PCI: reduce severity of host bridge window conflict warnings PCI: enumerate the PCI device only removed out PCI hieratchy of OS when re-scanning PCI PCI: PCIe AER: add aer_recover_queue x86/PCI: select direct access mode for mmconfig option PCI hotplug: Rename is_ejectable which also exists in dock.c	2011-07-29 23:35:05 -07:00
Linus Torvalds	b993fdbc7f	Merge branch 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen/tracing: fix compile errors when tracing is disabled.	2011-07-29 23:33:40 -07:00
Randy Dunlap	fd079facb3	KVM: fix TASK_DELAY_ACCT kconfig warning Fix kconfig dependency warning: warning: (KVM) selects TASK_DELAY_ACCT which has unmet direct dependencies (TASKSTATS) Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-27 14:22:06 +03:00
Arun Sharma	7847777a45	atomic: cleanup asm-generic atomic.h inclusion After changing all consumers of atomics to include <linux/atomic.h>, we ran into some compile time errors due to this dependency chain: linux/atomic.h -> asm/atomic.h -> asm-generic/atomic-long.h where atomic-long.h could use funcs defined later in linux/atomic.h without a prototype. This patches moves the code that includes asm-generic/atomic.h to linux/atomic.h. Archs that need <asm-generic/atomic64.h> need to select CONFIG_GENERIC_ATOMIC64 from now on (some of them used to include it unconditionally). Compile tested on i386 and x86_64 with allnoconfig. Signed-off-by: Arun Sharma <asharma@fb.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Arun Sharma	f24219b4e9	atomic: move atomic_add_unless to generic code This is in preparation for more generic atomic primitives based on __atomic_add_unless. Signed-off-by: Arun Sharma <asharma@fb.com> Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Arun Sharma	60063497a9	atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Akinobu Mita	148817ba09	asm-generic: add another generic ext2 atomic bitops The majority of architectures implement ext2 atomic bitops as test_and_{set,clear}_bit() without spinlock. This adds this type of generic implementation in ext2-atomic-setbit.h and use it wherever possible. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Suggested-by: Andreas Dilger <adilger@dilger.ca> Suggested-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:46 -07:00
Mike Frysinger	0e9a6cb5e6	ptrace: unify show_regs() prototype [ poleg@redhat.com: no need to declare show_regs() in ptrace.h, sched.h does this ] Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:43 -07:00
Linus Torvalds	fa8f53ace4	Merge branch 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, olpc-xo15-sci: Enable EC wakeup capability x86, olpc: Fix dependency on POWER_SUPPLY x86, olpc: Add XO-1.5 SCI driver x86, olpc: Add XO-1 RTC driver x86, olpc-xo1-sci: Propagate power supply/battery events x86, olpc-xo1-sci: Add lid switch functionality x86, olpc-xo1-sci: Add GPE handler and ebook switch functionality x86, olpc: EC SCI wakeup mask functionality x86, olpc: Add XO-1 SCI driver and power button control x86, olpc: Add XO-1 suspend/resume support x86, olpc: Rename olpc-xo1 to olpc-xo1-pm x86, olpc: Move CS5536-related constants to cs5535.h x86, olpc: Add missing elements to device tree	2011-07-26 11:11:54 -07:00
Jeremy Fitzhardinge	b3c4b98250	xen/tracing: fix compile errors when tracing is disabled. When CONFIG_FUNCTION_TRACER is disabled, compilation fails as follows: CC arch/x86/xen/setup.o In file included from arch/x86/include/asm/xen/hypercall.h:42, from arch/x86/xen/setup.c:19: include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list include/trace/events/xen.h:31: warning: its scope is only this definition or declaration, which is probably not what you want include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list [...] arch/x86/xen/trace.c:5: error: '__HYPERVISOR_set_trap_table' undeclared here (not in a function) arch/x86/xen/trace.c:5: error: array index in initializer not of integer type arch/x86/xen/trace.c:5: error: (near initialization for 'xen_hypercall_names') arch/x86/xen/trace.c:6: error: '__HYPERVISOR_mmu_update' undeclared here (not in a function) arch/x86/xen/trace.c:6: error: array index in initializer not of integer type arch/x86/xen/trace.c:6: error: (near initialization for 'xen_hypercall_names') Fix this by making sure struct multicall_entry has a declaration in scope at all times, and don't bother compiling xen/trace.c when tracing is disabled. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-25 15:51:02 -07:00
Linus Torvalds	d3ec4844d4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) fs: Merge split strings treewide: fix potentially dangerous trailing ';' in #defined values/expressions uwb: Fix misspelling of neighbourhood in comment net, netfilter: Remove redundant goto in ebt_ulog_packet trivial: don't touch files that are removed in the staging tree lib/vsprintf: replace link to Draft by final RFC number doc: Kconfig: `to be' -> `be' doc: Kconfig: Typo: square -> squared doc: Konfig: Documentation/power/{pm => apm-acpi}.txt drivers/net: static should be at beginning of declaration drivers/media: static should be at beginning of declaration drivers/i2c: static should be at beginning of declaration XTENSA: static should be at beginning of declaration SH: static should be at beginning of declaration MIPS: static should be at beginning of declaration ARM: static should be at beginning of declaration rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check Update my e-mail address PCIe ASPM: forcedly -> forcibly gma500: push through device driver tree ... Fix up trivial conflicts: - arch/arm/mach-ep93xx/dma-m2p.c (deleted) - drivers/gpio/gpio-ep93xx.c (renamed and context nearby) - drivers/net/r8169.c (just context changes)	2011-07-25 13:56:39 -07:00
Daniel Drake	07d5b38e14	x86, olpc-xo15-sci: Enable EC wakeup capability Some recent changes to the way that ACPI handles wakeup flags means that the XO15EC ACPI device is not wakeup-capable by default so device_set_wakeup_enable() does nothing. Use device_init_wakeup() to mark the device as wakeup capable, and to enable wakeups. Signed-off-by: Daniel Drake <dsd@laptop.org> Link: http://lkml.kernel.org/r/20110724173430.BE03C9D401C@zog.reactivated.net Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-24 20:16:05 +02:00
Daniel Drake	d8d01a6378	x86, olpc: Fix dependency on POWER_SUPPLY As reported by Randy Dunlap, CONFIG_POWER_SUPPLY=m caused a compile error: arch/x86/built-in.o: In function `battery_status_changed': olpc-xo15-sci.c:(.text+0x3acdd): undefined reference to `power_supply_get_by_name' olpc-xo15-sci.c:(.text+0x3ad04): undefined reference to `power_supply_changed' The SCI drivers, as bool, require POWER_SUPPLY to be builtin. Use select to make that a hard requirement and avoid this build failure. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Daniel Drake <dsd@laptop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-24 20:14:44 +02:00
Linus Torvalds	ff0c4ad2c3	Merge branch 'for-upstream' of git://openrisc.net/jonas/linux * 'for-upstream' of git://openrisc.net/jonas/linux: (24 commits) OpenRISC: Add MAINTAINERS entry OpenRISC: Miscellaneous OpenRISC: Library routines OpenRISC: Headers OpenRISC: Traps OpenRISC: Module support OpenRISC: GPIO OpenRISC: Scheduling/Process management OpenRISC: Idle/Power management OpenRISC: System calls OpenRISC: IRQ OpenRISC: Timekeeping OpenRISC: DMA OpenRISC: PTrace OpenRISC: Build infrastructure OpenRISC: Signal handling OpenRISC: Memory management OpenRISC: Device tree OpenRISC: Boot code iomap: make IOPORT/PCI mapping functions conditional ...	2011-07-24 09:55:18 -07:00
Linus Torvalds	fcda12e7f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: modpost: Fix modpost's license checking V3 module: add /sys/module/<name>/uevent files module: change attr callbacks to take struct module_kobject modules: make arch's use default loader hooks modules: add default loader hook implementations param: fix return value handling in param_set_*	2011-07-24 09:54:54 -07:00
Linus Torvalds	5fabc487c9	Merge branch 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits) KVM: IOMMU: Disable device assignment without interrupt remapping KVM: MMU: trace mmio page fault KVM: MMU: mmio page fault support KVM: MMU: reorganize struct kvm_shadow_walk_iterator KVM: MMU: lockless walking shadow page table KVM: MMU: do not need atomicly to set/clear spte KVM: MMU: introduce the rules to modify shadow page table KVM: MMU: abstract some functions to handle fault pfn KVM: MMU: filter out the mmio pfn from the fault pfn KVM: MMU: remove bypass_guest_pf KVM: MMU: split kvm_mmu_free_page KVM: MMU: count used shadow pages on prepareing path KVM: MMU: rename 'pt_write' to 'emulate' KVM: MMU: cleanup for FNAME(fetch) KVM: MMU: optimize to handle dirty bit KVM: MMU: cache mmio info on page fault path KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code KVM: MMU: do not update slot bitmap if spte is nonpresent KVM: MMU: fix walking shadow page table KVM guest: KVM Steal time registration ...	2011-07-24 09:07:03 -07:00
Linus Torvalds	c61264f98c	Merge branch 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen/trace: use class for multicall trace xen/trace: convert mmu events to use DECLARE_EVENT_CLASS()/DEFINE_EVENT() xen/multicall: move *idx fields to start of mc_buffer xen/multicall: special-case singleton hypercalls xen/multicalls: add unlikely around slowpath in __xen_mc_entry() xen/multicalls: disable MC_DEBUG xen/mmu: tune pgtable alloc/release xen/mmu: use extend_args for more mmuext updates xen/trace: add tlb flush tracepoints xen/trace: add segment desc tracing xen/trace: add xen_pgd_(un)pin tracepoints xen/trace: add ptpage alloc/release tracepoints xen/trace: add mmu tracepoints xen/trace: add multicall tracing xen/trace: set up tracepoint skeleton xen/multicalls: remove debugfs stats trace/xen: add skeleton for Xen trace events	2011-07-24 09:06:47 -07:00
Linus Torvalds	a23a334bd5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (34 commits) crypto: caam - ablkcipher support crypto: caam - faster aead implementation crypto: caam - structure renaming crypto: caam - shorter names crypto: talitos - don't bad_key in ablkcipher setkey crypto: talitos - remove unused giv from ablkcipher methods crypto: talitos - don't set done notification in hot path crypto: talitos - ensure request ordering within a single tfm crypto: gf128mul - fix call to memset() crypto: s390 - support hardware accelerated SHA-224 crypto: algif_hash - Handle initial af_alg_make_sg error correctly crypto: sha1_generic - use SHA1_BLOCK_SIZE hwrng: ppc4xx - add support for ppc4xx TRNG crypto: crypto4xx - Perform read/modify/write on device control register crypto: caam - fix build warning when DEBUG_FS not configured crypto: arc4 - Fixed coding style issues crypto: crc32c - Fixed coding style issue crypto: omap-sham - do not schedule tasklet if there is no active requests crypto: omap-sham - clear device flags when finishing request crypto: omap-sham - irq handler must not clear error code ...	2011-07-24 09:05:32 -07:00
Jonas Bonn	66574cc054	modules: make arch's use default loader hooks This patch removes all the module loader hook implementations in the architecture specific code where the functionality is the same as that now provided by the recently added default hooks. Signed-off-by: Jonas Bonn <jonas@southpole.se> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-24 22:06:04 +09:30
Xiao Guangrong	4f0226482d	KVM: MMU: trace mmio page fault Add tracepoints to trace mmio page fault Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:41 +03:00
Xiao Guangrong	ce88decffd	KVM: MMU: mmio page fault support The idea is from Avi: \| We could cache the result of a miss in an spte by using a reserved bit, and \| checking the page fault error code (or seeing if we get an ept violation or \| ept misconfiguration), so if we get repeated mmio on a page, we don't need to \| search the slot list/tree. \| (https://lkml.org/lkml/2011/2/22/221) When the page fault is caused by mmio, we cache the info in the shadow page table, and also set the reserved bits in the shadow page table, so if the mmio is caused again, we can quickly identify it and emulate it directly Searching mmio gfn in memslots is heavy since we need to walk all memeslots, it can be reduced by this feature, and also avoid walking guest page table for soft mmu. [jan: fix operator precedence issue] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:40 +03:00
Xiao Guangrong	dd3bfd59db	KVM: MMU: reorganize struct kvm_shadow_walk_iterator Reorganize it for good using the cache Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:39 +03:00
Xiao Guangrong	c2a2ac2b56	KVM: MMU: lockless walking shadow page table Use rcu to protect shadow pages table to be freed, so we can safely walk it, it should run fastly and is needed by mmio page fault Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:38 +03:00
Xiao Guangrong	603e0651cf	KVM: MMU: do not need atomicly to set/clear spte Now, the spte is just from nonprsent to present or present to nonprsent, so we can use some trick to set/clear spte non-atomicly as linux kernel does Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:37 +03:00
Xiao Guangrong	1df9f2dc39	KVM: MMU: introduce the rules to modify shadow page table Introduce some interfaces to modify spte as linux kernel does: - mmu_spte_clear_track_bits, it set the spte from present to nonpresent, and track the stat bits(accessed/dirty) of spte - mmu_spte_clear_no_track, the same as mmu_spte_clear_track_bits except tracking the stat bits - mmu_spte_set, set spte from nonpresent to present - mmu_spte_update, only update the stat bits Now, it does not allowed to set spte from present to present, later, we can drop the atomicly opration for X86_32 host, and it is the preparing work to get spte on X86_32 host out of the mmu lock Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:36 +03:00
Xiao Guangrong	d7c55201e6	KVM: MMU: abstract some functions to handle fault pfn Introduce handle_abnormal_pfn to handle fault pfn on page fault path, introduce mmu_invalid_pfn to handle fault pfn on prefetch path It is the preparing work for mmio page fault support Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:35 +03:00
Xiao Guangrong	fce92dce79	KVM: MMU: filter out the mmio pfn from the fault pfn If the page fault is caused by mmio, the gfn can not be found in memslots, and 'bad_pfn' is returned on gfn_to_hva path, so we can use 'bad_pfn' to identify the mmio page fault. And, to clarify the meaning of mmio pfn, we return fault page instead of bad page when the gfn is not allowd to prefetch Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:34 +03:00
Xiao Guangrong	c37079586f	KVM: MMU: remove bypass_guest_pf The idea is from Avi: \| Maybe it's time to kill off bypass_guest_pf=1. It's not as effective as \| it used to be, since unsync pages always use shadow_trap_nonpresent_pte, \| and since we convert between the two nonpresent_ptes during sync and unsync. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:33 +03:00
Xiao Guangrong	bd4c86eaa6	KVM: MMU: split kvm_mmu_free_page Split kvm_mmu_free_page to kvm_mmu_isolate_page and kvm_mmu_free_page One is used to remove the page from cache under mmu lock and the other is used to free page table out of mmu lock Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:32 +03:00
Xiao Guangrong	aa6bd187af	KVM: MMU: count used shadow pages on prepareing path Move counting used shadow pages from commiting path to preparing path to reduce tlb flush on some paths Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:31 +03:00
Xiao Guangrong	b90a0e6c81	KVM: MMU: rename 'pt_write' to 'emulate' If 'pt_write' is true, we need to emulate the fault. And in later patch, we need to emulate the fault even though it is not a pt_write event, so rename it to better fit the meaning Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:30 +03:00
Xiao Guangrong	b36c7a7c10	KVM: MMU: cleanup for FNAME(fetch) gw->pte_access is the final access permission, since it is unified with gw->pt_access when we walked guest page table: FNAME(walk_addr_generic): pte_access = pt_access & FNAME(gpte_access)(vcpu, pte, true); Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:29 +03:00
Xiao Guangrong	640d9b0dbe	KVM: MMU: optimize to handle dirty bit If dirty bit is not set, we can make the pte access read-only to avoid handing dirty bit everywhere Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:27 +03:00
Xiao Guangrong	bebb106a5a	KVM: MMU: cache mmio info on page fault path If the page fault is caused by mmio, we can cache the mmio info, later, we do not need to walk guest page table and quickly know it is a mmio fault while we emulate the mmio instruction Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:26 +03:00
Xiao Guangrong	af7cc7d1ee	KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code Introduce vcpu_mmio_gva_to_gpa to translate the gva to gpa, we can use it to cleanup the code between read emulation and write emulation Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:25 +03:00
Xiao Guangrong	ffb61bb3bc	KVM: MMU: do not update slot bitmap if spte is nonpresent Set slot bitmap only if the spte is present Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:24 +03:00
Xiao Guangrong	052331bea3	KVM: MMU: fix walking shadow page table Properly check the last mapping, and do not walk to the next level if last spte is met Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:23 +03:00
Glauber Costa	d910f5c106	KVM guest: KVM Steal time registration This patch implements the kvm bits of the steal time infrastructure. The most important part of it, is the steal time clock. It is an continuous clock that shows the accumulated amount of steal time since vcpu creation. It is supposed to survive cpu offlining/onlining. [marcelo: fix build with CONFIG_KVM_GUEST=n] Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Avi Kivity <avi@redhat.com> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-24 11:49:36 +03:00
Linus Torvalds	b4db920c7f	Merge branches 'x86-detect-hyper-for-linus', 'x86-fpu-for-linus', 'x86-kexec-for-linus', 'x86-platform-for-linus', 'x86-quirks-for-linus', 'x86-tsc-for-linus' and 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-detect-hyper-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hyper: Change hypervisor detection order * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-32, fpu: Fix DNA exception during check_fpu() * 'x86-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: kexec, x86: Fix incorrect jump back address if not preserving context * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, config: Introduce an INTEL_MID configuration * 'x86-quirks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, quirks: Use pci_dev->revision * 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: tsc: Remove unneeded DMI-based blacklisting * 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, boot: Wait for boot cpu to show up if nr_cpus limit is about to hit	2011-07-23 10:38:21 -07:00
Linus Torvalds	7080d30676	Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, build: Do not set the root_dev field in bzImage	2011-07-23 10:36:09 -07:00
Linus Torvalds	148a7b1725	Merge branch 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Add support for cmpxchg_double	2011-07-23 10:35:43 -07:00
Linus Torvalds	9d0715630e	Merge branch 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clocksource: apb: Share APB timer code with other platforms	2011-07-23 10:34:47 -07:00
Ohad Ben-Cohen	e72542191c	virtio: expose for non-virtualization users too virtio has been so far used only in the context of virtualization, and the virtio Kconfig was sourced directly by the relevant arch Kconfigs when VIRTUALIZATION was selected. Now that we start using virtio for inter-processor communications, we need to source the virtio Kconfig outside of the virtualization scope too. Moreover, some architectures might use virtio for both virtualization and inter-processor communications, so directly sourcing virtio might yield unexpected results due to conflicting selections. The simple solution offered by this patch is to always source virtio's Kconfig in drivers/Kconfig, and remove it from the appropriate arch Kconfigs. Additionally, a virtio menu entry has been added so virtio drivers don't show up in the general drivers menu. This way anyone can use virtio, though it's arguably less accessible (and neat!) for virtualization users now. Note: some architectures (mips and sh) seem to have a VIRTUALIZATION menu merely for sourcing virtio's Kconfig, so that menu is removed too. Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-23 16:20:30 +09:30
Linus Torvalds	8e204874db	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, vdso: Do not allocate memory for the vDSO clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option x86, vdso: Drop now wrong comment Document the vDSO and add a reference parser ia64: Replace clocksource.fsys_mmio with generic arch data x86-64: Move vread_tsc and vread_hpet into the vDSO clocksource: Replace vread with generic arch data x86-64: Add --no-undefined to vDSO build x86-64: Allow alternative patching in the vDSO x86: Make alternative instruction pointers relative x86-64: Improve vsyscall emulation CS and RIP handling x86-64: Emulate legacy vsyscalls x86-64: Fill unused parts of the vsyscall page with 0xcc x86-64: Remove vsyscall number 3 (venosys) x86-64: Map the HPET NX x86-64: Remove kernel.vsyscall64 sysctl x86-64: Give vvars their own page x86-64: Document some of entry_64.S x86-64: Fix alignment of jiffies variable	2011-07-22 17:05:15 -07:00
Linus Torvalds	3e0b8df79d	Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV: Correct UV2 BAU destination timeout x86, UV: Correct failed topology memory leak x86, UV: Remove cpumask_t from the stack x86, UV: Rename hubmask to pnmask x86, UV: Correct reset_with_ipi() x86, UV: Allow for non-consecutive sockets x86, UV: Inline header file functions x86, UV: Fix smp_processor_id() use in a preemptable region x66, UV: Enable 64-bit ACPI MFCG support for SGI UV2 platform x86, UV: Clean up uv_mmrs.h	2011-07-22 17:04:55 -07:00
Linus Torvalds	8051207959	Merge branch 'x86-signal-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-signal-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Kill handle_signal()->set_fs() x86, do_signal: Simplify the TS_RESTORE_SIGMASK logic x86, signals: Convert the X86_32 code to use set_current_blocked() x86, signals: Convert the IA32_EMULATION code to use set_current_blocked()	2011-07-22 17:04:32 -07:00
Linus Torvalds	9e39264ed4	Merge branch 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, numa: Implement pfn -> nid mapping granularity check x86, mm: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/	2011-07-22 17:04:18 -07:00
Linus Torvalds	dc43d9fa73	Merge branch 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mtrr: Use pci_dev->revision x86, mtrr: use stop_machine APIs for doing MTRR rendezvous stop_machine: implement stop_machine_from_inactive_cpu() stop_machine: reorganize stop_cpus() implementation x86, mtrr: lock stop machine during MTRR rendezvous sequence	2011-07-22 17:04:04 -07:00
Linus Torvalds	80775068db	Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, microcode, AMD: Fix section header size check x86, microcode, AMD: Correct buf references	2011-07-22 17:03:52 -07:00
Linus Torvalds	7c6582b28a	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mce: Use mce_sysdev_ prefix to group functions x86, mce: Use mce_chrdev_ prefix to group functions x86, mce: Cleanup mce_read() x86, mce: Cleanup mce_create()/remove_device() x86, mce: Check the result of ancient_init() x86, mce: Introduce mce_gather_info() x86, mce: Replace MCM_ with MCI_MISC_ x86, mce: Replace MCE_SELF_VECTOR by irq_work x86, mce, severity: Clean up trivial coding style problems x86, mce, severity: Cleanup severity table x86, mce, severity: Make formatting a bit more readable x86, mce, severity: Fix two severities table signatures	2011-07-22 17:03:40 -07:00
Linus Torvalds	227ad9bc07	Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, efi: Properly pre-initialize table pointers x86, efi: Add infrastructure for UEFI 2.0 runtime services x86, efi: Fix argument types for SetVariable()	2011-07-22 17:03:14 -07:00
Linus Torvalds	35b004cce1	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, intel, power: Correct the MSR_IA32_ENERGY_PERF_BIAS message x86, msr: Fix typo in ENERGY_PERF_BIAS_POWERSAVE x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS	2011-07-22 17:02:54 -07:00
Linus Torvalds	0a613b647b	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, smpboot: Mark the names[] array in __inquire_remote_apic() as const x86: Convert vmalloc()+memset() to vzalloc()	2011-07-22 17:02:38 -07:00
Linus Torvalds	eb47418dc5	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix write lock scalability 64-bit issue x86: Unify rwsem assembly implementation x86: Unify rwlock assembly implementation x86, asm: Fix binutils 2.16 issue with __USER32_CS x86, asm: Cleanup thunk_64.S x86, asm: Flip RESTORE_ARGS arguments logic x86, asm: Flip SAVE_ARGS arguments logic x86, asm: Thin down SAVE/RESTORE_* asm macros	2011-07-22 17:02:24 -07:00
Linus Torvalds	2c9e88a108	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, ioapic: Print IR_IO_APIC_route_entry when IR is enabled x86, ioapic: Print IRTE when IR is enabled x86, x2apic: Preserve high 32-bits of IA32_APIC_BASE MSR x86, ioapic: Also print Dest field x86, ioapic: Format clean up for IOAPIC output x86: print APIC data a little later during boot	2011-07-22 17:02:07 -07:00
Linus Torvalds	52de84f3f3	Merge branch 'timers-rtc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-rtc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Serialize EFI time accesses on rtc_lock x86: Serialize SMP bootup CMOS accesses on rtc_lock rtc: stmp3xxx: Remove UIE handlers rtc: stmp3xxx: Get rid of mach-specific accessors rtc: stmp3xxx: Initialize drvdata before registering device rtc: stmp3xxx: Port stmp-functions to mxs-equivalents rtc: stmp3xxx: Restore register definitions rtc: vt8500: Use define instead of hardcoded value for status bit	2011-07-22 16:52:39 -07:00
Linus Torvalds	a99a7d1436	Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mips: Fix i8253 clockevent fallout i8253: Cleanup outb/inb magic arm: Footbridge: Use common i8253 clockevent mips: Use common i8253 clockevent x86: Use common i8253 clockevent i8253: Create common clockevent implementation i8253: Export i8253_lock unconditionally pcpskr: MIPS: Make config dependencies finer grained pcspkr: Cleanup Kconfig dependencies i8253: Move remaining content and delete asm/i8253.h i8253: Consolidate definitions of PIT_LATCH x86: i8253: Consolidate definitions of global_clock_event i8253: Alpha, PowerPC: Remove unused asm/8253pit.h alpha: i8253: Cleanup remaining users of i8253pit.h i8253: Remove I8253_LOCK config i8253: Make pcsp sound driver use the shared i8253_lock i8253: Make pcspkr input driver use the shared i8253_lock i8253: Consolidate all kernel definitions of i8253_lock i8253: Unify all kernel declarations of i8253_lock i8253: Create linux/i8253.h and use it in all 8253 related files	2011-07-22 16:51:56 -07:00
Linus Torvalds	4d4abdcb1d	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits) perf: Remove the nmi parameter from the oprofile_perf backend x86, perf: Make copy_from_user_nmi() a library function perf: Remove perf_event_attr::type check x86, perf: P4 PMU - Fix typos in comments and style cleanup perf tools: Make test use the preset debugfs path perf tools: Add automated tests for events parsing perf tools: De-opt the parse_events function perf script: Fix display of IP address for non-callchain path perf tools: Fix endian conversion reading event attr from file header perf tools: Add missing 'node' alias to the hw_cache[] array perf probe: Support adding probes on offline kernel modules perf probe: Add probed module in front of function perf probe: Introduce debuginfo to encapsulate dwarf information perf-probe: Move dwarf library routines to dwarf-aux.{c, h} perf probe: Remove redundant dwarf functions perf probe: Move strtailcmp to string.c perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END tracing/kprobe: Update symbol reference when loading module tracing/kprobes: Support module init function probing kprobes: Return -ENOENT if probe point doesn't exist ...	2011-07-22 16:44:39 -07:00
Linus Torvalds	6d16d6d9bb	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: iommu/core: Fix build with INTR_REMAP=y && CONFIG_DMAR=n iommu/amd: Don't use MSI address range for DMA addresses iommu/amd: Move missing parts to drivers/iommu iommu: Move iommu Kconfig entries to submenu x86/ia64: intel-iommu: move to drivers/iommu/ x86: amd_iommu: move to drivers/iommu/ msm: iommu: move to drivers/iommu/ drivers: iommu: move to a dedicated folder x86/amd-iommu: Store device alias as dev_data pointer x86/amd-iommu: Search for existind dev_data before allocting a new one x86/amd-iommu: Allow dev_data->alias to be NULL x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines x86/amd-iommu: Store ATS state in dev_data x86/amd-iommu: Store devid in dev_data x86/amd-iommu: Introduce global dev_data_list x86/amd-iommu: Remove redundant device_flush_dte() calls iommu-api: Add missing header file Fix up trivial conflicts (independent additions close to each other) in drivers/Makefile and include/linux/pci.h	2011-07-22 16:39:42 -07:00
Linus Torvalds	17413f5acd	Merge branch 'for-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: Fixup __this_cpu_xchg* operations	2011-07-22 15:07:35 -07:00
Linus Torvalds	acb41c0f92	Merge branch 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: pci/of: Consolidate pci_bus_to_OF_node() pci/of: Consolidate pci_device_to_OF_node() x86/devicetree: Use generic PCI <-> OF matching microblaze/pci: Move the remains of pci_32.c to pci-common.c microblaze/pci: Remove powermac originated cruft pci/of: Match PCI devices to OF nodes dynamically	2011-07-22 14:54:02 -07:00
Linus Torvalds	951cc93a74	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1287 commits) icmp: Fix regression in nexthop resolution during replies. net: Fix ppc64 BPF JIT dependencies. acenic: include NET_SKB_PAD headroom to incoming skbs ixgbe: convert to ndo_fix_features ixgbe: only enable WoL for magic packet by default ixgbe: remove ifdef check for non-existent define ixgbe: Pass staterr instead of re-reading status and error bits from descriptor ixgbe: Move interrupt related values out of ring and into q_vector ixgbe: add structure for containing RX/TX rings to q_vector ixgbe: inline the ixgbe_maybe_stop_tx function ixgbe: Update ATR to use recorded TX queues instead of CPU for routing igb: Fix for DH89xxCC near end loopback test e1000: always call e1000_check_for_link() on e1000_ce4100 MACs. netxen: add fw version compatibility check be2net: request native mode each time the card is reset ipv4: Constrain UFO fragment sizes to multiples of 8 bytes virtio_net: Fix panic in virtnet_remove ipv6: make fragment identifications less predictable ipv6: unshare inetpeers can: make function can_get_bittiming static ...	2011-07-22 14:43:13 -07:00
Linus Torvalds	a7e1aabb28	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: lguest: Fix in/out emulation lguest: Fix translation count about wikipedia's cpuid page lguest: Fix three simple typos in comments lguest: update comments lguest: Simplify device initialization. lguest: don't rewrite vmcall instructions lguest: remove remaining vmcall lguest: use a special 1:1 linear pagetable mode until first switch. lguest: Do not exit on non-fatal errors	2011-07-22 13:45:50 -07:00
Linus Torvalds	111ad119d1	Merge branch 'stable/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pciback: Have 'passthrough' option instead of XEN_PCIDEV_BACKEND_PASS and XEN_PCIDEV_BACKEND_VPCI xen/pciback: Remove the DEBUG option. xen/pciback: Drop two backends, squash and cleanup some code. xen/pciback: Print out the MSI/MSI-X (PIRQ) values xen/pciback: Don't setup an fake IRQ handler for SR-IOV devices. xen: rename pciback module to xen-pciback. xen/pciback: Fine-grain the spinlocks and fix BUG: scheduling while atomic cases. xen/pciback: Allocate IRQ handler for device that is shared with guest. xen/pciback: Disable MSI/MSI-X when reseting a device xen/pciback: guest SR-IOV support for PV guest xen/pciback: Register the owner (domain) of the PCI device. xen/pciback: Cleanup the driver based on checkpatch warnings and errors. xen/pciback: xen pci backend driver. xen: tmem: self-ballooning and frontswap-selfshrinking xen: Add module alias to autoload backend drivers xen: Populate xenbus device attributes xen: Add __attribute__((format(printf... where appropriate xen: prepare tmem shim to handle frontswap xen: allow enable use of VGA console on dom0	2011-07-22 13:45:15 -07:00
Linus Torvalds	997271cf5e	Merge branch 'stable/pci.cleanups.v1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/pci.cleanups.v1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pci: Use 'acpi_gsi_to_irq' value unconditionally. xen/pci: Remove 'xen_allocate_pirq_gsi'. xen/pci: Retire unnecessary #ifdef CONFIG_ACPI xen/pci: Move the allocation of IRQs when there are no IOAPIC's to the end xen/pci: Squash pci_xen_initial_domain and xen_setup_pirqs together. xen/pci: Use the xen_register_pirq for HVM and initial domain users xen/pci: In xen_register_pirq bind the GSI to the IRQ after the hypercall. xen/pci: Provide #ifdef CONFIG_ACPI to easy code squashing. xen/pci: Update comments and fix empty spaces. xen/pci: Shuffle code around.	2011-07-22 13:44:53 -07:00
Linus Torvalds	896d01796d	Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen:pvhvm: Modpost section mismatch fix	2011-07-22 13:44:45 -07:00
Jonas Bonn	a4e05276a1	asm-generic: move archictures to common delay.h This patch moves the in-tree architectures that were using the 'generic' delay.h over to using the header file in asm-generic. This is not done using the generic-y mechanism as none of these arch's have started using that mechanism yet. This is a trivial change to make later when the arch begins using generic-y. Note the subtle change to the avr32 and SH architectures where the argument to __const_udelay was previously using the rounded down constant value instead of the rounded up value. Signed-off-by: Jonas Bonn <jonas@southpole.se> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>	2011-07-22 18:46:24 +02:00
Narendra_K@Dell.com	9b373ed18f	x86/PCI: Preserve existing pci=bfsort whitelist for Dell systems Commit `6e8af08dfa` enables pci=bfsort on future Dell systems. But the identification string 'Dell System' matches on already existing whitelist, which do not have SMBIOS type 0xB1, causing pci=bfsort not being set on existing whitelist. This patch fixes the regression by moving the type 0xB1 check beyond the existing whitelist so that existing whitelist is walked before. Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-07-22 08:44:28 -07:00
tip-bot for Sergei Shtylyov	a8c7ef3187	x86/PCI: quirks: Use pci_dev->revision This code uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so it wasn't converted by commit `44c10138fd` ("PCI: Change all drivers to use pci_device->revision") before being moved to arch/x86/... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/r/201107111901.39281.sshtylyov@ru.mvista.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-07-22 08:41:39 -07:00
Ralf Baechle	d5341942d7	PCI: Make the struct pci_dev * argument of pci_fixup_irqs const. Aside of the usual motivation for constification, this function has a history of being abused a hook for interrupt and other fixups so I turned this function const ages ago in the MIPS code but it should be done treewide. Due to function pointer passing in varous places a few other functions had to be constified as well. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> To: Anton Vorontsov <avorontsov@mvista.com> To: Chris Metcalf <cmetcalf@tilera.com> To: Colin Cross <ccross@android.com> Acked-by: "David S. Miller" <davem@davemloft.net> To: Eric Miao <eric.y.miao@gmail.com> To: Erik Gilling <konkers@android.com> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> To: "H. Peter Anvin" <hpa@zytor.com> To: Imre Kaloz <kaloz@openwrt.org> To: Ingo Molnar <mingo@redhat.com> To: Ivan Kokshaysky <ink@jurassic.park.msu.ru> To: Jesse Barnes <jbarnes@virtuousgeek.org> To: Krzysztof Halasa <khc@pm.waw.pl> To: Lennert Buytenhek <kernel@wantstofly.org> To: Matt Turner <mattst88@gmail.com> To: Nicolas Pitre <nico@fluxnic.net> To: Olof Johansson <olof@lixom.net> Acked-by: Paul Mundt <lethal@linux-sh.org> To: Richard Henderson <rth@twiddle.net> To: Russell King <linux@arm.linux.org.uk> To: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-alpha@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-pci@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-tegra@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: x86@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-07-22 08:26:06 -07:00
Jan Beulich	db34a363b9	x86/PCI: config space accessor functions should not ignore the segment argument Without this change, the majority of the raw PCI config space access functions silently ignore a non-zero segment argument, which is certainly wrong. Apart from pci_direct_conf1, all other non-MMCFG access methods get used only for non-extended accesses (i.e. assigned to raw_pci_ops only). Consequently, with the way raw_pci_{read,write}() work, it would be a coding error to call these functions with a non-zero segment (with the current call flow this cannot happen afaict). The access method 1 accessor, as it can be used for extended accesses (on AMD systems) instead gets checks added for the passed in segment to be zero. This would be the case when on such a system having multiple PCI segments (don't know whether any exist in practice) MMCFG for some reason is not usable, and method 1 gets selected for doing extended accesses. Rather than accessing the wrong device's config space, the function will now error out. v2: Convert BUG_ON() to WARN_ON(), and extend description as per Ingo's request. Signed-off-by: Jan Beulich <jbeulich@novell.com> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-07-22 08:25:41 -07:00
Bjorn Helgaas	43d786ed4d	x86/PCI: reduce severity of host bridge window conflict warnings Host bridge windows are top-level resources, so if we find a host bridge window conflict, it's probably with a hard-coded legacy reservation. Moving host bridge windows is theoretically possible, but we don't support it; we just ignore windows with conflicts, and it's not worth making this a user-visible error. Reported-and-tested-by: Jools Wills <jools@oxfordinspire.co.uk> References: https://bugzilla.kernel.org/show_bug.cgi?id=38522 Reported-by: Das <dasfox@gmail.com> References: https://bugzilla.kernel.org/show_bug.cgi?id=16497 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-07-22 08:25:39 -07:00
Shaohua Li	0aba496fc8	x86/PCI: select direct access mode for mmconfig option Direct access is needed in mmconf mode too. There are two reasons: 1. we need it to access first 256 bytes. We have bug before that using mmconf to access pci config space hangs system (when resizing BARs), so we use type1 config for legacy config space. 2. when doing mmconfg bar checking, we need access ACPI _CRS, which might access PCI config space. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-07-22 08:25:36 -07:00
Adrian Knoth	8d431f4160	lguest: Fix translation count about wikipedia's cpuid page The comment is outdated, wikipedia now has six translations of the cpuid page. Signed-off-by: Adrian Knoth <adi@drcomp.erfurt.thur.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-22 14:39:50 +09:30
Adrian Knoth	64be11583b	lguest: Fix three simple typos in comments This patch fixes three typos I've accidentally spotted. Signed-off-by: Adrian Knoth <adi@drcomp.erfurt.thur.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (one was already fixed)	2011-07-22 14:39:50 +09:30
Rusty Russell	9f54288def	lguest: update comments Also removes a long-unused #define and an extraneous semicolon. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-22 14:39:50 +09:30
Rusty Russell	7e1941444f	lguest: remove remaining vmcall We switch back from using vmcall in `091ebf07a2` because it was unreliable under kvm, but I missed one (rarely-used) place. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-22 14:39:49 +09:30
Rusty Russell	5dea1c88ed	lguest: use a special 1:1 linear pagetable mode until first switch. The Host used to create some page tables for the Guest to use at the top of Guest memory; it would then tell the Guest where this was. In particular, it created linear mappings for 0 and 0xC0000000 addresses because lguest used to switch to its real page tables quite late in boot. However, since `d50d8fe19` Linux initialized boot page tables in head_32.S even before the "are we lguest?" boot jump. So, now we can simplify things: the Host pagetable code assumes 1:1 linear mapping until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do before we reach C code. This also means that the Host doesn't need to know anything about the Guest's PAGE_OFFSET. (Non-Linux guests might not even have such a thing). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-22 14:39:48 +09:30
Andy Lutomirski	aafade242f	x86-64, vdso: Do not allocate memory for the vDSO We can map the vDSO straight from kernel data, saving a few page allocations. As an added bonus, the deleted code contained a memory leak. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/2c4ed5c2c2e93603790229e0c3403ae506ccc0cb.1311277573.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-07-21 13:41:53 -07:00
H. Peter Anvin	ae7bd11b47	clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option The machinery for __ARCH_HAS_CLOCKSOURCE_DATA assumed a file in asm-generic would be the default for architectures without their own file in asm/, but that is not how it works. Replace it with a Kconfig option instead. Link: http://lkml.kernel.org/r/4E288AA6.7090804@zytor.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Tony Luck <tony.luck@intel.com>	2011-07-21 13:34:05 -07:00
H. Peter Anvin	a536877e77	x86: Make Dell Latitude E6420 use reboot=pci Yet another variant of the Dell Latitude series which requires reboot=pci. From the E5420 bug report by Daniel J Blueman: > The E6420 is affected also (same platform, different casing and > features), which provides an external confirmation of the issue; I can > submit a patch for that later or include it if you prefer: > http://linux.koolsolutions.com/2009/08/04/howto-fix-linux-hangfreeze-during-reboots-and-restarts/ Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>	2011-07-21 11:47:17 -07:00
Daniel J Blueman	b7798d28ec	x86: Make Dell Latitude E5420 use reboot=pci Rebooting on the Dell E5420 often hangs with the keyboard or ACPI methods, but is reliable via the PCI method. [ hpa: this was deferred because we believed for a long time that the recent reshuffling of the boot priorities in commit `660e34cebf` fixed this platform. Unfortunately that turned out to be incorrect. ] Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Link: http://lkml.kernel.org/r/1305248699-2347-1-git-send-email-daniel.blueman@gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>	2011-07-21 11:45:49 -07:00
Robert Richter	1ac2e6ca44	x86, perf: Make copy_from_user_nmi() a library function copy_from_user_nmi() is used in oprofile and perf. Moving it to other library functions like copy_from_user(). As this is x86 code for 32 and 64 bits, create a new file usercopy.c for unified code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110607172413.GJ20052@erda.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 20:41:57 +02:00
Cyrill Gorcunov	f53173e47d	x86, perf: P4 PMU - Fix typos in comments and style cleanup This patch: - fixes typos in comments and clarifies the text - renames obscure p4_event_alias::original and ::alter members to ::original and ::alternative as appropriate - drops parenthesis from the return of p4_get_alias_event() No functional changes. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Link: http://lkml.kernel.org/r/20110721160625.GX7492@sun Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 20:41:54 +02:00
Phil Carmody	497888cf69	treewide: fix potentially dangerous trailing ';' in #defined values/expressions All these are instances of #define NAME value; or #define NAME(params_opt) value; These of course fail to build when used in contexts like if(foo $OP NAME) while(bar $OP NAME) and may silently generate the wrong code in contexts such as foo = NAME + 1; /* foo = value; + 1; / bar = NAME - 1; / bar = value; - 1; / baz = NAME & quux; / baz = value; & quux; */ Reported on comp.lang.c, Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com> Initial analysis of the dangers provided by Keith Thompson in that thread. There are many more instances of more complicated macros having unnecessary trailing semicolons, but this pile seems to be all of the cases of simple values suffering from the problem. (Thus things that are likely to be found in one of the contexts above, more complicated ones aren't.) Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-07-21 14:10:00 +02:00
Huang Ying	050438ed5a	kexec, x86: Fix incorrect jump back address if not preserving context In kexec jump support, jump back address passed to the kexeced kernel via function calling ABI, that is, the function call return address is the jump back entry. Furthermore, jump back entry == 0 should be used to signal that the jump back or preserve context is not enabled in the original kernel. But in the current implementation the stack position used for function call return address is not cleared context preservation is disabled. The patch fixes this bug. Reported-and-tested-by: Yin Kangkai <kangkai.yin@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1310607277-25029-1-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 11:19:28 +02:00
Alan Cox	43605ef188	x86, config: Introduce an INTEL_MID configuration We need to carve up the configuration between: - MID general - Moorestown specific - Medfield specific - Future devices As a base point create an INTEL_MID configuration property. We make the existing MRST configuration a sub-option. This means that the rest of the kernel config can still use X86_MRST checks without anything going backwards. After this is merged future patches will tidy up which devices are MID and which are X86_MRST, as well as add options for Medfield. Signed-off-by: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/r/20110712164859.7642.84136.stgit@bob.linux.org.uk Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 10:35:14 +02:00
Sergei Shtylyov	38175051f8	x86, quirks: Use pci_dev->revision This code uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so it wasn't converted by commit `44c10138fd` ("PCI: Change all drivers to use pci_device->revision") before being moved to arch/x86/... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/r/201107111901.39281.sshtylyov@ru.mvista.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 10:26:00 +02:00
Greg Dietsche	a6c23905ff	x86, smpboot: Mark the names[] array in __inquire_remote_apic() as const This array is read-only. Make it explicit by marking as const. Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu> Link: http://lkml.kernel.org/r/1309482653-23648-1-git-send-email-Gregory.Dietsche@cuw.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 10:04:51 +02:00
Jan Beulich	ef68c8f87e	x86: Serialize EFI time accesses on rtc_lock The EFI specification requires that callers of the time related runtime functions serialize with other CMOS accesses in the kernel, as the EFI time functions may choose to also use the legacy CMOS RTC. Besides fixing a latent bug, this is a prerequisite to safely enable the rtc-efi driver for x86, which ought to be preferred over rtc-cmos on all EFI platforms. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org> Cc: <mjg@redhat.com> Link: http://lkml.kernel.org/r/4E257E33020000780004E319@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Matthew Garrett <mjg@redhat.com>	2011-07-21 09:21:00 +02:00
Jan Beulich	ac619f4eba	x86: Serialize SMP bootup CMOS accesses on rtc_lock With CPU hotplug, there is a theoretical race between other CMOS (namely RTC) accesses and those done in the SMP secondary processor bringup path. I am unware of the problem having been noticed by anyone in practice, but it would very likely be rather spurious and very hard to reproduce. So to be on the safe side, acquire rtc_lock around those accesses. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/4E257AE7020000780004E2FF@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 09:20:59 +02:00
Jan Beulich	a750036f35	x86: Fix write lock scalability 64-bit issue With the write lock path simply subtracting RW_LOCK_BIAS there is, on large systems, the theoretical possibility of overflowing the 32-bit value that was used so far (namely if 128 or more CPUs manage to do the subtraction, but don't get to do the inverse addition in the failure path quickly enough). A first measure is to modify RW_LOCK_BIAS itself - with the new value chosen, it is good for up to 2048 CPUs each allowed to nest over 2048 times on the read path without causing an issue. Quite possibly it would even be sufficient to adjust the bias a little further, assuming that allowing for significantly less nesting would suffice. However, as the original value chosen allowed for even more nesting levels, to support more than 2048 CPUs (possible currently only for 64-bit kernels) the lock itself gets widened to 64 bits. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4E258E0D020000780004E3F0@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 09:03:36 +02:00
Jan Beulich	a738669464	x86: Unify rwsem assembly implementation Rather than having two functionally identical implementations for 32- and 64-bit configurations, use the previously extended assembly abstractions to fold the rwsem two implementations into a shared one. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4E258DF3020000780004E3ED@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 09:03:32 +02:00
Jan Beulich	4625cd6379	x86: Unify rwlock assembly implementation Rather than having two functionally identical implementations for 32- and 64-bit configurations, extend the existing assembly abstractions enough to fold the two rwlock implementations into a shared one. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4E258DD7020000780004E3EA@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 09:03:31 +02:00
Linus Torvalds	919d25a710	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86. reboot: Make Dell Latitude E6320 use reboot=pci x86, doc only: Correct real-mode kernel header offset for init_size x86: Disable AMD_NUMA for 32bit for now	2011-07-20 15:33:59 -07:00
Jeremy Fitzhardinge	2a6f6d0955	xen/multicall: move *idx fields to start of mc_buffer The CPU would prefer small offsets. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:46 -07:00
Jeremy Fitzhardinge	eac303bf2e	xen/multicall: special-case singleton hypercalls Singleton calls seem to end up being pretty common, so just directly call the hypercall rather than going via multicall. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:45 -07:00
Jeremy Fitzhardinge	4a7b005dbf	xen/multicalls: add unlikely around slowpath in __xen_mc_entry() Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:45 -07:00
Jeremy Fitzhardinge	ffc78767f2	xen/multicalls: disable MC_DEBUG It's useful - and probably should be a config - but its very heavyweight, especially with the tracing stuff to help sort out problems. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:28 -07:00
Jeremy Fitzhardinge	bc7fe1d977	xen/mmu: tune pgtable alloc/release Make sure the fastpath code is inlined. Batch the page permission change and the pin/unpin, and make sure that it can be batched with any adjacent set_pte/pmd/etc operations. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:28 -07:00
Jeremy Fitzhardinge	dcf7435cfe	xen/mmu: use extend_args for more mmuext updates Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:27 -07:00
Jeremy Fitzhardinge	c8eed1719a	xen/trace: add tlb flush tracepoints Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:27 -07:00
Jeremy Fitzhardinge	ab78f7ad2c	xen/trace: add segment desc tracing Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:27 -07:00
Jeremy Fitzhardinge	5f94fb5b8e	xen/trace: add xen_pgd_(un)pin tracepoints Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:27 -07:00
Jeremy Fitzhardinge	c2ba050d2e	xen/trace: add ptpage alloc/release tracepoints Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:27 -07:00
Jeremy Fitzhardinge	8470880791	xen/trace: add mmu tracepoints Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:27 -07:00
Jeremy Fitzhardinge	c796f213a6	xen/trace: add multicall tracing Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:26 -07:00
Jeremy Fitzhardinge	f04e2ee41d	xen/trace: set up tracepoint skeleton Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:04 -07:00
Jeremy Fitzhardinge	84cdee76b1	xen/multicalls: remove debugfs stats Remove debugfs stats to make way for tracing. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2011-07-18 15:43:04 -07:00
Borislav Petkov	8c400f6ce0	x86, vdso: Drop now wrong comment Now that `1b3f2a72bb` is in, it is very important that the below lying comment be removed! :-) Signed-off-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20110718191054.GA18359@liondog.tnic Acked-by: Andy Lutomirski <luto@mit.edu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-18 12:29:50 -07:00
Len Brown	17edf2d79f	x86, intel, power: Correct the MSR_IA32_ENERGY_PERF_BIAS message Fix the printk_once() so that it actually prints (didn't print before due to a stray comma.) [ hpa: changed to an incremental patch and adjusted the description accordingly. ] Signed-off-by: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107151732480.18606@x980 Cc: <table@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-15 15:13:55 -07:00
Oleg Nesterov	73d382decc	x86: Kill handle_signal()->set_fs() handle_signal()->set_fs() has a nice comment which explains what set_fs() is, but it doesn't explain why it is needed and why it depends on CONFIG_X86_64. Afaics, the history of this confusion is: 1. I guess today nobody can explain why it was needed in arch/i386/kernel/signal.c, perhaps it was always wrong. This predates 2.4.0 kernel. 2. then it was copy-and-past'ed to the new x86_64 arch. 3. then it was removed from i386 (but not from x86_64) by `b93b6ca3` "i386: remove unnecessary code". 4. then it was reintroduced under CONFIG_X86_64 when x86 unified i386 and x86_64, because the patch above didn't touch x86_64. Remove it. ->addr_limit should be correct. Even if it was possible that it is wrong, it is too late to fix it after setup_rt_frame(). Linus commented in: http://lkml.kernel.org/r/alpine.LFD.0.999.0707170902570.19166@woody.linux-foundation.org ... about the equivalent bit from i386: Heh. I think it's entirely historical. Please realize that the whole reason that function is called "set_fs()" is that it literally used to set the %fs segment register, not "->addr_limit". So I think the "set_fs(USER_DS)" is there _only_ to match the other regs->xds = __USER_DS; regs->xes = __USER_DS; regs->xss = __USER_DS; regs->xcs = __USER_CS; things, and never mattered. And now it matters even less, and has been copied to all other architectures where it is just totally insane. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710164424.GA20261@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:46:20 -07:00
Oleg Nesterov	9b42962074	x86, do_signal: Simplify the TS_RESTORE_SIGMASK logic 1. do_signal() looks at TS_RESTORE_SIGMASK and calculates the mask which should be stored in the signal frame, then it passes "oldset" to the callees, down to setup_rt_frame(). This is ugly, setup_rt_frame() can do this itself and nobody else needs this sigset_t. Move this code into setup_rt_frame. 2. do_signal() also clears TS_RESTORE_SIGMASK if handle_signal() succeeds. We can move this to setup_rt_frame() as well, this avoids the unnecessary checks and makes the logic more clear. 3. use set_current_blocked() instead of sigprocmask(SIG_SETMASK), sigprocmask() should be avoided. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710182203.GA27979@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:22:11 -07:00
Oleg Nesterov	3982294b03	x86, signals: Convert the X86_32 code to use set_current_blocked() sys_sigsuspend() and sys_sigreturn() change ->blocked directly. This is not correct, see the changelog in `e6fa16ab` "signal: sigprocmask() should do retarget_shared_pending()" Change them to use set_current_blocked(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710192727.GA31759@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:21:57 -07:00
Oleg Nesterov	905f29e2aa	x86, signals: Convert the IA32_EMULATION code to use set_current_blocked() sys32_sigsuspend() and sys32_*sigreturn() change ->blocked directly. This is not correct, see the changelog in `e6fa16ab` "signal: sigprocmask() should do retarget_shared_pending()" Change them to use set_current_blocked(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710192724.GA31755@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:21:31 -07:00
Andy Lutomirski	98d0ac38ca	x86-64: Move vread_tsc and vread_hpet into the vDSO The vsyscall page now consists entirely of trap instructions. Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/637648f303f2ef93af93bae25186e9a1bea093f5.1310639973.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 17:57:05 -07:00
H. Peter Anvin	4bb82178f5	x86, msr: Fix typo in ENERGY_PERF_BIAS_POWERSAVE Fix a trivial typo in the name of the constant ENERGY_PERF_BIAS_POWERSAVE. This didn't cause trouble because this constant is not currently used for anything. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/tip-abe48b108247e9b90b4c6739662a2e5c765ed114@git.kernel.org	2011-07-14 14:58:44 -07:00
Cyrill Gorcunov	f912987097	perf, x86: P4 PMU - Introduce event alias feature Instead of hw_nmi_watchdog_set_attr() weak function and appropriate x86_pmu::hw_watchdog_set_attr() call we introduce even alias mechanism which allow us to drop this routines completely and isolate quirks of Netburst architecture inside P4 PMU code only. The main idea remains the same though -- to allow nmi-watchdog and perf top run simultaneously. Note the aliasing mechanism applies to generic PERF_COUNT_HW_CPU_CYCLES event only because arbitrary event (say passed as RAW initially) might have some additional bits set inside ESCR register changing the behaviour of event and we can't guarantee anymore that alias event will give the same result. P.S. Thanks a huge to Don and Steven for for testing and early review. Acked-by: Don Zickus <dzickus@redhat.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Ingo Molnar <mingo@elte.hu> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Stephane Eranian <eranian@google.com> CC: Lin Ming <ming.m.lin@intel.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-14 17:25:04 -04:00
Len Brown	abe48b1082	x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS Since 2.6.36 (`23016bf0d2`), Linux prints the existence of "epb" in /proc/cpuinfo, Since 2.6.38 (`d5532ee7b4`), the x86_energy_perf_policy(8) utility has been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS. However, the typical BIOS fails to initialize the MSR, presumably because this is handled by high-volume shrink-wrap operating systems... Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8). As a result, WSM-EP, SNB, and later hardware from Intel will run in its default hardware power-on state (performance), which assumes that users care for performance at all costs and not for energy efficiency. While that is fine for performance benchmarks, the hardware's intended default operating point is "normal" mode... Initialize the MSR to the "normal" by default during kernel boot. x86_energy_perf_policy(8) is available to change the default after boot, should the user have a different preference. Signed-off-by: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980 Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>	2011-07-14 12:13:42 -07:00
David S. Miller	6a7ebdf2fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/bluetooth/l2cap_core.c	2011-07-14 07:56:40 -07:00
Glauber Costa	095c0aa83e	sched: adjust scheduler cpu power for stolen time This patch makes update_rq_clock() aware of steal time. The mechanism of operation is not different from irq_time, and follows the same principles. This lives in a CONFIG option itself, and can be compiled out independently of the rest of steal time reporting. The effect of disabling it is that the scheduler will still report steal time (that cannot be disabled), but won't use this information for cpu power adjustments. Everytime update_rq_clock_task() is invoked, we query information about how much time was stolen since last call, and feed it into sched_rt_avg_update(). Although steal time reporting in account_process_tick() keeps track of the last time we read the steal clock, in prev_steal_time, this patch do it independently using another field, prev_steal_time_rq. This is because otherwise, information about time accounted in update_process_tick() would never reach us in update_rq_clock(). Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:47 +03:00
Glauber Costa	3c404b578f	KVM guest: Add a pv_ops stub for steal time This patch adds a function pointer in one of the many paravirt_ops structs, to allow guests to register a steal time function. Besides a steal time function, we also declare two jump_labels. They will be used to allow the steal time code to be easily bypassed when not in use. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:44 +03:00
Glauber Costa	c9aaa8957f	KVM: Steal time implementation To implement steal time, we need the hypervisor to pass the guest information about how much time was spent running other processes outside the VM, while the vcpu had meaningful work to do - halt time does not count. This information is acquired through the run_delay field of delayacct/schedstats infrastructure, that counts time spent in a runqueue but not running. Steal time is a per-cpu information, so the traditional MSR-based infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the memory area address containing information about steal time This patch contains the hypervisor part of the steal time infrasructure, and can be backported independently of the guest portion. [avi, yongjie: export delayacct_on, to avoid build failures in some configs] Signed-off-by: Glauber Costa <glommer@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Rik van Riel <riel@redhat.com> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Yongjie Ren <yongjie.ren@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:14 +03:00
Andy Lutomirski	433bd805e5	clocksource: Replace vread with generic arch data The vread field was bloating struct clocksource everywhere except x86_64, and I want to change the way this works on x86_64, so let's split it out into per-arch data. Cc: x86@kernel.org Cc: Clemens Ladisch <clemens@ladisch.de> Cc: linux-ia64@vger.kernel.org Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/3ae5ec76a168eaaae63f08a2a1060b91aa0b7759.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:23:12 -07:00
Andy Lutomirski	7f79ad15f3	x86-64: Add --no-undefined to vDSO build This gives much nicer diagnostics when something goes wrong. It's supported at least as far back as binutils 2.15. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/de0b50920469ff6359c529526e7639fdd36fa83c.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:23:09 -07:00
Andy Lutomirski	1b3f2a72bb	x86-64: Allow alternative patching in the vDSO This code is short enough and different enough from the module loader that it's not worth trying to share anything. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/e73112e4381fff29e31b882c2d0856822edaea53.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:23:07 -07:00
Andy Lutomirski	59e97e4d6f	x86: Make alternative instruction pointers relative This save a few bytes on x86-64 and means that future patches can apply alternatives to unrelocated code. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/ff64a6b9a1a3860ca4a7b8b6dc7b4754f9491cd7.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:22:56 -07:00

... 3 4 5 6 7 ...

14099 Коммитов