The common arch/powerpc code calls in to functions in setup-bus.c
so some builds of ppc32 would fail.
Note, ppc32 usage of setup-irq.c is limited to arch/ppc and should be
removed when arch/ppc goes away.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The PCI bus should not be trying to declare its own attribute type.
Especially as this code could never ever be called because the driver
core overwrites the driver kobject type to be its own internal type.
Delete all of this code as it was never being used and is not correct.
Also update my copyright on the file while I'm touching things there.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Don't try to call the "raw" sysfs_create_file when we already have a
helper function to do this kind of work for us.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows an easier way to get to the device klist associated with a
struct bus_type (you have three to choose from...) This will make it
easier to move these fields to be dynamic in a future patch.
The only user of this is the PCI core which horribly abuses this
interface to rearrange the order of the pci devices. This should be
done using the existing bus device walking functions, but that's left
for future patches.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows an easier way to get to the kset associated with a struct
bus_type (you have three to choose from...) This will make it easier to
move these fields to be dynamic in a future patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
rpadlpar pci hotplug driver was doing some pretty bad stuff with the
sysfs files. This cleans up the logic to be sane and gets rid of the
gratuitous kset that is not needed for a simple directory like this.
Note, this patch is not even build tested, let alone run-time tested.
Someone with access to this hardware and can test would be greatly
appreciated.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: John Rose <johnrose@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@gmail.com>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This also renames pci_hotplug_slots_subsys to pcis_hotplug_slots_kset
catch all current users with a build error instead of a build warning
which can easily be missed.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't need a "default" ktype for a kset. We should set this
explicitly every time for each kset. This change is needed so that we
can make ksets dynamic, and cleans up one of the odd, undocumented
assumption that the kset/kobject/ktype model has.
This patch is based on a lot of help from Kay Sievers.
Nasty bug in the block code was found by Dave Young
<hidave.darkstar@gmail.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It is important that these resources be reserved
to avoid conflicts with well known ACPI registers.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
It appears that some PCI-E bridges do the wrong thing in the presense of
CRS Software Visibility and MMCONFIG. In particular, it looks like an
ATI bridge (device ID 7936) will return 0001 in the vendor ID field of
any bridged devices indefinitely.
Not enabling CRS SV avoids the problem, and as we currently do not
really make good use of the feature anyway (we just time out rather than
do any threaded discovery as suggested by the CRS specs), we're better
off just not enabling it.
This should fix a slew of problem reports with random devices (generally
graphics adapters or fairly high-performance networking cards, since it
only affected PCI-E) not getting properly recognized on these AMD systems.
If we really want to use CRS-SV, we may end up eventually needing a
whitelist of systems where this should be enabled, along with some kind
of "pcibios_enable_crs()" query to call the system-specific code.
Suggested-by: Loic Prylli <loic@myri.com>
Tested-by: Kai Ruhnau <kai@tragetaschen.dyndns.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The PCI code in 32 and 64 bits fixes up resources differently.
32 bits uses a header quirk plus handles bridges in pcibios_fixup_bus()
while 64 bits does things in various places depending on whether you
are using OF probing, using PCI hotplug, etc...
This merges those by basically using the 32 bits approach for both,
with various tweaks to make 64 bits work with the new approach.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Restore PCI expansion ROM P2P prefetch window creation.
This patch reverts previous "Avoid creating P2P prefetch
window for expansion ROMs" change due to regressions that
were spotted on some systems.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit fd6e732186, which
helped up things on MIPS, but was wrong for everything else. As Ralf
Baechle puts it:
"It seems the whole MIPS resource managment is complicated enough (out
of necessity) that only a few people actually grok it. Ioports being
actually memory mapped on MIPS only makes the confusion worse, sigh."
Requested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Alan Cox <alan@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There should be a pci_dev_put when breaking out of a loop that iterates
over calls to pci_get_device and similar functions.
This was fixed using the following semantic patch.
// <smpl>
@@
identifier d;
type T;
expression e;
iterator for_each_pci_dev;
@@
T *d;
...
for_each_pci_dev(d)
{... when != pci_dev_put(d)
when != e = d
(
return d;
|
+ pci_dev_put(d);
? return ...;
)
...}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The pcie protdrv status can be returned uninitialized,
if there are no children under a device. This leads to
bad responses downstream. Fix this.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The Coverity checker spotted that we'd have already oops'ed if "ctrl"
was NULL.
Additionally, "func" had just been checked for not being NULL.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that we have dealt with the real issue, in that some ATI SATA and
USB controllers needed the INTX_DISABLE quirk, we can remove these AMD
chipset global MSI disabling quirks.
This reverts three changesets:
4be8f90643 (PCI: disable MSI on RS690)
aea6a433f5 (PCI: disable MSI on RD580)
f122392f67 (PCI: disable MSI on RX790)
This is based upon testing and feedback from
Shane Huang <Shane.Huang@amd.com>.
Cc: Shane Huang <Shane.Huang@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A reasonably common problem with some devices is that they will
disable MSI generation when the INTX_DISABLE bit is set in the
PCI_COMMAND register.
Quirk this explicitly, guarding the pci_intx() calls in msi.c with
this quirk indication.
The first entries for this quirk are for 5714 and 5780 Tigon3 chips,
and thus we can remove the workaround code from the tg3.c driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is the fix for the following problem:
https://bugzilla.redhat.com/show_bug.cgi?id=227657
The bnx2 device 5706 complains about MSI not working behind a
ServerWorks HT1000 PCIX bridge. An earlier commit to fix the problem:
e3008dedff4bdc96a5f67224cd3d8d12237082a0:
"PCI: disable MSI by default on systems with Serverworks HT1000 chips"
was not entirely correct, and has been reverted.
MSI does not work on the PCIX bus because the BIOS did not set the
HT_MSI_FLAGS_ENABLE bit in the HyperTransport MSI capability on the
bridge. We use the existing quirk_msi_ht_cap() to detect the problem
and disable MSI in all buses behind it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Cc: Anantha Subramanyam <ananth@broadcom.com>
Cc: Naren Sankar <nsankar@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit e3008dedff.
The real bug was an INTX issue in the tg3 ethernet chip, and
cured by commit c129d962a66c76964954a98b38586ada82cf9381
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch renames the include file asm-x86/iommu.h to asm-x86/gart.h to make
clear to which IOMMU implementation it belongs. The patch also adds "GART" to
the Kconfig line.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- off by one in dmar_get_fault_reason() (maximal index in array is
ARRAY_SIZE()-1, not ARRAY_SIZE())
- NULL noise removal
- __iomem annotation fix
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Set bits 0, 4, 5 and 7 of PCI configuration register 0x40 in the
quirk. This has the following effects and is recommended by the
vendor.
* Force enable of IDE channels (used to be left alone as BIOS
configured)
* Change initial phase behavior of PIO cycle such that the host pulls
down the bus instead of tristating it. Vendor recommends this
setting.
The above settings are better for the current generation of
controllers and needed for the upcoming next generation.
Tested on JMB363.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Ethan Hsiao <ethanhsiao@jmicron.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
x86_64 defines ARCH_HAS_SG_CHAIN. So if IOMMU implementations don't
support sg chaining, we will get data corruption.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pci_dev's->sysdata is highly overloaded and currently IOMMU is broken due
to IOMMU code depending on this field.
This patch introduces new field in pci_dev's dev.archdata struct to hold
IOMMU specific per device IOMMU private data.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds PageSelectiveInvalidation support replacing existing
DomainSelectiveInvalidation for intel_{map/unmap}_sg() calls and also
enables to mapping one big contiguous DMA virtual address which is mapped
to discontiguous physical address for SG map/unmap calls.
"Doamin selective invalidations" wipes out the IOMMU address translation
cache based on domain ID where as "Page selective invalidations" wipes out
the IOMMU address translation cache for that address mask range which is
more cache friendly when compared to Domain selective invalidations.
Here is how it is done.
1) changes to iova.c
alloc_iova() now takes a bool size_aligned argument, which
when when set, returns the io virtual address that is
naturally aligned to 2 ^ x, where x is the order
of the size requested.
Returning this io vitual address which is naturally
aligned helps iommu to do the "page selective
invalidations" which is IOMMU cache friendly
over "domain selective invalidations".
2) Changes to driver/pci/intel-iommu.c
Clean up intel_{map/unmap}_{single/sg} () calls so that
s/g map/unamp calls is no more dependent on
intel_{map/unmap}_single()
intel_map_sg() now computes the total DMA virtual address
required and allocates the size aligned total DMA virtual address
and maps the discontiguous physical address to the allocated
contiguous DMA virtual address.
In the intel_unmap_sg() case since the DMA virtual address
is contiguous and size_aligned, PageSelectiveInvalidation
is used replacing earlier DomainSelectiveInvalidations.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Suresh B <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This config option (DMAR_FLPY_WA) sets up 1:1 mapping for the floppy device so
that the floppy device which does not use DMA api's will continue to work.
Once the floppy driver starts using DMA api's this config option can be turn
off or this patch can be yanked out of kernel at that time.
[akpm@linux-foundation.org: cleanups, rename things, build fix]
[jengelh@computergmbh.de: Kconfig fixes]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we fix all the opensource gfx drivers to use the DMA api's, at that time
we can yank this config options out.
[jengelh@computergmbh.de: Kconfig fixes]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MSI interrupt handler registrations and fault handling support for Intel-IOMMU
hadrware.
This patch enables the MSI interrupts for the DMA remapping units and in the
interrupt handler read the fault cause and outputs the same on to the console.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Intel IOMMU driver needs memory during DMA map calls to setup its internal
page tables and for other data structures. As we all know that these DMA map
calls are mostly called in the interrupt context or with the spinlock held by
the upper level drivers(network/storage drivers), so in order to avoid any
memory allocation failure due to low memory issues, this patch makes memory
allocation by temporarily setting PF_MEMALLOC flags for the current task
before making memory allocation calls.
We evaluated mempools as a backup when kmem_cache_alloc() fails
and found that mempools are really not useful here because
1) We don't know for sure how much to reserve in advance
2) And mempools are not useful for GFP_ATOMIC case (as we call
memory alloc functions with GFP_ATOMIC)
(akpm: point 2 is wrong...)
With PF_MEMALLOC flag set in the current->flags, the VM subsystem avoids any
watermark checks before allocating memory thus guarantee'ing the memory till
the last free page. Further, looking at the code in mm/page_alloc.c in
__alloc_pages() function, looks like this flag is useful only in the
non-interrupt context.
If we are in the interrupt context and memory allocation in IOMMU driver fails
for some reason, then the DMA map api's will return failure and it is up to
the higher level drivers to retry. Suppose, if upper level driver programs
the controller with the buggy DMA virtual address, the IOMMU will block that
DMA transaction when that happens thus preventing any corruption to main
memory.
So far in our test scenario, we were unable to create any memory allocation
failure inside dma map api calls.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Actual intel IOMMU driver. Hardware spec can be found at:
http://www.intel.com/technology/virtualization
This driver sets X86_64 'dma_ops', so hook into standard DMA APIs. In this
way, PCI driver will get virtual DMA address. This change is transparent to
PCI drivers.
[akpm@linux-foundation.org: remove unneeded cast]
[akpm@linux-foundation.org: build fix]
[bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This code implements a generic IOVA allocation and management. As per Dave's
suggestion we are now allocating IO virtual address from Higher DMA limit
address rather than lower end address and this eliminated the need to preserve
the IO virtual address for multiple devices sharing the same domain virtual
address.
Also this code uses red black trees to store the allocated and reserved iova
nodes. This showed a good performance improvements over previous linear
linked list.
[akpm@linux-foundation.org: remove inlines]
[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When devices are under a p2p bridge, upstream transactions get replaced by the
device id of the bridge as it owns the PCIE transaction. Hence its necessary
to setup translations on behalf of the bridge as well. Due to this limitation
all devices under a p2p share the same domain in a DMAR.
We just cache the type of device, if its a native PCIe device
or not for later use.
[akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch supports the upcomming Intel IOMMU hardware a.k.a. Intel(R)
Virtualization Technology for Directed I/O Architecture and the hardware spec
for the same can be found here
http://www.intel.com/technology/virtualization/index.htm
FAQ! (questions from akpm, answers from ak)
> So... what's all this code for?
>
> I assume that the intent here is to speed things up under Xen, etc?
Yes in some cases, but not this code. That would be the Xen version of this
code that could potentially assign whole devices to guests. I expect this to
be only useful in some special cases though because most hardware is not
virtualizable and you typically want an own instance for each guest.
Ok at some point KVM might implement this too; i likely would use this code
for this.
> Do we
> have any benchmark results to help us to decide whether a merge would be
> justified?
The main advantage for doing it in the normal kernel is not performance, but
more safety. Broken devices won't be able to corrupt memory by doing random
DMA.
Unfortunately that doesn't work for graphics yet, for that need user space
interfaces for the X server are needed.
There are some potential performance benefits too:
- When you have a device that cannot address the complete address range an
IOMMU can remap its memory instead of bounce buffering. Remapping is likely
cheaper than copying.
- The IOMMU can merge sg lists into a single virtual block. This could
potentially speed up SG IO when the device is slow walking SG lists. [I
long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but
it probably depends a lot on the HBA]
And you get better driver debugging because unexpected memory accesses from
the devices will cause a trappable event.
>
> Does it slow anything down?
It adds more overhead to each IO so yes.
This patch:
Add support for early detection and parsing of DMAR's (DMA Remapping) reported
to OS via ACPI tables.
DMA remapping(DMAR) devices support enables independent address translations
for Direct Memory Access(DMA) from Devices. These DMA remapping devices are
reported via ACPI tables and includes pci device scope covered by these DMA
remapping device.
For detailed info on the specification of "Intel(R) Virtualization Technology
for Directed I/O Architecture" please see
http://www.intel.com/technology/virtualization/index.htm
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (37 commits)
PCI: merge almost all of pci_32.h and pci_64.h together
PCI: X86: Introduce and enable PCI domain support
PCI: Add 'nodomains' boot option, and pci_domains_supported global
PCI: modify PCI bridge control ISA flag for clarity
PCI: use _CRS for PCI resource allocation
PCI: avoid P2P prefetch window for expansion ROMs
PCI: skip ISA ioresource alignment on some systems
PCI: remove transparent bridge sizing
pci: write file size to inode on proc bus file write
pci: use size stored in proc_dir_entry for proc bus files
pci: implement "pci=noaer"
PCI: fix IDE legacy mode resources
MSI: Use correct data offset for 32-bit MSI in read_msi_msg()
PCI: Fix incorrect argument order to list_add_tail() in PCI dynamic ID code
PCI: i386: Compaq EVO N800c needs PCI bus renumbering
PCI: Remove no longer correct documentation regarding MSI vector assignment
PCI: re-enable onboard sound on "MSI K8T Neo2-FIR"
PCI: quirk_vt82c586_acpi: Omit reading PCI revision ID
PCI: quirk amd_8131_mmrbc: Omit reading pci revision ID
cpqphp: Use PCI_CLASS_REVISION instead of PCI_REVISION_ID for read
...