If we use a large mapping, the expectation is that only unmaps from
the first pte in the superpage are supported. Unmaps from offsets
into the superpage should fail (ie. return zero sized unmap). In the
current code, unmapping from an offset clears the size of the full
mapping starting from an offset. For instance, if we map a 16k
physically contiguous range at IOVA 0x0 with a large page, then
attempt to unmap 4k at offset 12k, 4 ptes are cleared (12k - 28k) and
the unmap returns 16k unmapped. This potentially incorrectly clears
valid mappings and confuses drivers like VFIO that use the unmap size
to release pinned pages.
Fix by refusing to unmap from offsets into the page.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joro@8bytes.org>
The IOMMU pagetables can have up to 6 levels, but the code
in free_pagetable() only releases the first 3 levels. Fix
this leak by releasing all levels.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
If a device is multifunction and does not have ACS enabled then we
assume that the entire package lacks ACS and use function 0 as the
base of the group. The PCIe spec however states that components are
permitted to implement ACS on some, none, or all of their applicable
functions. It's therefore conceivable that function 0 may be fully
independent and support ACS while other functions do not. Instead
use the lowest function of the slot that does not have ACS enabled
as the base of the group. This may be the current device, which is
intentional. So long as we use a consistent algorithm, all the
non-ACS functions will be grouped together and ACS functions will
get separate groups.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
The updates are mostly about the x86 IOMMUs this time. Exceptions are
the groundwork for the PAMU IOMMU from Freescale (for a PPC platform)
and an extension to the IOMMU group interface. On the x86 side this
includes a workaround for VT-d to disable interrupt remapping on broken
chipsets. On the AMD-Vi side the most important new feature is a kernel
command-line interface to override broken information in IVRS ACPI
tables and get interrupt remapping working this way. Besides that there
are small fixes all over the place.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRh2vAAAoJECvwRC2XARrjbjkP/jzKzeffybUpQsIJF8rs/IEt
hSwqpGLr6WR5FdneEH9fiBIp4pyMDXmuAb/2ZNgB+DgPN3xgqmWVo4WLk7pMo3BS
/xIz/lu7hIX3AtKt807pL9+rPdhGYEJ43Vmr4bW9x0l1kuNXy6fmMLcN5FaPKjV4
p4hY4jOstEgtYQw4wi39/9b4FsYoipZizkOUSdtCzWwTv7jOHH7/Wra8iZyzL6Je
1VlF/efp0ytTcwLdHOfGwPCIlZrQRtQCM4SqdAUG9bOL3ARR9Yu/0iW1295nbLzo
CQX5CfKePvo/fGxki1jcBi+UCyxYKPosB5kCxmh4MAxCg/VzzMsaME/A73tLJa6W
Y29bbjwPoBPMq03HX8S9R5QWY8HpFujUUp+J4TXcKuTgYEV28WfLu1uaeKD716nM
LoXUojov7Cj8ZQZnhyu5l+XNaephBZLfw/8bM6bAxhlKXwAjmLiS5Z+srPl1GJee
5GCV+L94JifHLZaREWh3JFsh9O3W7Wno2++c4JU32uCWJHXH7tMgs2P8n5AY9rnT
Km1a9y6w2MF3Gg9j4y6u75m0XnFTNzYjeJMUtqVlwVhNHhgaXfuIWY63xOQCLJs1
ThTHOjoh0VqONGobR/ywn+0ouo9X07DnWpluyFd+zY3XK0UE0NOu9XMr4i6TWxOf
mlzWoEKxtw36XGHB/FtQ
=MVc/
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"The updates are mostly about the x86 IOMMUs this time.
Exceptions are the groundwork for the PAMU IOMMU from Freescale (for a
PPC platform) and an extension to the IOMMU group interface.
On the x86 side this includes a workaround for VT-d to disable
interrupt remapping on broken chipsets. On the AMD-Vi side the most
important new feature is a kernel command-line interface to override
broken information in IVRS ACPI tables and get interrupt remapping
working this way.
Besides that there are small fixes all over the place."
* tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (24 commits)
iommu/tegra: Fix printk formats for dma_addr_t
iommu: Add a function to find an iommu group by id
iommu/vt-d: Remove warning for HPET scope type
iommu: Move swap_pci_ref function to drivers/iommu/pci.h.
iommu/vt-d: Disable translation if already enabled
iommu/amd: fix error return code in early_amd_iommu_init()
iommu/AMD: Per-thread IOMMU Interrupt Handling
iommu: Include linux/err.h
iommu/amd: Workaround for ERBT1312
iommu/amd: Document ivrs_ioapic and ivrs_hpet parameters
iommu/amd: Don't report firmware bugs with cmd-line ivrs overrides
iommu/amd: Add ioapic and hpet ivrs override
iommu/amd: Add early maps for ioapic and hpet
iommu/amd: Extend IVRS special device data structure
iommu/amd: Move add_special_device() to __init
iommu: Fix compile warnings with forward declarations
iommu/amd: Properly initialize irq-table lock
iommu/amd: Use AMD specific data structure for irq remapping
iommu/amd: Remove map_sg_no_iommu()
iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets
...
The swap_pci_ref function is used by the IOMMU API code for
swapping pci device pointers, while determining the iommu
group for the device.
Currently this function was being implemented for different
IOMMU drivers. This patch moves the function to a new file,
drivers/iommu/pci.h so that the implementation can be
shared across various IOMMU drivers.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
In the current interrupt handling scheme, there are as many threads as
the number of IOMMUs. Each thread is created and assigned to an IOMMU at
the time of registering interrupt handlers (request_threaded_irq).
When an IOMMU HW generates an interrupt, the irq handler (top half) wakes up
the corresponding thread to process event and PPR logs of all IOMMUs
starting from the 1st IOMMU.
In the system with multiple IOMMU,this handling scheme complicates the
synchronization of the IOMMU data structures and status registers as
there could be multiple threads competing for the same IOMMU while
the other IOMMU could be left unhandled.
To simplify, this patch is proposing a different interrupt handling scheme
by having each thread only managing interrupts of the corresponding IOMMU.
This can be achieved by passing the struct amd_iommu when registering the
interrupt handlers. This structure is unique for each IOMMU and can be used
by the bottom half thread to identify the IOMMU to be handled instead
of calling for_each_iommu. Besides this also eliminate the needs to lock
the IOMMU for processing event and PPR logs.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Work around an IOMMU hardware bug where clearing the
EVT_INT or PPR_INT bit in the status register may race with
the hardware trying to set it again. When not handled the
bit might not be cleared and we lose all future event or ppr
interrupts.
Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joro@8bytes.org>
For compatibility reasons the irq remapping code for the AMD
IOMMU used the same per-irq data structure as the Intel
implementation. Now that support for the AMD specific data
structure is upstream we can use this one instead.
Reviewed-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This function was intended as a fall-back if the map_sg
function is called for a device not mapped by the IOMMU.
Since the AMD IOMMU driver uses per-device dma_ops this can
never happen. So this function isn't needed anymore.
Reviewed-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This is required in case of PAMU, as it can support a window size of up
to 64G (even on 32bit).
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Current driver does not clear the IOMMU event log interrupt bit
in the IOMMU status register after processing an interrupt.
This causes the IOMMU hardware to generate event log interrupt only once.
This has been observed in both IOMMU v1 and V2 hardware.
This patch clears the bit by writing 1 to bit 1 of the IOMMU
status register (MMIO Offset 2020h)
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
There is a bug introduced with commit 27c2127 that causes
devices which are hot unplugged and then hot-replugged to
not have per-device dma_ops set. This causes these devices
to not function correctly. Fixed with this patch.
Cc: stable@vger.kernel.org
Reported-by: Andreas Degert <andreas.degert@googlemail.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Change to remove local PCI_BUS() define and use the new PCI_BUS_NUM()
interface from PCI.
Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Besides some fixes and cleanups in the code there are three more
important changes to point out this time:
* New IOMMU driver for the ARM SHMOBILE platform
* An IOMMU-API extension for non-paging IOMMUs (required for
upcoming PAMU driver)
* Rework of the way the Tegra IOMMU driver accesses its
registetrs - register windows are easier to extend now.
There are also a few changes to non-iommu code, but that is acked by the
respective maintainers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRK0gMAAoJECvwRC2XARrjHAwQANIJjqgZECxqx/MuAfmvkvA0
gRvlqBh/LWQhm/PlkpvqTMq7YY9kH1sxk+UD32oJok3XnScQWfcrJNmpijLo9/9Z
XyMTXQrGX0X+LWAXLIBXrlbV37mztHFEVxYrO+jiEGKP8+153sguPvmu0y6wC2AZ
RhsrVftDE7OIqdTGo8+ORCKOg7ZXNJ04hER4vW8I+0LLP1m6nnHXSKZ4E6Vmtc9K
YgfcwwsduYOkboMK5S0XLl58Xqiq53iXw3R+wSFIsFVVQ9Zp5yZzUGphvSQvDOBc
fX01M+Ouu+bT5U2DlDmYCnL3K14Mr7TqlH78Loq3w6yHRm1fxQoiF5vm98ZAmFde
nU6WCJNks0z+hIlkdIlrLgvBd8nWubGOtU3EfhzseawF1WexIusTqO4Fkp+rNJk0
wZ8h2ATUCch17BE8O794lCQuOwHQ6q7JcQmVz2GPJ83GEvQW1svKzzPIPBm0yLW3
hCS9T9O+Bic0Bx+L7QXu5D1aRxJskJUPnINVirfSUXb0vVLb/U9jGNgITf2A9XCl
p5z0i4RriDwCzg9917U4ZvjYbf3rjdMRwJ5TAxNqRrooMbGvOTZCJzIjujv82Adp
BDm8HZx3FZP/8S5hfE5Ahr4gaNle8jnO53G6jKkjDuSG6DP+XMEj82oSJ/M+Rnld
nCvEUi0bXhwHOOfdmgNU
=G4Ot
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU Updates from Joerg Roedel:
"Besides some fixes and cleanups in the code there are three more
important changes to point out this time:
* New IOMMU driver for the ARM SHMOBILE platform
* An IOMMU-API extension for non-paging IOMMUs (required for
upcoming PAMU driver)
* Rework of the way the Tegra IOMMU driver accesses its
registetrs - register windows are easier to extend now.
There are also a few changes to non-iommu code, but that is acked by
the respective maintainers."
* tag 'iommu-updates-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (23 commits)
iommu/tegra: assume CONFIG_OF in SMMU driver
iommu/tegra: assume CONFIG_OF in gart driver
iommu/amd: Remove redundant NULL check before dma_ops_domain_free().
iommu/amd: Initialize device table after dma_ops
iommu/vt-d: Zero out allocated memory in dmar_enable_qi
iommu/tegra: smmu: Fix incorrect mask for regbase
iommu/exynos: Make exynos_sysmmu_disable static
ARM: mach-shmobile: r8a7740: Add IPMMU device
ARM: mach-shmobile: sh73a0: Add IPMMU device
ARM: mach-shmobile: sh7372: Add IPMMU device
iommu/shmobile: Add iommu driver for Renesas IPMMU modules
iommu: Add DOMAIN_ATTR_WINDOWS domain attribute
iommu: Add domain window handling functions
iommu: Implement DOMAIN_ATTR_PAGING attribute
iommu: Check for valid pgsize_bitmap in iommu_map/unmap
iommu: Make sure DOMAIN_ATTR_MAX is really the maximum
iommu/tegra: smmu: Change SMMU's dependency on ARCH_TEGRA
iommu/tegra: smmu: Use helper function to check for valid register offset
iommu/tegra: smmu: Support variable MMIO ranges/blocks
iommu/tegra: Add missing spinlock initialization
...
dma_ops_domain_free on a NULL pointer is a no-op, so the NULL check in
amd_iommu_init_dma_ops() can be removed.
Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Move all the code to either to the header file
asm/irq_remapping.h or to drivers/iommu/.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The AMD IOMMU driver only uses the page-sizes it gets from
IOMMU core and uses the appropriate page-size. So this
comment is not necessary.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
There is a bug in the hardware that will be triggered when
this page size is used. Make sure this does not happen.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
An alias doesn't always point to a physical device. When this
happens we must first verify that the IOMMU group isn't rooted in
a device above the alias. In this case the alias is effectively
just another quirk for the devices aliased to it. Alternatively,
the virtual alias itself may be the root of the IOMMU group. To
support this, allow a group to be hosted on the alias dev_data
for use by anything that might have the same alias.
Signed-off-by: Alex williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add a WARN_ON to make it clear why we don't add dma_pdev->dev to the
group we're allocating.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This needs to be broken apart, start with pulling all the IOMMU
group init code into a new function.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
We should return NULL on error instead of the freed pointer.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Report the availability of irq remapping through the
IOMMU-API to allow KVM device passthrough again without
additional module parameter overrides.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add the six routines required to setup interrupt remapping
with the AMD IOMMU. Also put it all together into the AMD
specific irq_remap_ops.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add the routine to setup interrupt remapping for ioapic
interrupts. Also add a routine to change the affinity of an
irq and to free an irq allocation for interrupt remapping.
The last two functions will also be used for MSI interrupts.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add routines to:
* Alloc remapping tables and single entries from these
tables
* Change entries in the tables
* Free entries in the table
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add routine to invalidate the IOMMU cache for interupt
translations. Also include the IRTE caches when flushing all
IOMMU caches.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The irq remapping tables for the AMD IOMMU need to be
aligned on a 128 byte boundary. Create a seperate slab-cache
to guarantee this alignment.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The IVRS ACPI table provides information about the IOAPICs
and the HPETs available in the system and which PCI device
ID they use in transactions. Save that information for later
usage in interrupt remapping.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The new IOMMU groups code in the AMD IOMMU driver makes the
assumption that there is a pci_dev struct available for all
device-ids listed in the IVRS ACPI table. Unfortunatly this
assumption is not true and so this code causes a NULL
pointer dereference at boot on some systems.
Fix it by making sure the given pointer is never NULL when
passed to the group specific code. The real fix is larger
and will be queued for v3.7.
Reported-by: Florian Dazinger <florian@dazinger.net>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Fix some typos in comments and user-visible messages. No
functional changes.
Signed-off-by: Frank Arnold <frank.arnold@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
SR-IOV can create buses without a bridge. There may be other cases
where this happens as well. In these cases skip to the parent bus
and continue testing devices there.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This did not work because devices are not put into the
pt_domain. Fix this.
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
A few sparse warnings fire in drivers/iommu/amd_iommu_init.c.
Fix most of them with this patch. Also fix the sparse
warnings in drivers/iommu/irq_remapping.c while at it.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
write_file_bool() modifies 32 bits of data, so "amd_iommu_unmap_flush"
needs to be 32 bits as well or we'll corrupt memory. Fortunately it
looks like the data is aligned with a gap after the declaration so this
is harmless in production.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>