This struct contains all necessary information for the
function already. Also handle the info->dev == NULL case
while at it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The code in the locked section does not touch anything
protected by the dmar_global_lock. Remove it from there.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When this lock is held the device_domain_lock is also
required to make sure the device_domain_info does not vanish
while in use. So this lock can be removed as it gives no
additional protection.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move the code to attach/detach domains to iommus and vice
verce into a single function to make sure there are no
dangling references.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This makes domain attachment more synchronous with domain
deattachment. The domain<->iommu link is released in
dmar_remove_one_dev_info.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename this function and the ones further down its
call-chain to domain_context_clear_*. In particular this
means:
iommu_detach_dependent_devices -> domain_context_clear
iommu_detach_dev_cb -> domain_context_clear_one_cb
iommu_detach_dev -> domain_context_clear_one
These names match a lot better with its
domain_context_mapping counterparts.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename the function to dmar_remove_one_dev_info to match is
name better with its dmar_insert_one_dev_info counterpart.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename this function to dmar_insert_one_dev_info() to match
the name better with its counter part function
domain_remove_one_dev_info().
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Do the context-mapping of devices from a single place in the
call-path and clean up the other call-sites.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Just call domain_remove_one_dev_info() for all devices in
the domain instead of reimplementing the functionality.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We don't need to do an expensive search for domain-ids
anymore, as we keep track of per-iommu domain-ids.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This replaces the dmar_domain->iommu_bmp with a similar
reference count array. This allows us to keep track of how
many devices behind each iommu are attached to the domain.
This is necessary for further simplifications and
optimizations to the iommu<->domain attachment code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This field is now obsolete because all places use the
per-iommu domain-ids. Kill the remaining uses of this field
and remove it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is no reason for this special handling of the
si_domain. The per-iommu domain-id can be allocated
on-demand like for any other domain. So remove the
pre-allocation code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This function can figure out the domain-id to use itself
from the iommu_did array. This is more reliable over
different domain types and brings us one step further to
remove the domain->id field.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Get rid of the special cases for VM domains vs. non-VM
domains and simplify the code further to just handle the
hardware passthrough vs. page-table case.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is no reason to pass the translation type through
multiple layers. It can also be determined in the
domain_context_mapping_one function directly.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The special case for VM domains is not needed, as other
domains could be attached to the iommu in the same way. So
get rid of this special case.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This array is indexed by the domain-id and contains the
pointers to the domains attached to this iommu. Modern
systems support 65536 domain ids, so that this array has a
size of 512kb, per iommu.
This is a huge waste of space, as the array is usually
sparsely populated. This patch makes the array
two-dimensional and allocates the memory for the domain
pointers on-demand.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Instead of searching in the domain array for already
allocated domain ids, keep track of them explicitly.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Debugging domain ID leakage typically requires long running tests in
order to exhaust the domain ID space or kernel instrumentation to
track the setting and clearing of bits. A couple trivial intel-iommu
specific sysfs extensions make it much easier to expose the IOMMU
capabilities and current usage.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This makes sure it won't be possible to accidentally leak format
strings into iommu device names. Current name allocations are safe,
but this makes the "%s" explicit.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This continues the attempt to fix commit fb170fb4c5 ("iommu/vt-d:
Introduce helper functions to make code symmetric for readability").
The previous attempt in commit 7168440690 ("iommu/vt-d: Detach
domain *only* from attached iommus") overlooked the fact that
dmar_domain.iommu_bmp gets cleared for VM domains when devices are
detached:
intel_iommu_detach_device
domain_remove_one_dev_info
domain_detach_iommu
The domain is detached from the iommu, but the iommu is still attached
to the domain, for whatever reason. Thus when we get to domain_exit(),
we can't rely on iommu_bmp for VM domains to find the active iommus,
we must check them all. Without that, the corresponding bit in
intel_iommu.domain_ids doesn't get cleared and repeated VM domain
creation and destruction will run out of domain IDs. Meanwhile we
still can't call iommu_detach_domain() on arbitrary non-VM domains or
we risk clearing in-use domain IDs, as 7168440690 attempted to
address.
It's tempting to modify iommu_detach_domain() to test the domain
iommu_bmp, but the call ordering from domain_remove_one_dev_info()
prevents it being able to work as fb170fb4c5 seems to have intended.
Caching of unused VM domains on the iommu object seems to be the root
of the problem, but this code is far too fragile for that kind of
rework to be proposed for stable, so we simply revert this chunk to
its state prior to fb170fb4c5.
Fixes: fb170fb4c5 ("iommu/vt-d: Introduce helper functions to make
code symmetric for readability")
Fixes: 7168440690 ("iommu/vt-d: Detach domain *only* from attached
iommus")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: stable@vger.kernel.org # v3.17+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Do not touch the TE bit unless we know translation is
disabled.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
For all the copy-translation code to run, we have to keep
translation enabled in intel_iommu_init(). So remove the
code disabling it.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We can't change the RTT bit when translation is enabled, so
don't copy translation tables when we would change the bit
with our new root entry.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we copied over context tables from an old kernel, we
need to defer assignment of devices to domains until the
device driver takes over. So skip this part of
initialization when we copied over translation tables from
the old kernel.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This seperates the allocation of the si_domain from its
assignment to devices. It makes sure that the iommu=pt case
still works in the kdump kernel, when we have to defer the
assignment of devices to domains to device driver
initialization time.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Mark the context entries we copied over from the old kernel,
so that we don't detect them as present in other code paths.
This makes sure we safely overwrite old context entries when
a new domain is assigned.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Mark all domain-ids we find as reserved, so that there could
be no collision between domains from the previous kernel and
our domains in the IOMMU TLB.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If we are in a kdump kernel and find translation enabled in
the iommu, try to copy the translation tables from the old
kernel to preserve the mappings until the device driver
takes over.
This supports old and the extended root-entry and
context-table formats.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add code to detect whether translation is already enabled in
the IOMMU. Save this state in a flags field added to
struct intel_iommu.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In case there was an old root entry, make our new one
visible immediately after it was allocated.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
QI needs to be available when we write the root entry into
hardware because flushes might be necessary after this.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Give them a common prefix that can be grepped for and
improve the wording here and there.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Although the extended tables are theoretically a completely orthogonal
feature to PASID and anything else that *uses* the newly-available bits,
some of the early hardware has problems even when all we do is enable
them and use only the same bits that were in the old context tables.
For now, there's no motivation to support extended tables unless we're
going to use PASID support to do SVM. So just don't use them unless
PASID support is advertised too. Also add a command-line bailout just in
case later chips also have issues.
The equivalent problem for PASID support has already been fixed with the
upcoming VT-d spec update and commit bd00c606a ("iommu/vt-d: Change
PASID support to bit 40 of Extended Capability Register"), because the
problematic platforms use the old definition of the PASID-capable bit,
which is now marked as reserved and meaningless.
So with this change, we'll magically start using ECS again only when we
see the new hardware advertising "hey, we have PASID support and we
actually tested it this time" on bit 40.
The VT-d hardware architect has promised that we are not going to have
any reason to support ECS *without* PASID any time soon, and he'll make
sure he checks with us before changing that.
In the future, if hypothetical new features also use new bits in the
context tables and can be seen on implementations *without* PASID support,
we might need to add their feature bits to the ecs_enabled() macro.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When we use 'intel_iommu=igfx_off' to disable translation for the
graphics, and when we discover that the BIOS has misconfigured the DMAR
setup for I/OAT, we use a special DUMMY_DEVICE_DOMAIN_INFO value in
dev->archdata.iommu to indicate that translation is disabled.
With passthrough mode, we were attempting to dereference that as a
normal pointer to a struct device_domain_info when setting up an
identity mapping for the affected device.
This fixes the problem by making device_to_iommu() explicitly check for
the special value and indicate that no IOMMU was found to handle the
devices in question.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org (which means you can pick up 18436afdc now too)
Pull intel iommu updates from David Woodhouse:
"This lays a little of the groundwork for upcoming Shared Virtual
Memory support — fixing some bogus #defines for capability bits and
adding the new ones, and starting to use the new wider page tables
where we can, in anticipation of actually filling in the new fields
therein.
It also allows graphics devices to be assigned to VM guests again.
This got broken in 3.17 by disallowing assignment of RMRR-afflicted
devices. Like USB, we do understand why there's an RMRR for graphics
devices — and unlike USB, it's actually sane. So we can make an
exception for graphics devices, just as we do USB controllers.
Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to
persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty
hack to allow broken BIOSes to forbid us from using X2APIC when they
do stupid and invasive things and would break if we did.
Someone noticed that since Windows doesn't have full IOMMU support for
DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid
initialising the IOMMU on the graphics unit altogether.
This means that it would be available for use in "driver mode", where
the IOMMU registers are made available through a BAR of the graphics
device and the graphics driver can do SVM all for itself.
So they started setting the X2APIC_OPT_OUT bit on *all* platforms with
SVM capabilities. And even the platforms which *might*, if the
planets had been aligned correctly, possibly have had SVM capability
but which in practice actually don't"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: support extended root and context entries
iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification
iommu/vt-d: Allow RMRR on graphics devices too
iommu/vt-d: Print x2apic opt out info instead of printing a warning
iommu/vt-d: kill bogus ecap_niotlb_iunits()
Not much this time, but the changes include:
* Moving domain allocation into the iommu drivers to prepare for
the introduction of default domains for devices
* Fixing the IO page-table code in the AMD IOMMU driver to
correctly encode large page sizes
* Extension of the PCI support in the ARM-SMMU driver
* Various fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVNFIPAAoJECvwRC2XARrj4v8QAMVsPJ+kmnLvqGDkO9v2i9z6
sFX27h55HhK3Pgb5aEmEhvZd0Eec22KtuADr92LsRSjskgA4FgrzzSlo8w7+MbwM
dtowij+5Bzx/jEeexM5gog0ZA9Brl725KSYBmwJIAroKAtl3YXsIA4TO7X/JtXJm
0qWbCxLs9CX5uWyJawkeDl8UAaZYb8AHKv1UhJt8Z5yajM/qITMULi51g2Bgh8kx
YaRHeZNj+mFQqb6IlNkmOhILN+dbTdxQREp+aJs1alGdkBGlJyfo6eK4weNOpA4x
gc8EXUWZzj1GEPyWMpA/ZMzPzCbj9M6wTeXqRiTq31AMV10zcy545uYcLWks680M
CYvWTmjeCvwsbuaj9cn+efa47foH2UoeXxBmXWOJDv4WxcjE1ejmlmSd8WYfwkh9
hIkMzD8tW2iZf3ssnjCeQLa7f6ydL2P4cpnK2JH+N7hN9VOASAlciezroFxtCjU+
18T7ozgUTbOXZZomBX7OcGQ8ElXMiHB/uaCyNO64yVzApsUnQfpHzcRI5OavOYn5
dznjrzvNLCwHs3QFI4R7rsmIfPkOM0g5nY5drGwJ23+F+rVpLmpWVPR5hqT7a1HM
tJVmzces6HzOu7P1Mo0IwvNbZEmNBGTHYjGtWs6e79MQxdriFT4I+DwvFOy7GUq/
Is2b+HPwWhiWJHQXLTT2
=FMxH
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"Not much this time, but the changes include:
- moving domain allocation into the iommu drivers to prepare for the
introduction of default domains for devices
- fixing the IO page-table code in the AMD IOMMU driver to correctly
encode large page sizes
- extension of the PCI support in the ARM-SMMU driver
- various fixes and cleanups"
* tag 'iommu-updates-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (34 commits)
iommu/amd: Correctly encode huge pages in iommu page tables
iommu/amd: Optimize amd_iommu_iova_to_phys for new fetch_pte interface
iommu/amd: Optimize alloc_new_range for new fetch_pte interface
iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface
iommu/amd: Return the pte page-size in fetch_pte
iommu/amd: Add support for contiguous dma allocator
iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent
iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event
iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE
iommu/tegra: smmu: Compute PFN mask at runtime
iommu/tegra: gart: Set aperture at domain initialization time
iommu/tegra: Setup aperture
iommu: Remove domain_init and domain_free iommu_ops
iommu/fsl: Make use of domain_alloc and domain_free
iommu/rockchip: Make use of domain_alloc and domain_free
iommu/ipmmu-vmsa: Make use of domain_alloc and domain_free
iommu/shmobile: Make use of domain_alloc and domain_free
iommu/msm: Make use of domain_alloc and domain_free
iommu/tegra-gart: Make use of domain_alloc and domain_free
iommu/tegra-smmu: Make use of domain_alloc and domain_free
...
* device-properties:
device property: Introduce firmware node type for platform data
device property: Make it possible to use secondary firmware nodes
driver core: Implement device property accessors through fwnode ones
driver core: property: Update fwnode_property_read_string_array()
driver core: Add comments about returning array counts
ACPI: Introduce has_acpi_companion()
driver core / ACPI: Represent ACPI companions using fwnode_handle
Get rid of domain_init and domain_destroy and implement
domain_alloc/domain_free instead.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add a new function iommu_context_addr() which takes care of the
differences and returns a pointer to a context entry which may be
in either format. The formats are binary compatible for all the old
fields anyway; the new one is just larger and some of the reserved
bits in the original 128 are now meaningful.
So far, nothing actually uses the new fields in the extended context
entry. Modulo hardware bugs with interpreting the new-style tables,
this should basically be a no-op.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API
domains") prevents certain options for devices with RMRRs. This even
prevents those devices from getting a 1:1 mapping with 'iommu=pt',
because we don't have the code to handle *preserving* the RMRR regions
when moving the device between domains.
There's already an exclusion for USB devices, because we know the only
reason for RMRRs there is a misguided desire to keep legacy
keyboard/mouse emulation running in some theoretical OS which doesn't
have support for USB in its own right... but which *does* enable the
IOMMU.
Add an exclusion for graphics devices too, so that 'iommu=pt' works
there. We should be able to successfully assign graphics devices to
guests too, as long as the initial handling of stolen memory is
reconfigured appropriately. This has certainly worked in the past.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Device domains never span IOMMU hardware units, which allows the
domain ID space for each IOMMU to be an independent address space.
Therefore we can have multiple, independent domains, each with the
same domain->id, but attached to different hardware units. This is
also why we need to do a heavy-weight search for VM domains since
they can span multiple IOMMUs hardware units and we don't require a
single global ID to use for all hardware units.
Therefore, if we call iommu_detach_domain() across all active IOMMU
hardware units for a non-VM domain, the result is that we clear domain
IDs that are not associated with our domain, allowing them to be
re-allocated and causing apparent coherency issues when the device
cannot access IOVAs for the intended domain.
This bug was introduced in commit fb170fb4c5 ("iommu/vt-d: Introduce
helper functions to make code symmetric for readability"), but is
significantly exacerbated by the more recent commit 62c22167dd
("iommu/vt-d: Fix dmar_domain leak in iommu_attach_device") which calls
domain_exit() more frequently to resolve a domain leak.
Fixes: fb170fb4c5 ("iommu/vt-d: Introduce helper functions to make code symmetric for readability")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: stable@vger.kernel.org # v3.17+
Signed-off-by: Joerg Roedel <jroedel@suse.de>