Граф коммитов

296 Коммитов

Автор SHA1 Сообщение Дата
Thierry Reding 501be6c1c7 drm/tegra: Fix SMMU support on Tegra124 and Tegra210
When testing whether or not to enable the use of the SMMU, consult the
supported DMA mask rather than the actually configured DMA mask, since
the latter might already have been restricted.

Fixes: 2d9384ff91 ("drm/tegra: Relax IOMMU usage criteria on old Tegra")
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-04-28 11:44:07 +02:00
Dave Airlie e139e8aed0 drm/tegra: Fixes for v5.6-rc1
These are a couple of quick fixes for regressions that were found during
 the first two weeks of the merge window.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl48S+kTHHRyZWRpbmdA
 bnZpZGlhLmNvbQAKCRDdI6zXfz6zodLzD/sGXZVXj2Y/IH2gLv2soU9PIsy43xr7
 y6D4kZIWPgGuQD+GNbU7I0k5zMQffJ3fWogXLnWyMhxj/RgjX1O0h6WWspBmWXlS
 zUtv9u4wkn2eVrKoRUyV694RJFqG3wmppYwPnWrx0UJFBQptUhp98XTUIXD634Vp
 FwM4ya+Yc0ucBc982XmZ/hkPZYnduQF9eZTnR0xbA/xZj4G3s9unNy/2mjVDrpZM
 MjYDzfXSMiVHL1ROLZLCsQVrfUMYoUgB8ill0o0DxE5kHL6KpA55j28WzxBwbZQQ
 iSxDbjtNPnQV3pR5/sBLpLP1CeYJ1dY9TVUcy+6u+f3SFcJUhM6LEsi5W2EgcBKT
 j+0Tag0i12cFvEDo1NbK4SGCp3OmeuFB0nAPT7OGtUMByWC+uhgspCQn0ZqVjEB6
 Iy4EGrhPqbUCEsF72VR6amgi0sd2lkyGaCY1F3Do2hrHs/EgrFN3bKDRhM63oAGT
 y9vH5eV92Q0CDAFpu+V6kBLOLod6dmyKhYexZdDfiA24BLsDoDeUafFlSi8luQmj
 3bwX7agPmvtaIyBakmLG7nRHK0EhNaP82EpxyZ/a3PHH9CWUa+jC3CsqlO3tvbBZ
 02mkmvXnYCyrL+qdh/abHfPktKQ49QM5JrHCp96hNLc27qyqyfzbC4FYtgA/3JCo
 Fn8pINhC7Mc/dg==
 =W1SC
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-5.6-rc1-fixes' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Fixes for v5.6-rc1

These are a couple of quick fixes for regressions that were found during
the first two weeks of the merge window.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200206172753.2185390-1-thierry.reding@gmail.com
2020-02-07 12:22:30 +10:00
Thierry Reding 98ae41adb2 gpu: host1x: Set DMA direction only for DMA-mapped buffer objects
The DMA direction is only used by the DMA API, so there is no use in
setting it when a buffer object isn't mapped with the DMA API.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
2020-02-06 18:23:12 +01:00
Thierry Reding 273da5a046 drm/tegra: Reuse IOVA mapping where possible
This partially reverts the DMA API support that was recently merged
because it was causing performance regressions on older Tegra devices.
Unfortunately, the cache maintenance performed by dma_map_sg() and
dma_unmap_sg() causes performance to drop by a factor of 10.

The right solution for this would be to cache mappings for buffers per
consumer device, but that's a bit involved. Instead, we simply revert to
the old behaviour of sharing IOVA mappings when we know that devices can
do so (i.e. they share the same IOMMU domain).

Cc: <stable@vger.kernel.org> # v5.5
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
2020-02-06 18:21:55 +01:00
Dave Airlie fd7226fbb2 drm/tegra: Changes for v5.6-rc1
This contains a small set of mostly fixes and some minor improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl4ZGs0THHRyZWRpbmdA
 bnZpZGlhLmNvbQAKCRDdI6zXfz6zoZUkEACo+fhKHTmkCz4W+4m90aEFL/VUi7Bn
 lFmssk2E71WvxUSWfLYawuhi/bgtMLvPmTi1KTCpvdvr+ladF1f4/Vd6vuhMu7Ec
 t2ZePYmRlpX+3PHW+V2fIRyBC9/6wSx2UEAbcMOijIPHlWDNdrxo26w0+af0dBJ3
 b8oIv5q5jX0GzZDOCOsBngUwnVzCYthVz7SA+UXOwBvNPO61kjx8IA/IfBHAQ+HF
 ZVpn//C7g2/wHq125kdrFUwAWQPvFMHOs1JPfXS9248kobkIabSp0XxLaj403cFU
 WWa/no5cPkG+WRwaX5egvE2/8P3mDzu3ev9gkgpdDJxnn1YL656mPWO7jNqW4IJd
 JD8kTk7ODh7KQvY3xdHD3MrHpHWHtHdOiFZtOygiW84W2yK5/g/LRd/oph/xP1RE
 /h8eiRP/mK7VkVRmgbiZTIEQVp10RKH56XtVd1Cf3eFNkig1Q8MLarwy2rhClyJO
 7cJZKoOqRfx0aMYipxx5+YWDXrCVwtNrQKTroJ27ycre3ykv+NAoeHeei9ie81Rg
 XZeCF3OuvVNXoUPQRS3XYkH70CTbdDTpkwymctcfwWOerhn65QAHvCSNR91ezmNq
 YwC7CkS6ymeG9UhSoCvNoLa1adfeztRCwg8RkKdHgHFtHNNQuyRlp4xNyrJNgoCO
 YX5dApw++iTOwQ==
 =OBEv
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-5.6-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Changes for v5.6-rc1

This contains a small set of mostly fixes and some minor improvements.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111004835.2412858-1-thierry.reding@gmail.com
2020-01-15 16:21:28 +10:00
YueHaibing 033ccdb7f6 gpu: host1x: Remove dev_err() on platform_get_irq() failure
platform_get_irq() will call dev_err() itself on failure,
so there is no need for the driver to also do this.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10 17:05:12 +01:00
Thierry Reding fd67e9c6ed drm/tegra: Do not implement runtime PM
The Tegra DRM driver heavily relies on the implementations for runtime
suspend/resume to be called at specific times. Unfortunately, there are
some cases where that doesn't work. One example is if the user disables
runtime PM for a given subdevice. Another example is that the PM core
acquires a reference to runtime PM during system sleep, effectively
preventing devices from going into low power modes. This is intentional
to avoid nasty race conditions, but it also causes system sleep to not
function properly on all Tegra systems.

Fix this by not implementing runtime PM at all. Instead, a minimal,
reference-counted suspend/resume infrastructure is added to the host1x
bus. This has the benefit that it can be used regardless of the system
power state (or any transitions we might be in), or whether or not the
user allows runtime PM.

Atomic modesetting guarantees that these functions will end up being
called at the right point in time, so the pitfalls for the more generic
runtime PM do not apply here.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10 16:37:43 +01:00
Thierry Reding 608f43ad27 gpu: host1x: Rename "parent" to "host"
Rename the host1x clients' parent to "host" because that more closely
describes what it is. The parent can be confused with the parent device
in terms of the device hierarchy. Subsequent patches will add a new
member that refers to the parent in that hierarchy.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-01-10 16:37:38 +01:00
Daniel Vetter 6c56e8adc0 drm-misc-next for v5.6:
UAPI Changes:
 - Add support for DMA-BUF HEAPS.
 
 Cross-subsystem Changes:
 - mipi dsi definition updates, pulled into drm-intel as well.
 - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
 - Remove support for dma-buf kmap/kunmap.
 - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
 
 Core Changes:
 - Small cleanups to ttm.
 - Fix SCDC definition.
 - Assorted cleanups to core.
 - Add todo to remove load/unload hooks, and use generic fbdev emulation.
 - Assorted documentation updates.
 - Use blocking ww lock in ttm fault handler.
 - Remove drm_fb_helper_fbdev_setup/teardown.
 - Warning fixes with W=1 for atomic.
 - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
 - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
 - Various kconfig indentation fixes in core and drivers.
 - Fix freeing transactions in dp-mst correctly.
 - Sean Paul is steping down as core maintainer. :-(
 - Add lockdep annotations for atomic locks vs dma-resv.
 - Prevent use-after-free for a bad job in drm_scheduler.
 - Fill out all block sizes in the P01x and P210 definitions.
 - Avoid division by zero in drm/rect, and fix bounds.
 - Add drm/rect selftests.
 - Add aspect ratio and alternate clocks for HDMI 4k modes.
 - Add todo for drm_framebuffer_funcs and fb_create cleanup.
 - Drop DRM_AUTH for prime import/export ioctls.
 - Clear DP-MST payload id tables downstream when initializating.
 - Fix for DSC throughput definition.
 - Add extra FEC definitions.
 - Fix fake offset in drm_gem_object_funs.mmap.
 - Stop using encoder->bridge in core directly
 - Handle bridge chaining slightly better.
 - Add backlight support to drm/panel, and use it in many panel drivers.
 - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
 
 Driver Changes:
 - Small fixes all over.
 - Fix documentation in vkms.
 - Fix mmap_sem vs dma_resv in nouveau.
 - Small cleanup in komeda.
 - Add page flip support in gma500 for psb/cdv.
 - Add ddc symlink in the connector sysfs directory for many drivers.
 - Add support for analogic an6345, and fix small bugs in it.
 - Add atomic modesetting support to ast.
 - Fix radeon fault handler VMA race.
 - Switch udl to use generic shmem helpers.
 - Unconditional vblank handling for mcde.
 - Miscellaneous fixes to mcde.
 - Tweak debug output from komeda using debugfs.
 - Add gamma and color transform support to komeda for DOU-IPS.
 - Add support for sony acx424AKP panel.
 - Various small cleanups to gma500.
 - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
 - Add support for Logic PD Type 28 panel.
 - Use drm_panel_* wrapper functions in exynos/tegra/msm.
 - Add devicetree bindings for generic DSI panels.
 - Don't include drm_pci.h directly in many drivers.
 - Add support for begin/end_cpu_access in udmabuf.
 - Stop using drm_get_pci_dev in gma500 and mga200.
 - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
 - Add devfreq thermal support to panfrost.
 - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
 - meson: Add support for OSD1 plane AFBC commit.
 - Stop displaying garbage when toggling ast primary plane on/off.
 - More cleanups and fixes to UDL.
 - Add D32 suport to komeda.
 - Remove globle copy of drm_dev in gma500.
 - Add support for Boe Himax8279d MIPI-DSI LCD panel.
 - Add support for ingenic JZ4770 panel.
 - Small null pointer deference fix in ingenic.
 - Remove support for the special tfp420 driver, as there is a generic way to do it.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl34lkkACgkQ/lWMcqZw
 E8M76g//WRYl9fWnV063s44FBVJYjGxaus0vQJSGidaPCIE6Ep6TNjXp8DVzV82M
 HR79P9glL02DC9B8pflioNNXdIRGSVk/FJcKVB2seFAqEFCAknvWDM/X/y+mOUpp
 fUeFl+Znlwx3YlM8f4Qujdbm+CbTewfbya4VAWeWd8XG2V8jfq5cmODPPlUMNenZ
 J6Ja+W3ph741uSIfAKaP69LVJgOcuUjXINE4SWhRk/i5QF3GIRej/A7ZjWGLQ/t2
 2zUUF7EiCzhPomM40H3ddKtXb4ZjNJuc5pOD4GpxR8ciNbe2gUOHEZ5aenwYBdsU
 5MwbxNKyMbKXATtn3yv3fSc4jH3DtmEKpmovONeO8ZDBrQBnxeYa3tQvfkNghA2f
 acoZMzYUImV+ft6DMIgpXppASvo7mQYDAbLPOGEJ9E44AL4UP00jesEjnK5FOHSR
 3BEzGUnK/6QL5zFNPni8YZQ8dan4jDIno1mqIV+cQ4WCGlaKckzIWO6243Bf13b/
 kROSJpgWkiK6Ngq0ofhD0MHyT/m1QnqUzWRKTJhRtPflSWRBsDZqWCQ5Vx1QlNIE
 /HfTNbTpXWwa+5wXbbB8TkDw5t9cQGnR+QcrEd9HgoIec7B5Re8rx9i0TJAT4N05
 03RCQCecSfD8gwKd2wgaFIpFGRl9lTdLYSpffSmyL2X5a20lZhM=
 =b15X
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.6:

UAPI Changes:
- Add support for DMA-BUF HEAPS.

Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.

Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.

Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
2019-12-17 13:57:54 +01:00
Daniel Vetter 7a8139c54e drm/tegra: Map cmdbuf once for reloc processing
A few reasons to drop kmap:

- For native objects all we do is look at obj->vaddr anyway, so might
  as well not call functions for every page.

- Reloc-processing on dma-buf is ... questionable.

- Plus most dma-buf that bother kernel cpu mmaps give you at least
  vmap, much less kmaps. And all the ones relevant for arm-soc are
  again doing a obj->vaddr game anyway, there's no real kmap going on
  on arm it seems.

Plus this seems to be the only real in-tree user of dma_buf_kmap, and
I'd like to get rid of that.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-tegra@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20191118103536.17675-2-daniel.vetter@ffwll.ch
2019-11-25 22:36:01 +01:00
Thierry Reding c8a2036474 gpu: host1x: Unconditionally select IOMMU_IOVA
Currently configurations can be generated where IOMMU_SUPPORT is
disabled but IOMMU_IOVA is built as a module and HOST1X as built-in. In
such a case, the symbols guarded by IOMMU_IOVA will not be available
when linking the host1x driver and cause a linking failure.

Simplify this by unconditionally selecting IOMMU_IOVA, which makes sure
that it will be forced to =y if HOST1X=y. Technically we can now get
IOMMU_IOVA code built-in even if we don't use it (host1x only uses it
when IOMMU_SUPPORT is also enabled), but such configuration are of a
mostly academic nature. In all practical configurations we want IOMMU
support anyway.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-11-01 10:49:17 +01:00
Thierry Reding 06867a362d gpu: host1x: Set DMA mask based on IOMMU setup
If the Tegra DRM clients are backed by an IOMMU, push buffers are likely
to be allocated beyond the 32-bit boundary if sufficient system memory
is available. This is problematic on earlier generations of Tegra where
host1x supports a maximum of 32 address bits for the GATHER opcode. More
recent versions of Tegra (Tegra186 and later) have a wide variant of the
GATHER opcode, which allows addressing up to 64 bits of memory.

If host1x itself is behind an IOMMU as well this doesn't matter because
the IOMMU's input address space is restricted to 32 bits on generations
without support for wide GATHER opcodes.

However, if host1x is not behind an IOMMU, it won't be able to process
push buffers beyond the 32-bit boundary on Tegra generations that don't
support wide GATHER opcodes. Restrict the DMA mask to 32 bits on these
generations prevents buffers from being allocated from beyond the 32-bit
boundary.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:35 +01:00
Thierry Reding af1cbfb9bf gpu: host1x: Support DMA mapping of buffers
If host1x_bo_pin() returns an SG table, create a DMA mapping for the
buffer. For buffers that the host1x client has already mapped itself,
host1x_bo_pin() returns NULL and the existing DMA address is used.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:35 +01:00
Thierry Reding b78e70c04c gpu: host1x: Allocate gather copy for host1x
Currently when the gather buffers are copied, they are copied to a
buffer that is allocated for the host1x client that wants to execute the
command streams in the buffers. However, the gather buffers will be read
by the host1x device, which causes SMMU faults if the DMA API is backed
by an IOMMU.

Fix this by allocating the gather buffer copy for the host1x device,
which makes sure that it will be mapped into the host1x's IOVA space if
the DMA API is backed by an IOMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:35 +01:00
Thierry Reding 44156eee91 gpu: host1x: Clean up debugfs on removal
The debugfs files created for host1x are never removed, causing these
files to be left dangling in debugfs. This results in a crash when any
of these files are accessed after the host1x driver has been removed,
as well as a failure to create the debugfs entries when they are added
again on driver probe.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:34 +01:00
Thierry Reding 80327ce3d4 gpu: host1x: Overhaul host1x_bo_{pin,unpin}() API
The host1x_bo_pin() and host1x_bo_unpin() APIs are used to pin and unpin
buffers during host1x job submission. Pinning currently returns the SG
table and the DMA address (an IOVA if an IOMMU is used or a physical
address if no IOMMU is used) of the buffer. The DMA address is only used
for buffers that are relocated, whereas the host1x driver will map
gather buffers into its own IOVA space so that they can be processed by
the CDMA engine.

This approach has a couple of issues. On one hand it's not very useful
to return a DMA address for the buffer if host1x doesn't need it. On the
other hand, returning the SG table of the buffer is suboptimal because a
single SG table cannot be shared for multiple mappings, because the DMA
address is stored within the SG table, and the DMA address may be
different for different devices.

Subsequent patches will move the host1x driver over to the DMA API which
doesn't work with a single shared SG table. Fix this by returning a new
SG table each time a buffer is pinned. This allows the buffer to be
referenced by multiple jobs for different engines.

Change the prototypes of host1x_bo_pin() and host1x_bo_unpin() to take a
struct device *, specifying the device for which the buffer should be
pinned. This is required in order to be able to properly construct the
SG table. While at it, make host1x_bo_pin() return the SG table because
that allows us to return an ERR_PTR()-encoded error code if we need to,
or return NULL to signal that we don't need the SG table to be remapped
and can simply use the DMA address as-is. At the same time, returning
the DMA address is made optional because in the example of command
buffers, host1x doesn't need to know the DMA address since it will have
to create its own mapping anyway.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-29 15:04:34 +01:00
Ben Dooks (Codethink) 33904487f1 gpu: host1x: Make host1x_cdma_wait_pushbuffer_space() static
The host1x_cdma_wait_pushbuffer_space() function is not declared or
directly called from outside the file it is in, so make it static.

Fixes the following sparse warning:

    drivers/gpu/host1x/cdma.c:235:5: warning: symbol 'host1x_cdma_wait_pushbuffer_space' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:34 +01:00
Thierry Reding caccddcfc4 gpu: host1x: Request channels for clients, not devices
A struct device doesn't carry much information that a channel might be
interested in, but the client very much does. Request channels for the
clients rather than their parent devices and store a pointer to them
in order to have that information available when needed.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:33 +01:00
Thierry Reding 8f45f5071a gpu: host1x: Explicitly initialize host1x_info structures
It's technically not required to explicitly initialize the fields that
will be zero by default, but it's easier to read these structures if
they are all initialized uniformly.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:33 +01:00
Thierry Reding b9cd7b954a gpu: host1x: Remove gratuitous blank line
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:33 +01:00
Thierry Reding d98914ebc2 gpu: host1x: Do not limit DMA segment size
host1x nor any its clients have any limitations on the DMA segment size,
so don't pretend that they do.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-10-28 11:18:08 +01:00
Dave Airlie dfd03396d7 drm/tegra: Changes for v5.3-rc1
This contains a couple of small improvements and cleanups for the Tegra
 DRM driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAl0M8jsTHHRyZWRpbmdA
 bnZpZGlhLmNvbQAKCRDdI6zXfz6zoU0MEACZhCSKLvinuCw2pWdsc2A5rZN3r0Mr
 9TpdIKXYwcHO2G2gYC1mXv0wlp2pzIOr5y3VHLl1VboWXlPZjvgH0OiDtmwZk/vU
 jjFOGZO5ZjwyVNAZ8SlQrpdBNIAm3wseVfrhVURIL+9dANDgXsxWiqTkgnEmVclZ
 Kj2SpXfysTM9TJyXfQW8op6jKtP3NJ/IPEtTguNL1R9ho0phlcRYBUMEIYqtgBVE
 aRsjrrbM27OGf+JFPY3C7bW90hzgVqZLeK9R4AAMPS8iZ91k9njlFZ20LI9Yk/P1
 GBVyemiRtcG3owbli3NlfFzaeNjmM2PmcMZJsOyf6T+UH5juH2XZ/TV5O6i/jq0z
 /8DWsuFdEMX5tdP0t4B3vnbGQTYMOo/6bRsZHceLXcpKFNuJcC6lY233/Fc3D1bw
 gWm5ZHTs3SmPm2JoWmA54h2oiXU8/hGiPZGoUDHUTxf/h7DOGhK2hfaNdB6ZYXJu
 be4yS5TidFlFKi911JuXblCDeFf0VPsOfemQtJW4OvKg9FD6WmRuehYPytJ8ifB1
 dByXh5siOVMcyE0a2bsMPAvsxK6z4pPwyDz34AJUETJ0DSmSfWeEMSbCmpJwt6m5
 35IiMossSULJXWjqtc5bcK1gmbYMJSEwj+Xe33O+H67THSgo4GLxqKdb++5QK5Ky
 svluhQIjYYx3pw==
 =T2hU
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-5.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Changes for v5.3-rc1

This contains a couple of small improvements and cleanups for the Tegra
DRM driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621150753.19550-1-thierry.reding@gmail.com
2019-06-25 12:59:43 +10:00
Greg Kroah-Hartman eb7cf945a8 host1x: debugfs_create_dir() can never return NULL
So there is no need to check for a value that can never happen.  No need
to check the return value all anyway, as any debugfs call can take the
result of this function as an option just fine.

Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-tegra@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-13 18:23:38 +02:00
Thomas Gleixner 9c92ab6191 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282
Based on 1 normalized pattern(s):

  this software is licensed under the terms of the gnu general public
  license version 2 as published by the free software foundation and
  may be copied distributed and modified under those terms this
  program is distributed in the hope that it will be useful but
  without any warranty without even the implied warranty of
  merchantability or fitness for a particular purpose see the gnu
  general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 285 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thierry Reding 31fa25f100 gpu: host1x: Do not link logical devices to DT nodes
Logical devices created by the host1x bus infrastructure don't need to
be associated with a device tree node. Doing so will cause the driver
core to attempt to hook up IOMMU operations and fail because it is not
a real device.

However, for backwards-compatibility, we need to provide various OF_*
uevent variables that were previously provided by of_device_uevent() and
which are parsed by libdrm in userspace when querying the available
devices. Do this by implementing a uevent callback for the host1x bus.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05 15:06:03 +02:00
Thierry Reding 1e390478cf gpu: host1x: Increase maximum DMA segment size
Recent versions of the DMA API debug code have started to warn about
violations of the maximum DMA segment size. This is because the segment
size defaults to 64 KiB, which can easily be exceeded in large buffer
allocations such as used in DRM/KMS for framebuffers.

Technically the Tegra SMMU and ARM SMMU don't have a maximum segment
size (they map individual pages irrespective of whether they are
contiguous or not), so the choice of 4 MiB is a bit arbitrary here. The
maximum segment size is a 32-bit unsigned integer, though, so we can't
set it to the correct maximum size, which would be the size of the
aperture.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05 15:06:03 +02:00
Thierry Reding 4bb923e807 gpu: host1x: Do not output error message for deferred probe
When deferring probe, avoid logging a confusing error message. While at
it, make the error message more informational.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-06-05 15:05:49 +02:00
Thomas Gleixner 9952f6918d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 228 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:29:52 -07:00
Thomas Gleixner ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Arnd Bergmann 8bbad1ba31 gpu: host1x: Program stream ID to bypass without SMMU
If SMMU support is not available, fall back to programming the bypass
stream ID (0x7f).

Fixes: de5469c21f ("gpu: host1x: Program the channel stream ID")
Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
[treding@nvidia.com: rebase this on top of a later build fix]
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-04-11 17:40:35 +02:00
Stefan Agner e154592a1d gpu: host1x: Fix compile error when IOMMU API is not available
In case the IOMMU API is not available compiling host1x fails with
the following error:

  In file included from drivers/gpu/host1x/hw/host1x06.c:27:
  drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’:
  drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function
    ‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’?  [-Werror=implicit-function-declaration]
  struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent);
                              ^~~~~~~~~~~~~~~~~~~~
                              iommu_fwspec_free

Fixes: de5469c21f ("gpu: host1x: Program the channel stream ID")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-04-11 10:35:39 +02:00
Dmitry Osipenko 79930bafe2 gpu: host1x: Continue CDMA execution starting with a next job
Currently gathers of a hung job are getting NOP'ed and a restarted CDMA
executes the NOP'ed gathers. There shouldn't be a reason to not restart
CDMA execution starting with a next job, avoiding the unnecessary churning
with gathers NOP'ing.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:34:25 +01:00
Dmitry Osipenko 5d6f043685 gpu: host1x: Don't complete a completed job
There is a chance that the last job has been completed at the time of
CDMA timeout handler invocation. In this case there is no need to complete
the completed job.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:30:58 +01:00
Dmitry Osipenko e8bad65938 gpu: host1x: Cancel only job that actually got stuck
Host1x doesn't have information about jobs inter-dependency, that is
something that will become available once host1x will get a proper
jobs scheduler implementation. Currently a hang job causes other unrelated
jobs to be canceled, that is a relic from downstream driver which is
irrelevant to upstream. Let's cancel only the hanging job and not to touch
other jobs in queue.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:30:24 +01:00
Thierry Reding e1f338c0f8 gpu: host1x: Optimize CDMA push buffer memory usage
The host1x CDMA push buffer is terminated by a special opcode (RESTART)
that tells the CDMA to wrap around to the beginning of the push buffer.
To accomodate the RESTART opcode, an extra 4 bytes are allocated on top
of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words)
that are used for other commands passed to CDMA. This requires that two
memory pages are allocated, but most of the second page (4092 bytes) is
never used.

Decrease the number of slots to 511 so that the RESTART opcode fits
within the page. Adjust the push buffer wraparound code to take into
account push buffer sizes that are not a power of two.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:59 +01:00
Thierry Reding 0e43b8da15 gpu: host1x: Use correct semantics for HOST1X_CHANNEL_DMAEND
The HOST1X_CHANNEL_DMAEND is an offset relative to the value written to
the HOST1X_CHANNEL_DMASTART register, but it is currently treated as an
absolute address. This can cause SMMU faults if the CDMA fetches past a
pushbuffer's IOMMU mapping.

Properly setting the DMAEND prevents the CDMA from fetching beyond that
address and avoid such issues. This is currently not observed because a
whole (almost) page of essentially scratch space absorbs any excessive
prefetching by CDMA. However, changing the number of slots in the push
buffer can trigger these SMMU faults.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:58 +01:00
Thierry Reding 8de896eb20 gpu: host1x: Support 40-bit addressing on Tegra186
The host1x and clients instantiated on Tegra186 support addressing 40
bits of memory.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:58 +01:00
Thierry Reding 38fabcc953 gpu: host1x: Restrict IOVA space to DMA mask
On Tegra186 and later, the ARM SMMU provides an input address space that
is 48 bits wide. However, memory clients can only address up to 40 bits.
If the geometry is used as-is, allocations of IOVA space can end up in a
region that is not addressable by the memory clients.

To fix this, restrict the IOVA space to the DMA mask of the host1x
device.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:57 +01:00
Thierry Reding 67a82dbc0a gpu: host1x: Support 40-bit addressing
Tegra186 and later support 40 bits of address space. Additional
registers need to be programmed to store the full 40 bits of push
buffer addresses.

Since command stream gathers can also reside in buffers in a 40-bit
address space, a new variant of the GATHER opcode is also introduced.
It takes two parameters: the first parameter contains the lower 32
bits of the address and the second parameter contains bits 32 to 39.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:35 +01:00
Thierry Reding 5a5fccbd8c gpu: host1x: Introduce support for wide opcodes
The CDMA push buffer can currently only handle opcodes that take a
single word parameter. However, the host1x implementation on Tegra186
and later supports opcodes that require multiple words as parameters.

Unfortunately the way the push buffer is structured, these wide opcodes
cannot simply be composed of two regular opcodes because that could
result in the wide opcode being split across the end of the push buffer
and the final RESTART opcode required to wrap the push buffer around
would break the wide opcode.

One way to fix this would be to remove the concept of slots to simplify
push buffer operations. However, that's not entirely trivial and should
be done in a separate patch. For now, simply use a different function
to push four-word opcodes into the push buffer. Technically only three
words are pushed, with the fourth word used as padding to preserve the
2-word alignment required by the slots abstraction. The fourth word is
always a NOP opcode.

Additional care must be taken when the end of the push buffer is
reached. If a four-word opcode doesn't fit into the push buffer without
being split by the boundary, NOP opcodes will be introduced and the new
wide opcode placed at the beginning of the push buffer.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:35 +01:00
Thierry Reding de5469c21f gpu: host1x: Program the channel stream ID
When processing command streams, make sure the host1x's stream ID is
programmed for the channel so that addresses are properly translated
through the SMMU.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-07 18:28:33 +01:00
Thierry Reding 6841482b82 gpu: host1x: Set up stream ID table
In order to enable the MMIO path stream ID protection provided by the
incarnation of host1x found in Tegra186 and later, the host1x must be
provided with the list of stream ID register offsets for each of its
clients. Some clients (such as VIC) have multiple stream ID registers
that are assumed to be contiguous. The host1x is programmed with the
base offset and a limit which provide the range of registers that the
host1x needs to monitor for writes.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04 08:35:55 +01:00
Thierry Reding f67524caf4 gpu: host1x: Represent host1x bus devices in debugfs
This new debugfs file represents the state of host1x bus devices,
specifying the list of subdevices and marking which ones have
successfully registered.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04 08:35:54 +01:00
Arnd Bergmann 0747a672a3 gpu: host1x: Use completion instead of semaphore
In this usage, the two are completely equivalent, but the completion
documents better what is going on, and we generally try to avoid
semaphores these days.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2019-02-04 08:35:54 +01:00
Thierry Reding ac1bdbf22b gpu: host1x: Add Tegra194 support
The host1x hardware found on Tegra194 is mostly backwards compatible
with the version found on Tegra186, with the notable exceptions of the
increased number of syncpoints and mlocks. In addition, some rarely
used features such as syncpoint wait bases were dropped and some
registers had to move around to accomodate the increased number of
syncpoints.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-29 17:11:49 +01:00
Thierry Reding 2fc777ba84 gpu: host1x: Fix syncpoint ID field size on Tegra186
The number of syncpoints on Tegra186 is 576 and therefore no longer fits
into 8 bits. Increase the size of the syncpoint ID field to 10 in order
to accomodate all syncpoints.

Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27 17:18:39 +01:00
Thierry Reding b7c61d511d gpu: host1x: Resize channel register region on Tegra186 and later
The register region allocated per channel was decreased from 16384 bytes
to 256 bytes on Tegra186 and later. Resize the region to make sure every
channel (instead of only the first) is properly programmed.

Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-11-27 17:18:26 +01:00
Dmitry Osipenko e31c8ea5af gpu: host1x: Detach Host1x from IOMMU DMA domain on arm32
Host1x is getting attached to an implicit IOMMU DMA domain if
CONFIG_ARM_DMA_USE_IOMMU=y. Since Host1x driver manages IOMMU by
itself, Host1x device must be detached from the implicit domain using
arch-specific IOMMU-API.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-09-26 17:11:14 +02:00
Thierry Reding 50bac83c80 gpu: host1x: Remove spurious tab
All other assignments have a single space around the = sign, so remove
the spurious tab for consistency.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-09-26 16:09:46 +02:00
Dmitry Osipenko ec58923215 gpu: host1x: Check whether size of unpin isn't 0
Only gather pins are mapped by the Host1x driver, regular BO relocations
are not. Check whether size of unpin isn't 0, otherwise IOVA allocation at
0x0 could be erroneously released.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-07-09 10:31:30 +02:00
Dmitry Osipenko 4466b1f0e0 gpu: host1x: Skip IOMMU initialization if firewall is enabled
Host1x's CDMA can't access the command buffers if IOMMU and Host1x
firewall are enabled in the kernels config because firewall doesn't map
the copied buffer into IOVA space. Fix this by skipping IOMMU
initialization if firewall is enabled as firewall merges sparse cmdbufs
into a single contiguous buffer and hence IOMMU isn't needed in this case.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-07-09 10:31:10 +02:00
Linus Torvalds 135c5504a6 drm for v4.18-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbFzauAAoJEAx081l5xIa+jzwP/AwfTreH3XBmlLeMO8kwkAMW
 AiH1u3KHrlitN1U85x90dC8hVuuV2Kv9pEXQo1me/TAL/QCb9VZYCSf2dHHkkMwD
 1UdwxAQW7soBbTRo9k0zuVJpRGSYIiYfqDh3L6KmTC3UF0tUJm53VOWCLG/DNrtn
 XhEnGnlCUNbABZMMpavEvlwxPtvaYFlp6M+MmwRPIx32SrPJ3vSaBzxwWxOaFmjU
 xiRdK+GpAbCV8yGeCCSkDgEe/TdWGUhZxoC9dLb0H9ex7ip8uZ0W4D+VTHPFrhQX
 6nCpqUbp7BQTsbVSd1pAVsAv45scmSgWbKcqfC0NKSVcsHcJZBR0tQOF9OvnGZcf
 o1Hv/beTqJ++IcG2rEIwyJTGxAGfZ0YSb0evTC9VcszaYo+b3+G283bdztIjzDeS
 0QCTeLHYbZRHPITWVULNpMWy3TkJv32IdFhQfYSnD8/OGQIxLNhh4FFOtHnOmxSF
 N8dnzOLKXhXMo/NgOL+UMNnbgLqIyOtEXCPDLuOQJNv/SOp8662m/A0yRjQNR6M2
 gsPmR7dxQIwwJMyqrkLDOF411ABZohulquYgwLgG938MRPmTpPWOR72PtpGF4hAW
 HLg+3HHBd1N/A1mlJUMAbUn2eMUACZBUIycE9u+U/geRgve/OQnzJH/FKGP2EJ4R
 pf6CruEva+6GRR5GVzuM
 =twst
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-06-06-1' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This starts to support NVIDIA volta hardware with nouveau, and adds
  amdgpu support for the GPU in the Kabylake-G (the intel + radeon
  single package chip), along with some initial Intel icelake enabling.

  Summary:

  New Drivers:
   - v3d - driver for broadcom V3D V3.x+ hardware
   - xen-front - XEN PV display frontend

  core:
   - handle zpos normalization in the core
   - stop looking at legacy pointers in atomic paths
   - improved scheduler documentation
   - improved aspect ratio validation
   - aspect ratio support for 64:27 and 256:135
   - drop unused control node code.

  i915:
   - Icelake (ICL) enabling
   - GuC/HuC refactoring
   - PSR/PSR2 enabling and fixes
   - DPLL management refactoring
   - DP MST fixes
   - NV12 enabling
   - HDCP improvements
   - GEM/Execlist/reset improvements
   - GVT improvements
   - stolen memory first 4k fix

  amdgpu:
   - Vega 20 support
   - VEGAM support (Kabylake-G)
   - preOS scanout buffer reservation
   - power management gfxoff support for raven
   - SR-IOV fixes
   - Vega10 power profiles and clock voltage control
   - scatter/gather display support on CZ/ST

  amdkfd:
   - GFX9 dGPU support
   - userptr memory mapping

  nouveau:
   - major refactoring for Volta GV100 support

  tda998x:
   - HDMI i2c CEC support

  etnaviv:
   - removed unused logging code
   - license text cleanups
   - MMU handling improvements
   - timeout fence fix for 50 days uptime

  tegra:
   - IOMMU support in gr2d/gr3d drivers
   - zpos support

  vc4:
   - syncobj support
   - CTM, plane alpha and async cursor support

  analogix_dp:
   - HPD and aux chan fixes

  sun4i:
   - MIPI DSI support

  tilcdc:
   - clock divider fixes for OMAP-l138 LCDK board

  rcar-du:
   - R8A77965 support
   - dma-buf fences fixes
   - hardware indexed crtc/du group handling
   - generic zplane property support

  atmel-hclcdc:
   - generic zplane property support

  mediatek:
   - use generic video mode function

  exynos:
   - S5PV210 FIMD variant support
   - IPP v2 framework
   - more HW overlays support"

* tag 'drm-next-2018-06-06-1' of git://anongit.freedesktop.org/drm/drm: (1286 commits)
  drm/amdgpu: fix 32-bit build warning
  drm/exynos: fimc: signedness bug in fimc_setup_clocks()
  drm/exynos: scaler: fix static checker warning
  drm/amdgpu: Use dev_info() to report amdkfd is not supported for this ASIC
  drm/amd/display: Remove use of division operator for long longs
  drm/amdgpu: Update GFX info structure to match what vega20 used
  drm/amdgpu/pp: remove duplicate assignment
  drm/sched: add rcu_barrier after entity fini
  drm/amdgpu: move VM BOs on LRU again
  drm/amdgpu: consistenly use VM moved flag
  drm/amdgpu: kmap PDs/PTs in amdgpu_vm_update_directories
  drm/amdgpu: further optimize amdgpu_vm_handle_moved
  drm/amdgpu: cleanup amdgpu_vm_validate_pt_bos v2
  drm/amdgpu: rework VM state machine lock handling v2
  drm/amdgpu: Add runtime VCN PG support
  drm/amdgpu: Enable VCN static PG by default on RV
  drm/amdgpu: Add VCN static PG support on RV
  drm/amdgpu: Enable VCN CG by default on RV
  drm/amdgpu: Add static CG control for VCN on RV
  drm/exynos: Fix default value for zpos plane property
  ...
2018-06-06 08:16:33 -07:00
Thierry Reding 326bbd79fd gpu: host1x: Use not explicitly sized types
The number of words and the offset in a gather don't need to be
explicitly sized, so make them unsigned int instead.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:51:37 +02:00
Thierry Reding 06490bb99e gpu: host1x: Rename relocarray -> relocs for consistency
All other array variables use a plural, and this is the only one using
the *array suffix. This is confusing, so rename it for consistency.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:51:25 +02:00
Thierry Reding ac330f45c7 gpu: host1x: Drop unnecessary host1x argument
Functions taking a pointer to a host1x syncpoint as an argument don't
need to specify a pointer to a host1x instance because it can be
obtained from the syncpoint.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:51:01 +02:00
Thierry Reding d4ad3ad9b8 gpu: host1x: Cleanup loop variable usage
Use unsigned int where possible and don't unnecessarily initialize the
loop variable.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:50:40 +02:00
Thierry Reding bf3d41ccab gpu: host1x: Store pointer to client in jobs
Rather than storing some identifier derived from the application
context that can't be used concretely anywhere, store a pointer to the
client directly so that accesses can be made directly through that
client object.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:50:24 +02:00
Thierry Reding 24c94e166d gpu: host1x: Remove wait check support
The job submission userspace ABI doesn't support this and there are no
plans to implement it, so all of this code is dead and can be removed.

Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 21:50:04 +02:00
Emil Goode 2f8a6da866 gpu: host1x: Fix compiler errors by converting to dma_addr_t
The compiler is complaining with the following errors:

drivers/gpu/host1x/cdma.c:94:48: error:
	passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type
	[-Werror=incompatible-pointer-types]

drivers/gpu/host1x/cdma.c:113:48: error:
	passing argument 3 of ‘dma_alloc_wc’ from incompatible pointer type
	[-Werror=incompatible-pointer-types]

The expected pointer type of the third argument to dma_alloc_wc() is
dma_addr_t but phys_addr_t is passed.

Change the phys member of struct push_buffer to be dma_addr_t so that we
pass the correct type to dma_alloc_wc().
Also check pb->mapped for non-NULL in the destroy function as that is the
right way of checking if dma_alloc_wc() was successful.

Signed-off-by: Emil Goode <emil.fsw@goode.io>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-18 11:23:48 +02:00
Thierry Reding f40e1590c5 gpu: host1x: Acquire a reference to the IOVA cache
The IOVA API uses a memory cache to allocate IOVA nodes from. To make
sure that this cache is available, obtain a reference to it and release
the reference when the cache is no longer needed.

On 64-bit ARM this is hidden by the fact that the DMA mapping API gets
that reference and never releases it. On 32-bit ARM, this is papered
over by the Tegra DRM driver (the sole user of the host1x API requiring
the cache) acquiring a reference to the IOVA cache for its own purposes.
However, there may be additional users of this API in the future, so fix
this upfront to avoid surprises.

Fixes: 404bfb78da ("gpu: host1x: Add IOMMU support")
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-17 17:44:48 +02:00
Dmitry Osipenko 27db6a0073 gpu: host1x: Fix dma_free_wc() argument in the error path
If IOVA allocation or IOMMU mapping fails, dma_free_wc() is invoked with
size=0 because of a typo, that triggers "kernel BUG at mm/vmalloc.c:124!".

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-05-17 17:44:48 +02:00
Christoph Hellwig 3d6ce86ee7 drivers: remove force dma flag from buses
With each bus implementing its own DMA configuration callback, there is no
need for bus to explicitly set the force_dma flag.  Modify the
of_dma_configure function to accept an input parameter which specifies if
implicit DMA configuration is required when it is not described by the
firmware.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # PCI parts
Reviewed-by: Rob Herring <robh@kernel.org>
[hch: tweaked the changelog a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-03 16:25:08 +02:00
Nipun Gupta 07397df29e dma-mapping: move dma configuration to bus infrastructure
ACPI/OF support for configuration of DMA is a bus specific aspect, and
thus should be configured by the bus.  Introduces a 'dma_configure' bus
method so that busses can control their DMA capabilities.

Also update the PCI, Platform, ACPI and host1x buses to use the new
method.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # PCI parts
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[hch: simplified host1x_dma_configure based on a comment from Thierry,
      rewrote changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-03 16:22:18 +02:00
Thierry Reding 41c3068cc2 gpu: host1x: Use IOMMU groups
Use IOMMU groups to attach the host1x device to its IOMMU domain. This
is not strictly necessary because the domain isn't shared with any other
device, but it makes the code consistent with how IOMMU is handled in
other drivers and provides an easy way to detect when no IOMMU has been
attached via device tree.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-12-21 14:52:36 +01:00
Thierry Reding 8f7da1578e gpu: host1x: Cleanup on initialization failure
When an error happens during the initialization of one of the sub-
devices, make sure to properly cleanup all sub-devices that have been
initialized up to that point.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-12-13 13:42:03 +01:00
Thierry Reding 1f876c3fce gpu: host1x: Rewrite conditional for better readability
The current check is slightly difficult to read, rewrite it to improve
that a little.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-12-13 13:42:03 +01:00
Linus Torvalds e60e1ee606 main drm pull request for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
 3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
 4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
 +Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
 Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
 mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
 Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
 1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
 FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
 8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
 K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
 x/nJurIqQYh2KQN9+uLG
 =xVV2
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.15.

  Core:
   - Atomic object lifetime fixes
   - Atomic iterator improvements
   - Sparse/smatch fixes
   - Legacy kms ioctls to be interruptible
   - EDID override improvements
   - fb/gem helper cleanups
   - Simple outreachy patches
   - Documentation improvements
   - Fix dma-buf rcu races
   - DRM mode object leasing for improving VR use cases.
   - vgaarb improvements for non-x86 platforms.

  New driver:
   - tve200: Faraday Technology TVE200 block.

     This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
     the StorLink SL3516 (later Cortina Systems CS3516) as well as the
     Grain Media GM8180.

  New bridges:
   - SiI9234 support

  New panels:
   - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
     LT089AC19000, Innolux AT043TN24

  i915:
   - Remove Coffeelake from alpha support
   - Cannonlake workarounds
   - Infoframe refactoring for DisplayPort
   - VBT updates
   - DisplayPort vswing/emph/buffer translation refactoring
   - CCS fixes
   - Restore GPU clock boost on missed vblanks
   - Scatter list updates for userptr allocations
   - Gen9+ transition watermarks
   - Display IPC (Isochronous Priority Control)
   - Private PAT management
   - GVT: improved error handling and pci config sanitizing
   - Execlist refactoring
   - Transparent Huge Page support
   - User defined priorities support
   - HuC/GuC firmware refactoring
   - DP MST fixes
   - eDP power sequencing fixes
   - Use RCU instead of stop_machine
   - PSR state tracking support
   - Eviction fixes
   - BDW DP aux channel timeout fixes
   - LSPCON fixes
   - Cannonlake PLL fixes

  amdgpu:
   - Per VM BO support
   - Powerplay cleanups
   - CI powerplay support
   - PASID mgr for kfd
   - SR-IOV fixes
   - initial GPU reset for vega10
   - Prime mmap support
   - TTM updates
   - Clock query interface for Raven
   - Fence to handle ioctl
   - UVD encode ring support on Polaris
   - Transparent huge page DMA support
   - Compute LRU pipe tweaks
   - BO flag to allow buffers to opt out of implicit sync
   - CTX priority setting API
   - VRAM lost infrastructure plumbing

  qxl:
   - fix flicker since atomic rework

  amdkfd:
   - Further improvements from internal AMD tree
   - Usermode events
   - Drop radeon support

  nouveau:
   - Pascal temperature sensor support
   - Improved BAR2 handling
   - MMU rework to support Pascal MMU

  exynos:
   - Improved HDMI/mixer support
   - HDMI audio interface support

  tegra:
   - Prep work for tegra186
   - Cleanup/fixes

  msm:
   - Preemption support for a5xx
   - Display fixes for 8x96 (snapdragon 820)
   - Async cursor plane fixes
   - FW loading rework
   - GPU debugging improvements

  vc4:
   - Prep for DSI panels
   - fix T-format tiling scanout
   - New madvise ioctl

  Rockchip:
   - LVDS support

  omapdrm:
   - omap4 HDMI CEC support

  etnaviv:
   - GPU performance counters groundwork

  sun4i:
   - refactor driver load + TCON backend
   - HDMI improvements
   - A31 support
   - Misc fixes

  udl:
   - Probe/EDID read fixes.

  tilcdc:
   - Misc fixes.

  pl111:
   - Support more variants

  adv7511:
   - Improve EDID handling.
   - HDMI CEC support

  sii8620:
   - Add remote control support"

* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
  drm/rockchip: analogix_dp: Use mutex rather than spinlock
  drm/mode_object: fix documentation for object lookups.
  drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
  drm/i915: Move init_clock_gating() back to where it was
  drm/i915: Prune the reservation shared fence array
  drm/i915: Idle the GPU before shinking everything
  drm/i915: Lock llist_del_first() vs llist_del_all()
  drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
  drm/i915: Disable lazy PPGTT page table optimization for vGPU
  drm/i915/execlists: Remove the priority "optimisation"
  drm/i915: Filter out spurious execlists context-switch interrupts
  drm/amdgpu: use irq-safe lock for kiq->ring_lock
  drm/amdgpu: bypass lru touch for KIQ ring submission
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  drm/amd/powerplay: initialize a variable before using it
  drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
  drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
  drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
  drm/rockchip: add CONFIG_OF dependency for lvds
  ...
2017-11-15 20:42:10 -08:00
Linus Torvalds e37e0ee019 A couple of dma-mapping updates:
- turn dma_cache_sync into a dma_map_ops instance and remove
    implementation that purely are dead because the architecture
    doesn't support noncoherent allocations
  - add a flag for busses that need DMA configuration (Robin Murphy)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAloLSrYLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOMuQ//XXD94uNPYavrgXzGsAtg+I+LEm+xyk4T0dX5fxfj
 amXX49MHoGemjsBgzJlkQMMFqwDEdkKyEuFnEuy6OeowYCyD6zW0MJ3MwP9OosNJ
 PNTdGZIfSvxPYEW8cR9AdK3iQ2loMBZnYhd+O/oVjSugULLW2DNa7r2VRktcCKoh
 8Ob/8gL6Y9xEYJBRszhrBwKTa/hU8IThxxozBFzN7I3LIKyFboSTcwXGLAHow43g
 4anCTjWTaDcoU2JwY6UTRKRRTV+gD0ZRcsZfd8lNNb5rtMVZkBVOHbF14SMAmw1r
 kSgRcU3+WIFPhK/8wBYqtGZZGnOgFBTHVeqow3AdS728pBWlWl8niTK0DiIgCd3m
 qzScF6SqfN1bCZkZAy8FUV2l0DPYKS6lvyNkf00Eb2W/f6LEqAcjCi2QDDxRfaw+
 Vm97nPUiM+uXNy/6KtAy6ChdprSqx12/edXPp7Y3H2rS/+Dmr6exeix+wb7QUN8W
 JI7ZRHo4JLaJZk/XrZtGX/6jnN1Jo7vfApQOmYDY7kE1iGtOU/LQQj8gcZRVQxML
 4soN6ivSmZX2k03LabWHpYQ8QiyCSYChLC+Az7rQH47LDLeu1IdTJu6orpXpaxyo
 ymzEWlHbmF7mE66X4g/Up/eAYk2YLUA3rKLGVjAIaWDBzHftSFg5EaAnqMADC1G2
 hSo=
 =ALJf
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - turn dma_cache_sync into a dma_map_ops instance and remove
   implementation that purely are dead because the architecture doesn't
   support noncoherent allocations

 - add a flag for busses that need DMA configuration (Robin Murphy)

* tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: turn dma_cache_sync into a dma_map_ops method
  sh: make dma_cache_sync a no-op
  xtensa: make dma_cache_sync a no-op
  unicore32: make dma_cache_sync a no-op
  powerpc: make dma_cache_sync a no-op
  mn10300: make dma_cache_sync a no-op
  microblaze: make dma_cache_sync a no-op
  ia64: make dma_cache_sync a no-op
  frv: make dma_cache_sync a no-op
  x86: make dma_cache_sync a no-op
  floppy: consolidate the dummy fd_cacheflush definition
  drivers: flag buses which demand DMA configuration
2017-11-14 16:54:12 -08:00
Linus Torvalds 2cd83ba5be IOMMU Updates for Linux v4.15
* Enforce MSI multiple IRQ alignment in AMD IOMMU
 
  * VT-d PASID error handling fixes
 
  * Add r8a7795 IPMMU support
 
  * Manage runtime PM links on exynos at {add,remove}_device callbacks
 
  * Fix Mediatek driver name to avoid conflict
 
  * Add terminate support to qcom fault handler
 
  * 64-bit IOVA optimizations
 
  * Simplfy IOVA domain destruction, better use of rcache, and
    skip anchor nodes on copy
 
  * Convert to IOMMU TLB sync API in io-pgtable-arm{-v7s}
 
  * Drop command queue lock when waiting for CMD_SYNC completion on
    ARM SMMU implementations supporting MSI to cacheable memory
 
  * iomu-vmsa cleanup inspired by missed IOTLB sync callbacks
 
  * Fix sleeping lock with preemption disabled for RT
 
  * Dual MMU support for TI DRA7xx DSPs
 
  * Optional flush option on IOVA allocation avoiding overhead when
    caller can try other options
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJaCgbBAAoJECObm247sIsiUp0P/16GwExsZFKqD0B7XBiBq+hZ
 Q03XiPEnfnMBt05bw4dH8lnfpDPR1+6I3OzrxByc3VurfsBZ6IUh6oXPREcwdkNE
 +iyS7/1RlXd7EJGsaY91fiL1Gm7iPQS/MJ50GcL288gdF68cAHygDoLPR0r1bHBC
 PzjLhmgSYcqYQwRgnoclUe4/sNEIWWu3Z0trltNmiQ9eFEXW4WCc1/OzCoAmfxY7
 EPiKynP+vCBTvzZJ/7m4oBpfciO/WuF8KhnNDjmL/lKGAQGGChiUyQq77pkdaUzm
 6MjGfLAzO1xJ3DYM1ozDGBnVXR3AlDE2f+MlveT9bwVHkFLB3WF5atb70Cx70Zju
 Lp2ZC2sm42AlHN0WqLGU9Q1pGPhZwXGkoF5kAxcsSh1rWC+bjfjtelpIg9PLngMi
 rBAgRLLveGMM3jQcWv4259dZkByJOyFurQwYzHToacj+bE0hGzLv5cFWS/AD8kIE
 rdxFBdNWlMARDWoJO+jYOzEr/QTDXovLgWsNvS2C2QDfmdXL1s0yqjdU9ahC3wWJ
 pi//PydoXoyuEbN3M04ccjDNCZLX+JYIvvaZAxF5gGVt99VCKhf2Rw2wtLPviPNN
 K+8NmUp1xAZX7Ak47HDNHIS3ZVB6CgP5pmoBvqZW1k5yyf5kvbvi2axQta/cyr6a
 4ey4GeRpF+iad1n1U/WX
 =o4qF
 -----END PGP SIGNATURE-----

Merge tag 'iommu-v4.15-rc1' of git://github.com/awilliam/linux-vfio

Pull IOMMU updates from Alex Williamson:
 "As Joerg mentioned[1], he's out on paternity leave through the end of
  the year and I'm filling in for him in the interim:

   - Enforce MSI multiple IRQ alignment in AMD IOMMU

   - VT-d PASID error handling fixes

   - Add r8a7795 IPMMU support

   - Manage runtime PM links on exynos at {add,remove}_device callbacks

   - Fix Mediatek driver name to avoid conflict

   - Add terminate support to qcom fault handler

   - 64-bit IOVA optimizations

   - Simplfy IOVA domain destruction, better use of rcache, and skip
     anchor nodes on copy

   - Convert to IOMMU TLB sync API in io-pgtable-arm{-v7s}

   - Drop command queue lock when waiting for CMD_SYNC completion on ARM
     SMMU implementations supporting MSI to cacheable memory

   - iomu-vmsa cleanup inspired by missed IOTLB sync callbacks

   - Fix sleeping lock with preemption disabled for RT

   - Dual MMU support for TI DRA7xx DSPs

   - Optional flush option on IOVA allocation avoiding overhead when
     caller can try other options

  [1] https://lkml.org/lkml/2017/10/22/72"

* tag 'iommu-v4.15-rc1' of git://github.com/awilliam/linux-vfio: (54 commits)
  iommu/iova: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->fq
  iommu/mediatek: Fix driver name
  iommu/ipmmu-vmsa: Hook up r8a7795 DT matching code
  iommu/ipmmu-vmsa: Allow two bit SL0
  iommu/ipmmu-vmsa: Make IMBUSCTR setup optional
  iommu/ipmmu-vmsa: Write IMCTR twice
  iommu/ipmmu-vmsa: IPMMU device is 40-bit bus master
  iommu/ipmmu-vmsa: Make use of IOMMU_OF_DECLARE()
  iommu/ipmmu-vmsa: Enable multi context support
  iommu/ipmmu-vmsa: Add optional root device feature
  iommu/ipmmu-vmsa: Introduce features, break out alias
  iommu/ipmmu-vmsa: Unify ipmmu_ops
  iommu/ipmmu-vmsa: Clean up struct ipmmu_vmsa_iommu_priv
  iommu/ipmmu-vmsa: Simplify group allocation
  iommu/ipmmu-vmsa: Unify domain alloc/free
  iommu/ipmmu-vmsa: Fix return value check in ipmmu_find_group_dma()
  iommu/vt-d: Clear pasid table entry when memory unbound
  iommu/vt-d: Clear Page Request Overflow fault bit
  iommu/vt-d: Missing checks for pasid tables if allocation fails
  iommu/amd: Limit the IOVA page range to the specified addresses
  ...
2017-11-14 16:43:27 -08:00
Greg Kroah-Hartman b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Mikko Perttunen 45bd862c28 gpu: host1x: Fix incorrect comment for channel_request
This function actually doesn't sleep in the version that was merged.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen 2a79c034b5 gpu: host1x: Disassemble more instructions
The disassembler for debug dumps was missing some newer host1x opcodes.
Add disassembly support for these.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen eb2ee1a28d gpu: host1x: Improve debug disassembly formatting
The host1x driver prints out "disassembly" dumps of the command FIFO
and gather contents on submission timeouts. However, the output has
been quite difficult to read with unnecessary newlines and occasional
missing parentheses.

Fix these problems by using pr_cont to remove unnecessary newlines
and by fixing other small issues.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen 2316f29fb5 gpu: host1x: Enable gather filter
The gather filter is a feature present on Tegra124 and newer where the
hardware prevents GATHERed command buffers from executing commands
normally reserved for the CDMA pushbuffer which is maintained by the
kernel driver.

This commit enables the gather filter on all supporting hardware.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen c3f52220f2 gpu: host1x: Enable Tegra186 syncpoint protection
Since Tegra186 the Host1x hardware allows syncpoints to be assigned to
specific channels, preventing any other channels from incrementing
them.

Enable this feature where available and assign syncpoints to channels
when submitting a job. Syncpoints are currently never unassigned from
channels since that would require extra work and is unnecessary with
the current channel allocation model.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:52 +02:00
Mikko Perttunen 2fb0dceb69 gpu: host1x: Call of_dma_configure() after setting bus
of_dma_configure() now checks the device's bus before configuring it, so
we need to set the device's bus before calling.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:51 +02:00
Mikko Perttunen f1b53c4e2c gpu: host1x: Add Tegra186 support
Add support for the implementation of Host1x present on the Tegra186.
The register space has been shuffled around a little bit, requiring
addition of some chip-specific code sections. Tegra186 also adds
several new features, most importantly the hypervisor, but those are
not yet supported with this commit.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:51 +02:00
Thierry Reding 617dd7cc49 gpu: host1x: syncpt: Request syncpoints per client
Rather than request syncpoints for a struct device *, request them for a
struct host1x_client *. This is important because subsequent patches are
going to break the assumption that host1x will always be the parent for
devices requesting a syncpoint. It's also a more natural choice because
host1x clients are really the only ones that will know how to deal with
syncpoints.

Note that host1x clients are always guaranteed to be children of host1x,
regardless of their location in the device tree.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:51 +02:00
Thierry Reding 6a341fdff1 gpu: host1x: Use of_device_get_match_data()
Avoid some boilerplate by calling of_device_get_match_data() instead of
open-coding the equivalent in the driver.

While at it, shuffle around some code to avoid unnecessary local
variables.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-10-20 14:19:50 +02:00
Robin Murphy d89e2378a9 drivers: flag buses which demand DMA configuration
We do not want the common dma_configure() pathway to apply
indiscriminately to all devices, since there are plenty of buses which
do not have DMA capability, and if their child devices were used for
DMA API calls it would only be indicative of a driver bug. However,
there are a number of buses for which DMA is implicitly expected even
when not described by firmware - those we whitelist with an automatic
opt-in to dma_configure(), assuming that the DMA address space and the
physical address space are equivalent if not otherwise specified.

Commit 7232888366 ("of: restrict DMA configuration") introduced a
short-term fix by comparing explicit bus types, but this approach is far
from pretty, doesn't scale well, and fails to cope at all with bus
drivers which may be built as modules, like host1x. Let's refine things
by making that opt-in a property of the bus type, which neatly addresses
those problems and lets the decision of whether firmware description of
DMA capability should be optional or mandatory stay internal to the bus
drivers themselves.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-10-19 16:34:52 +02:00
Zhen Lei aa3ac9469c iommu/iova: Make dma_32bit_pfn implicit
Now that the cached node optimisation can apply to all allocations, the
couple of users which were playing tricks with dma_32bit_pfn in order to
benefit from it can stop doing so. Conversely, there is also no need for
all the other users to explicitly calculate a 'real' 32-bit PFN, when
init_iova_domain() can happily do that itself from the page granularity.

CC: Thierry Reding <thierry.reding@gmail.com>
CC: Jonathan Hunter <jonathanh@nvidia.com>
CC: David Airlie <airlied@linux.ie>
CC: Sudeep Dutt <sudeep.dutt@intel.com>
CC: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
[rm: use iova_shift(), rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-27 17:09:57 +02:00
Dave Airlie 3aadb888b1 drm/tegra: Changes for v4.14-rc1
This contains a couple of fixes and improvements for host1x, with some
 preparatory work for Tegra186 support.
 
 The remainder is cleanup and minor bugfixes for Tegra DRM along with
 enhancements to debuggability.
 
 There have also been some enhancements to the kernel interfaces for
 host1x job submissions and support for mmap'ing PRIME buffers directly,
 all of which get the interfaces very close to ready for serious work.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAlmXBm4THHRyZWRpbmdA
 bnZpZGlhLmNvbQAKCRDdI6zXfz6zoYDJD/9ILmkCZPmv/LnYjQ1vivHN2cwboj2K
 0yM6fwHE2JrvavKaAt1DDxZElxzL4Gcihcg6ALIc0/RH9S82nz7eNMkwLgfLLPHl
 PFCv/rGkFkyQpqOXvUmLYBGL7uNo3GDfT3fE5fJ9ZfRWFLBaI6DgT6flzTZBsd0T
 vC+mExHBdobu9bsyR95NpHGjDPQoUK//m35p+vZixnyjhCbDM/+qKA/iSUE5kaEY
 S2huX/Gzl8jbi2d2Ax9dz905gYrDKl6y5qlGA3BGKpTPxOd/kYtc2eClSRHc1hno
 WT/9yFeTFyXjxarpY9nAT2NNfVn+crvB3vBwP97I6YKszBbGUOHGYqhBy7r0W+Fl
 MqMvlqTJgN4OlVL7pGiCkeDUKZyJ697EZqNdeYREiKwPtsmSiZcnxxk5BcTEzXBX
 cF0udAVfEK8MekjPDz1CWGbH2uMuXxsH+7VTKt3avVYlN8J9rIhZv9hGK6g/znyd
 4N4eyzDxRtChhAcin1fQJosAzc8oTSEE21WQW2D8vme+t0Yx9Oiy7BG5uj+yLruu
 0/l0TUEyyDozg2doBsnDzJdCFzcHZjo4fClYfZu/Ficwb9eEDOx85eif+rGEOclO
 ickwuGEOAjKuyrz4T0fkd6j3aMYUVRmXZ3L0gFD68jUKel00zjSpcL/JfEihwvd/
 Nus3MYLH+IrFKQ==
 =ZaxL
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-4.14-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Changes for v4.14-rc1

This contains a couple of fixes and improvements for host1x, with some
preparatory work for Tegra186 support.

The remainder is cleanup and minor bugfixes for Tegra DRM along with
enhancements to debuggability.

There have also been some enhancements to the kernel interfaces for
host1x job submissions and support for mmap'ing PRIME buffers directly,
all of which get the interfaces very close to ready for serious work.

* tag 'drm/tegra/for-4.14-rc1' of git://anongit.freedesktop.org/tegra/linux: (21 commits)
  drm/tegra: Prevent BOs from being freed during job submission
  drm/tegra: gem: Implement mmap() for PRIME buffers
  drm/tegra: Support render node
  drm/tegra: sor: Trace register accesses
  drm/tegra: dpaux: Trace register accesses
  drm/tegra: dsi: Trace register accesses
  drm/tegra: hdmi: Trace register accesses
  drm/tegra: dc: Trace register accesses
  drm/tegra: sor: Use unsigned int for register offsets
  drm/tegra: hdmi: Use unsigned int for register offsets
  drm/tegra: dsi: Use unsigned int for register offsets
  drm/tegra: dpaux: Use unsigned int for register offsets
  drm/tegra: dc: Use unsigned int for register offsets
  drm/tegra: Fix NULL deref in debugfs/iova
  drm/tegra: switch to drm_*_get(), drm_*_put() helpers
  drm/tegra: Set MODULE_FIRMWARE for the VIC
  drm/tegra: Add CONFIG_OF dependency
  gpu: host1x: Support sub-devices recursively
  gpu: host1x: fix error return code in host1x_probe()
  gpu: host1x: Fix bitshift/mask multipliers
  ...
2017-08-21 17:37:33 +10:00
Thierry Reding 25ae30d2a8 gpu: host1x: Support sub-devices recursively
The display architecture in Tegra186 changes slightly compared to
earlier Tegra generations, which requires that we recursively scan
host1x sub-devices from device tree.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-08-17 17:57:08 +02:00
Gustavo A. R. Silva 7b2c63de20 gpu: host1x: fix error return code in host1x_probe()
platform_get_irq() returns an error code, but the host1x driver
ignores it and always returns -ENXIO. This is not correct and,
prevents -EPROBE_DEFER from being propagated properly.

Notice that platform_get_irq() no longer returns 0 on error:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e330b9a6bb35dc7097a4f02cb1ae7b6f96df92af

Print and propagate the return value of platform_get_irq on failure.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-08-17 17:57:07 +02:00
Mikko Perttunen 4ac45eb8d1 gpu: host1x: Fix bitshift/mask multipliers
Some parts of Host1x uses BIT_WORD/BIT_MASK/BITS_PER_LONG to calculate
register or field offsets. This worked fine on ARMv7, but now that
BITS_PER_LONG is 64 but our registers are still 32-bit things are
broken.

Fix by replacing..
- BIT_WORD with (x / 32)
- BIT_MASK with BIT(x % 32)
- BITS_PER_LONG with 32

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-08-17 17:57:06 +02:00
Mikko Perttunen 18b3f5ac6b gpu: host1x: Don't fail on NULL bo physical address
Pinning a Host1x BO currently cannot fail and zero is a valid address
for a BO when IOMMU is enabled. To avoid false errors remove checks
for NULL BO physical addresses.

Fixes: 404bfb78da ("gpu: host1x: Add IOMMU support")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-08-17 17:57:06 +02:00
Dave Airlie 0c697fafc6 Linux 4.13-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZkNpUAAoJEHm+PkMAQRiGr68H/2nr8kxpoUhZ7eA5C71waCjh
 gnJSevkzJAp+fCb0KfQFAp1qvpmLLle4e6tAxYgTQZg4Z3W5cJJNfxu9TzY5sGuL
 o9QUr43XzABepW4e4jhRtZv6dj3K6XruNeDQKXDZTDcc/S8zoiS/Pltq7VgPcAuM
 kX+3qsNdUyknngD6b0z9NtJkb0mHKY6J8MpraWRO34egDwsaN/tuhRj0DRQpCoyQ
 x/k+hMbc9MB9Dn8cfACo6Omb+r5Rfd7dTBUAju/TnIIgs//9voHba307N7XvLJZg
 kWc8MqMQQZXfRZHB0atpDMHyZS/XQRlNPXj76j0+Ud/byODKTFkkazmgTpALvj8=
 =CxeU
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.13-rc5' into drm-next

Linux 4.13-rc5

There's a really nasty nouveau collision, hopefully someone can take a look
once I pushed this out.
2017-08-15 16:16:58 +10:00
Sean Paul acadb3dddb gpu/host1x: Remove excess parameter in host1x_subdev_add docs
Fixes the following warning when building docs:
../drivers/gpu/host1x/bus.c:50: warning: Excess function parameter 'driver' description in 'host1x_subdev_add'

Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170720174746.29100-4-seanpaul@chromium.org
2017-07-31 14:24:37 +02:00
Paul Kocialkowski fea2099597 gpu: host1x: Free the IOMMU domain when there is no device to attach
When there is no device to attach to the IOMMU domain, as may be the
case when the device-tree does not contain the proper iommu node, it is
best to keep going without IOMMU support rather than failing.
This allows the driver to probe and function instead of taking down
all of the tegra drm driver, leading to missing display support.

Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
Fixes: 404bfb78da ("gpu: host1x: Add IOMMU support")
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170710193305.5987-1-contact@paulk.fr
2017-07-27 16:57:34 +02:00
Dave Airlie 6d61e70ccc Linux 4.12-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZUGOmAAoJEHm+PkMAQRiGhX8H/3fIhingPD01MBf98U0xGrJo
 yIXmhu6nFs7TM0lDVDcHsKgqLQIT69ll7PrSZrMkc1RGUIPINoCuJVuJqDre0kfB
 of5TX2KegqSx8h1vOWjGBCBjdYfPGyMdf9icf6KsGc/SlIdhN6WA99kglAjJA0Ve
 qPTNagF0ntUNg1lsXffxyfcHqFpyqw/Z/C4ie/byFsn9iJ1VG9mNlTWSud09vhuM
 3tvHzTUVAIWWuRrrgrvgqQpnwL+q5BfSDsXScMjBau0EK3RGGqG8EN6Kbkfa7VQ6
 aBoeboQjUijSJnVwvySdQ11MChTIOwZdfrNPra/1HD3WJNsSu4BIRt5JcAKcOhc=
 =qmSg
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.12-rc7' into drm-next

Linux 4.12-rc7

Needed at least rc6 for drm-misc-next-fixes, may as well go to rc7
2017-06-27 08:28:30 +10:00
Sean Paul b15cdca5b5 Merge remote-tracking branch 'airlied/drm-next' into drm-misc-next-fixes
Backmerging airlied/drm-next
2017-06-20 11:50:41 -04:00
Dave Airlie 4a525bad68 drm/tegra: Changes for v4.13-rc1
This starts off with the addition of more documentation for the host1x
 and DRM drivers and finishes with a slew of fixes and enhancements for
 the staging IOCTLs as a result of the awesome work done by Dmitry and
 Erik on the grate reverse-engineering effort.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAllDjYUTHHRyZWRpbmdA
 bnZpZGlhLmNvbQAKCRDdI6zXfz6zoS2hD/90X7glXgD2PReNqaopGI6o9f5Zdhqv
 YULoVoMUAkDRESxPGtGSwLsNXXFBCxshYHT79bygoEabk0xccV7CWMxgenZ56S3s
 JbkwdFoFeJyRVOPhcLgfHk3vjhf4nFFoTtny4ahe43JJZjSC7i+mY9b9VhrCAOg5
 FhhexSHwLqRIxe/jvIYarypBFVk38iFa4GUrvkYO1fDbi+zJyOA3Od6OwbEWJ+HZ
 DyVF3xJB+He5uZ7zn+Q465QKtIIyUstpS2aZYAmkJG054USKZ9RczeppsNUkmyIC
 LnoGnIpw/PvKjAWvchdybjDUX2dv4/oZs2JPa3pDIgyXeyTFAu9K2i/ScW8D1WHu
 Hl1dL0vVNSvwsuqCPrqZQioN3aLefazp9iccjd80Lrg47x9wHgijzyTAiN3Heswn
 CY7/uuDmoXPTci1h4sti8XfpPnkWPuwgY23J/XCNJFZjDKZiiKg5sWDV0DLCnIQi
 l4BemypsQOO+ye4vt72YJo2TQKJUM212TzC6KbimWPorJANr05L/fXlRDAF8RZ/c
 nXdGoSGVL457M3PZJQWlwM+pKGqu1Uec/p6JYBQ9m2Nt4I7Oi9NpbEPdoNFEf6Tg
 c8oqQiw3d4jp8WOfWsucttgvqsFhr13dBtPIVpTfpudQ6cit1pl6hxlOrFiL1gmX
 xNxekgTrdNuwBg==
 =zz/Z
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-4.13-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next

drm/tegra: Changes for v4.13-rc1

This starts off with the addition of more documentation for the host1x
and DRM drivers and finishes with a slew of fixes and enhancements for
the staging IOCTLs as a result of the awesome work done by Dmitry and
Erik on the grate reverse-engineering effort.

* tag 'drm/tegra/for-4.13-rc1' of git://anongit.freedesktop.org/tegra/linux:
  gpu: host1x: At first try a non-blocking allocation for the gather copy
  gpu: host1x: Refactor channel allocation code
  gpu: host1x: Remove unused host1x_cdma_stop() definition
  gpu: host1x: Remove unused 'struct host1x_cmdbuf'
  gpu: host1x: Check waits in the firewall
  gpu: host1x: Correct swapped arguments in the is_addr_reg() definition
  gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
  gpu: host1x: Forbid RESTART opcode in the firewall
  gpu: host1x: Forbid relocation address shifting in the firewall
  gpu: host1x: Do not leak BO's phys address to userspace
  gpu: host1x: Correct host1x_job_pin() error handling
  gpu: host1x: Initialize firewall class to the job's one
  drm/tegra: dc: Disable plane if it is invisible
  drm/tegra: dc: Apply clipping to the plane
  drm/tegra: dc: Avoid reset asserts on Tegra20
  drm/tegra: Check syncpoint ID in the 'submit' IOCTL
  drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL
  drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL
  drm/tegra: Add driver documentation
  gpu: host1x: Flesh out kerneldoc
2017-06-20 11:07:03 +10:00
Dmitry Osipenko 43240bbd87 gpu: host1x: At first try a non-blocking allocation for the gather copy
The blocking gather copy allocation is a major performance downside of the
Host1x firewall, it may take hundreds milliseconds which is unacceptable
for the real-time graphics operations. Let's try a non-blocking allocation
first as a least invasive solution, it makes opentegra (Xorg driver)
performance indistinguishable with/without the firewall.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:56 +02:00
Mikko Perttunen 8474b02531 gpu: host1x: Refactor channel allocation code
This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:

- The previous code could deadlock due to an interaction
  between the 'reflock' mutex and CDMA timeout handling.
  This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
  of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:38 +02:00
Dmitry Osipenko 03f0de770e gpu: host1x: Remove unused host1x_cdma_stop() definition
There is no host1x_cdma_stop() in the code, let's remove its definition
from the header file.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:18 +02:00
Dmitry Osipenko 03ebcaa3de gpu: host1x: Remove unused 'struct host1x_cmdbuf'
The struct host1x_cmdbuf is unused, let's remove it.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:24:59 +02:00
Dmitry Osipenko a47ac10e6e gpu: host1x: Check waits in the firewall
Check waits in the firewall in a way it is done for relocations.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:24:41 +02:00
Dmitry Osipenko 0f563a4bf6 gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
Several channels could be made to write the same unit concurrently via
the SETCLASS opcode, trusting userspace is a bad idea. It should be
possible to drop the per-client channel reservation and add a per-unit
locking by inserting MLOCK's to the command stream to re-allow the
SETCLASS opcode, but it will be much more work. Let's forbid the
unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:23:50 +02:00
Dmitry Osipenko ef81624994 gpu: host1x: Forbid RESTART opcode in the firewall
The RESTART opcode terminates the gather and restarts the CDMA fetching
from a specified word << 2 relative to the CDMA start address. That
shouldn't be allowed to be done by userspace.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:23:18 +02:00
Dmitry Osipenko 571cbf70c1 gpu: host1x: Forbid relocation address shifting in the firewall
Incorrectly shifted relocation address will cause a lower memory
corruption and likely a hang on a write or a read of an arbitrary data
in case of IOMMU absence. As of now, there is no known use for the
address shifting and adding a proper shifts / sizes validation is a much
more work. Let's forbid shifts in the firewall till a proper validation
is implemented.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:22:32 +02:00