Граф коммитов

2840 Коммитов

Автор SHA1 Сообщение Дата
Hawking Zhang 2fcd43cef6 drm/amdgpu: add mmhub pg init sequence on raven
MMHub Powergating init sequence.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-29 12:43:45 -04:00
Huang Rui 67bef0f790 drm/amdgpu: fix the memory corruption on S3
psp->cmd will be used on resume phase, so we can not free it on hw_init.
Otherwise, a memory corruption will be triggered.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Xiaojie Yuan <Xiaojie.Yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2017-06-29 12:43:43 -04:00
Rex Zhu 6b0fa871a9 drm/amdgpu: fix vulkan test performance drop and hang on VI
caused by not program dynamic_cu_mask_addr in the KIQ MQD.

v2: create struct vi_mqd_allocation in FB which will contain
1. PM4 MQD structure.
2. Write Pointer Poll Memory.
3. Read Pointer Report Memory
4. Dynamic CU Mask.
5. Dynamic RB Mask.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-29 12:43:43 -04:00
Dave Airlie 6d61e70ccc Linux 4.12-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZUGOmAAoJEHm+PkMAQRiGhX8H/3fIhingPD01MBf98U0xGrJo
 yIXmhu6nFs7TM0lDVDcHsKgqLQIT69ll7PrSZrMkc1RGUIPINoCuJVuJqDre0kfB
 of5TX2KegqSx8h1vOWjGBCBjdYfPGyMdf9icf6KsGc/SlIdhN6WA99kglAjJA0Ve
 qPTNagF0ntUNg1lsXffxyfcHqFpyqw/Z/C4ie/byFsn9iJ1VG9mNlTWSud09vhuM
 3tvHzTUVAIWWuRrrgrvgqQpnwL+q5BfSDsXScMjBau0EK3RGGqG8EN6Kbkfa7VQ6
 aBoeboQjUijSJnVwvySdQ11MChTIOwZdfrNPra/1HD3WJNsSu4BIRt5JcAKcOhc=
 =qmSg
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.12-rc7' into drm-next

Linux 4.12-rc7

Needed at least rc6 for drm-misc-next-fixes, may as well go to rc7
2017-06-27 08:28:30 +10:00
Alex Deucher 52b482b0f4 drm/amdgpu: adjust default display clock
Increase the default display clock on newer asics to
accomodate some high res modes with really high refresh
rates.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2017-06-20 12:06:50 -04:00
Alex Deucher 05b4017b37 drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
We were using the wrong structure which lead to an overflow
on some boards.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2017-06-20 12:06:49 -04:00
Alex Xie 5ac55629d6 drm/amdgpu: Optimize mutex usage (v4)
In original function amdgpu_bo_list_get, the waiting
for result->lock can be quite long while mutex
bo_list_lock was holding. It can make other tasks
waiting for bo_list_lock for long period.

Secondly, this patch allows several tasks(readers of idr)
to proceed at the same time.

v2: use rcu and kref (Dave Airlie and Christian König)
v3: update v1 commit message (Michel Dänzer)
v4: rebase on upstream (Alex Deucher)

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-06-19 17:31:22 -04:00
Alex Xie 99eea4df90 drm/amdgpu: Optimization of AMDGPU_BO_LIST_OP_CREATE (v2)
v2: Remove duplication of zeroing of bo list (Christian König)
    Move idr_alloc function to end of ioctl (Christian König)
    Call kfree bo_list when amdgpu_bo_list_set return error.
    Combine the previous two patches into this patch.
    Add amdgpu_bo_list_set function prototype.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-06-19 17:31:22 -04:00
Junshan Fang 6e88491cf2 drm/amdgpu: add Polaris12 DID
Signed-off-by: Junshan Fang <Junshan.Fang@amd.com>
Reviewed-by: Roger.He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-19 15:57:50 -04:00
Dave Airlie 660e855813 amdgpu: use drm sync objects for shared semaphores (v6)
This creates a new command submission chunk for amdgpu
to add in and out sync objects around the submission.

Sync objects are managed via the drm syncobj ioctls.

The command submission interface is enhanced with two new
chunks, one for syncobj pre submission dependencies,
and one for post submission sync obj signalling,
and just takes a list of handles for each.

This is based on work originally done by David Zhou at AMD,
with input from Christian Konig on what things should look like.

In theory VkFences could be backed with sync objects and
just get passed into the cs as syncobj handles as well.

NOTE: this interface addition needs a version bump to expose
it to userspace.

TODO: update to dep_sync when rebasing onto amdgpu master.
(with this - r-b from Christian)

v1.1: keep file reference on import.
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer.
v3: make more robust against CS failures, we now add the
wait sems but only remove them once the CS job has been
submitted.
v4: rewrite names of API and base on new syncobj code.
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences
in post deps stage (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:32 -04:00
Dave Airlie 6f0308ebc1 amdgpu/cs: split out fence dependency checking (v2)
This just splits out the fence depenency checking into it's
own function to make it easier to add semaphore dependencies.

v2: rebase onto other changes.

v1-Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:31 -04:00
Alex Deucher 64dab074fe drm/amdgpu: don't check the default value for vm size
Avoids printing spurious messages like this:
[    3.102059] amdgpu 0000:01:00.0: VM size (-1) must be a power of 2

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:00 -04:00
Dave Airlie 925344ccc9 Linux 4.12-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh
 biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G
 kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU
 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+
 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E
 W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4=
 =m2Gx
 -----END PGP SIGNATURE-----

BackMerge tag 'v4.12-rc5' into drm-next

Linux 4.12-rc5 for nouveau fixes
2017-06-16 13:58:27 +10:00
Dave Airlie 04d4fb5fa6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
  for SI and CIK
- Bug fixes
- General code cleanups

[airlied: dropped drmP.h header from one file was needed and build broke]

* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
  drm/amdgpu: Fix compiler warnings
  drm/amdgpu: vm_update_ptes remove code duplication
  drm/amd/amdgpu: Port VCN over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
  drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
  drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
  drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
  drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
  drm/amd/amdgpu: Add offset variant to SOC15 macros
  drm/amd/powerplay: add avfs control for Vega10
  drm/amdgpu: add virtual display support for raven
  drm/amdgpu/gfx9: fix compute ring doorbell index
  drm/amd/amdgpu: Rename KIQ ring to avoid spaces
  drm/amd/amdgpu: gfx9 tidy ups (v2)
  drm/amdgpu: add contiguous flag in ucode bo create
  drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
  drm/amdgpu: export test ib debugfs interface
  ...
2017-06-16 09:56:53 +10:00
Harish Kasiviswanathan a1924005a2 drm/amdgpu: Fix compiler warnings
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:36 -04:00
Harish Kasiviswanathan 370f092f30 drm/amdgpu: vm_update_ptes remove code duplication
CPU and GPU paths were mostly the same.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:35 -04:00
Tom St Denis 0ad6f0d387 drm/amd/amdgpu: Port VCN over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:35 -04:00
Tom St Denis c5c1effd85 drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:34 -04:00
Tom St Denis 3176810d60 drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:34 -04:00
Tom St Denis ba7d5a22a6 drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:33 -04:00
Tom St Denis db0c4d26d6 drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:32 -04:00
Tom St Denis 4ad5751a6c drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:32 -04:00
Tom St Denis deca8322f1 drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:31 -04:00
Tom St Denis 805cb75ccb drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:31 -04:00
Tom St Denis f7047402d1 drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:30 -04:00
Tom St Denis 496828e786 drm/amd/amdgpu: Add offset variant to SOC15 macros
Allows reading/writing via SOC15 macros with offset for
various register banks.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:30 -04:00
Alex Deucher d67fed1618 drm/amdgpu: add virtual display support for raven
Same as other asics.  If enabled, exposes a user selectable
number of virtual displays.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:28 -04:00
Alex Deucher 7366af81da drm/amdgpu/gfx9: fix compute ring doorbell index
This got lost when the code was revamped.  Copy/paste bug from
gfx8.

Reported-by: Evan Quan <evan.quan@amd.com>
Fixes: 78c168342 (drm/amdgpu: allow split of queues with kfd at queue granularity v4)
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:28 -04:00
Tom St Denis 2119d0db59 drm/amd/amdgpu: Rename KIQ ring to avoid spaces
Swap space for underscore in ring name.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:27 -04:00
Tom St Denis e5475e16eb drm/amd/amdgpu: gfx9 tidy ups (v2)
A couple of simple tidy ups to register programming.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

(v2): Avoid using 'data' uninitialized

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:26 -04:00
horchen 948edf0951 drm/amdgpu: add contiguous flag in ucode bo create
Under VF environment, the ucode would be settled to the visible VRAM,
As it would be pinned to the visible VRAM, it's better to add
contiguous flag,otherwise it need to move gpu address during the pin
process. This movement is not necessary.

Signed-off-by: horchen <horace.chen@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:26 -04:00
Huang Rui ab4fe3e1f9 drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
gpu_info firmware is released after data is used. But when system enters into
suspend, upper class driver will cache all firmware names. At that time,
gpu_info will be failing to load. It seems an upper class issue, that we should
not release gpu_info firmware until device finished.

[  903.236589] cache_firmware: amdgpu/vega10_sdma1.bin
[  903.236590] fw_set_page_data: fw-amdgpu/vega10_sdma1.bin buf=ffff88041eee10c0 data=ffffc90002561000 size=17408
[  903.236591] cache_firmware: amdgpu/vega10_sdma1.bin ret=0
[  903.464160] __allocate_fw_buf: fw-amdgpu/vega10_gpu_info.bin buf=ffff88041eee2c00
[  903.471815] (NULL device *): loading /lib/firmware/updates/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2
[  903.482870] (NULL device *): loading /lib/firmware/updates/amdgpu/vega10_gpu_info.bin failed with error -2
[  903.492716] (NULL device *): loading /lib/firmware/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2
[  903.503156] (NULL device *): direct-loading amdgpu/vega10_gpu_info.bin

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:25 -04:00
Huang Rui 4f0955fcc0 drm/amdgpu: export test ib debugfs interface
As Christian and David's suggestion, submit the test ib ring debug interfaces.
It's useful for debugging with the command submission without VM case.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:25 -04:00
Hawking Zhang 2bbec882c2 drm/amdgpu: avoid to reset wave_front_size to 0
No need to clear it.  The values are set explicitly.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:23 -04:00
Hawking Zhang 51fd037067 drm/amdgpu: add new member in gpu_info fw
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:22 -04:00
Mario Kleiner bea1041393 drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
Commit d63c277dc6
("drm/amdgpu: Make display watermark calculations more accurate")
made watermark calculations more accurate, but not for > 4k
resolutions on 32-Bit architectures, as it introduced an integer
overflow for those setups and resolutions.

Fix this by proper u64 casting and division.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Fixes: d63c277dc6 ("drm/amdgpu: Make display watermark calculations more accurate")
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-14 09:25:05 -04:00
Alex Deucher d0c55cdf4f drm/amdgpu/gfx: fix MEC interrupt enablement for pipes != 0
The interrupt registers are not indexed.

Fixes: 763a47b8e (drm/amdgpu: teach amdgpu how to enable interrupts for any pipe v3)
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:30:42 -04:00
Alex Xie 0fa4955838 drm/amdgpu: move comment to the right place
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:30:27 -04:00
Alex Xie eb0f0373e5 drm/amdgpu: fix a typo in comment
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:30:20 -04:00
Alex Xie a7dba6483d drm/amdgpu: remove duplicate function prototypes
There are two identical function prototypes in same header file

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:30:13 -04:00
Harish Kasiviswanathan b4d42511b7 drm/amdgpu: Support page table update via CPU
v2: Fix logical mistake. If CPU update failed amdgpu_vm_bo_update_mapping()
would not return and instead fall through to SDMA update. Minor change due to
amdgpu_vm_bo_wait() prototype change

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:30:04 -04:00
Harish Kasiviswanathan 3c8241722b drm/amdgpu: Support page directory update via CPU
If amdgpu.vm_update_context param is set to use CPU, then Page
Directories will be updated by CPU instead of SDMA

v2: Call amdgpu_vm_bo_wait before updating the page tables to ensure the
PD/PT BOs are free

v3: Minor changes - due to amdgpu_vm_bo_wait() prototype change, local
variable declaration order and function comments.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:29:55 -04:00
Harish Kasiviswanathan a6583af4ae drm/amdgpu: Add amdgpu_sync_wait
v2: Add intr option

Helper function useful for CPU update of VM page tables. Also useful if
kernel have to synchronously wait till VM page tables are updated.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:29:46 -04:00
Harish Kasiviswanathan 9a4b7d4c76 drm/amdgpu: Add vm context module param
Add VM update mode module param (amdgpu.vm_update_mode) that can used to
control how VM pde/pte are updated for Graphics and Compute.

BIT0 controls Graphics and BIT1 Compute.
 BIT0 [= 0] Graphics updated by SDMA [= 1] by CPU
 BIT1 [= 0] Compute updated by SDMA [= 1] by CPU

By default, only for large BAR system vm_update_mode = 2, indicating
that Graphics VMs will be updated via SDMA and Compute VMs will be
updated via CPU. And for all all other systems (by default)
vm_update_mode = 0

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-09 11:29:38 -04:00
Alex Deucher b58c11314a drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev
Open code them so we can adjust the order in the
driver more easily.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-08 10:54:39 -04:00
Alex Deucher 3c50c28732 drm/amdgpu: call pci_[un]register_driver() directly
Rather than calling the deprecated drm_pci_init() and
drm_pci_exit() which just wrapped the pci functions
anyway.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-08 10:54:38 -04:00
Michel Dänzer 2b059658d6 drm/amdgpu/radeon: Use radeon by default for CIK GPUs
Even if CONFIG_DRM_AMDGPU_CIK is enabled.

There is no feature parity yet for CIK, in particular amdgpu doesn't
support HDMI/DisplayPort audio without DC.

v2:
* Clarify the lack of feature parity being related to HDMI/DP audio.
* Fix "SI" typo in DRM_AMDGPU_CIK help entry.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2017-06-08 10:54:37 -04:00
Felix Kuehling ef789173cb drm/amdgpu: Update Kconfig help for SI and CIK support
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2017-06-08 10:54:35 -04:00
Felix Kuehling 6dd1309683 drm/amdgpu: Add module param to control SI support
If AMDGPU supports SI, add a module parameter to control SI
support. It's off by default in AMDGPU as long as SI suppost is
experimental, while it is on by default in radeon.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>

[ Michel Dänzer: Squash in amdgpu_si_support initialization fix ]
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-08 10:54:34 -04:00
Felix Kuehling 7df289865c drm/amdgpu: Add module param to control CIK support
If AMDGPU supports CIK, add a module parameter to control CIK
support. It's on by default in AMDGPU, while it is off by default
in radeon.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2017-06-08 10:54:33 -04:00
Alex Deucher b9683c21f6 drm/amdgpu/gfx: consolidate mqd buffer setup code
It was duplicated across multiple generations.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 18:20:59 -04:00
Alex Deucher 4853bbb6fb drm/amdgpu/gfx: move mec parameter setup into sw_init
This will allow us to share more mec code.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 18:20:58 -04:00
Alex Deucher 71c37505e7 drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c
Lots more common stuff.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 18:20:41 -04:00
Alex Deucher 2db0cdbe28 drm/amdgpu: move mec queue helpers to amdgpu_gfx.h
They are gfx related, not general helpers.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 18:02:06 -04:00
Alex Deucher ee04fac3b7 drm/amdgpu/gfx9: remove spurious line in kiq setup
This overrode what queue was actually assigned for kiq.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 18:01:59 -04:00
Alex Deucher d6b20c8769 drm/amdgpu/gfx8: whitespace change
Make it consistent.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 18:01:49 -04:00
Alex Deucher 5e7c8b0676 drm/amdgpu/gfx9: Raven has two MECs
This was missed when Andres' queue patches were rebased.

Fixes: 42794b27 (drm/amdgpu: take ownership of per-pipe configuration v3)
Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 15:43:38 -04:00
Alex Deucher 41f6a99abd drm/amdgpu: move gfx_v*_0_compute_queue_acquire to common code
Same function was duplicated in all gfx IP files.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 15:43:28 -04:00
Alex Deucher cf8b611f55 drm/amdgpu: fix mec queue policy on single MEC asics
Fixes hangs on single MEC asics.

Fixes: 2ed286fb434 (drm/amdgpu: new queue policy, take first 2 queues of each pipe v2)
Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 15:43:11 -04:00
Alex Deucher 378506a7e6 drm/amdgpu/gfx: create a common bitmask function (v2)
The same function was duplicated in all the gfx IPs. Use
a single implementation for all.

v2: use static inline (Alex Xie)

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Suggested-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 00:04:35 -04:00
Alex Deucher 943c05bdb5 drm/amdgpu/gfx8: drop per-APU CU limits
Always use the max for the family rather than the per sku limits.
This makes sure the mask is always the max size to avoid reporting
the wrong number of CUs.

Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-07 00:04:21 -04:00
Alex Deucher 6653ebd48f drm/amdgpu/gfx6: properly cache mc_arb_ramcfg
This was missing for gfx6.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2017-06-06 17:01:47 -04:00
Alex Deucher a7049de1e8 drm/amdgpu/gfx9: new queue policy, take first 2 queues of each pipe
Instead of taking the first pipe and giving the rest to kfd, take the
first 2 queues of each pipe.

Effectively, amdgpu and amdkfd own the same number of queues. But
because the queues are spread over multiple pipes the hardware will be
able to better handle concurrent compute workloads.

amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute threads to 4
amdkfd goes from 3 pipe to 4 pipes, i.e. from 3 compute threads to 4

gfx9 was missed when this patch set was rebased to include gfx9.

Acked-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 17:01:43 -04:00
Alex Deucher 1361f45531 drm/amdgpu/gfx9: allocate queues horizontally across pipes
Pipes provide better concurrency than queues, therefore we want to make
sure that apps use queues from different pipes whenever possible.

Optimize for the trivial case where an app will consume rings in order,
therefore we don't want adjacent rings to belong to the same pipe.

gfx9 was missed when these patches were rebased.

Reviewed-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 17:01:37 -04:00
Huang Rui b9509c80df drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for gmc9
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 17:00:35 -04:00
Huang Rui 2a4191833e drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 17:00:28 -04:00
Huang Rui 89f99cebc4 drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for gfxhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 17:00:21 -04:00
Huang Rui 916910ad91 drm/amdgpu: fix the gart table cleared issue for S3
Something writes over the first 8 MB so reserve this
on vega10 until we root cause it.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:59:30 -04:00
Huang Rui a0bae3577f drm/amdgpu: add ip block number prints
User is able to follow the ip block number to write the ip_block_mask for
selecting the one which user would like to enable.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:59:23 -04:00
Huang Rui ed8cf00ce4 drm/amdgpu: add ip name print for selecting ips with ip_block_mask
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:59:16 -04:00
Huang Rui 1191d110c3 drm/amdgpu: remove mmhub ip
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:59:09 -04:00
Huang Rui 373f592325 drm/amdgpu: remove gfxhub ip
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:59:03 -04:00
Huang Rui 13052be59a drm/amdgpu: export mmhub get clockgating into gmc
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:56 -04:00
Huang Rui d5583d4f69 drm/amdgpu: export mmhub set clockgating into gmc
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:49 -04:00
Huang Rui 77f6c76370 drm/amdgpu: export mmhub sw_init into gmc
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:43 -04:00
Huang Rui 0c8c0847cc drm/amdgpu: export gfxhub sw_init into gmc
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:36 -04:00
Huang Rui 1e4eccdaf2 drm/amdgpu: fix to miss program invalidation at resume
This patch moves invalidation into gart enable function from hw_init.
Because we would like align the sequence calling between init and resume.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:29 -04:00
Huang Rui 3dff4cc4b0 drm/amdgpu: abstract setup vmid config for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:23 -04:00
Huang Rui d5c87390f1 drm/amdgpu: abstract disable identity aperture for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:16 -04:00
Huang Rui 02c4704bd2 drm/amdgpu: abstract system domain enablement for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:09 -04:00
Huang Rui 41f6f31111 drm/amdgpu: abstract cache initialization for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:58:03 -04:00
Huang Rui 3426983939 drm/amdgpu: abstract TLB initialization for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:57:56 -04:00
Huang Rui fc4b884b26 drm/amdgpu: abstract system aperture initialization for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:57:49 -04:00
Huang Rui 9bbad6fda0 drm/amdgpu: abstract gart aperture initialization for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:57:41 -04:00
Huang Rui a51dca4f21 drm/amdgpu: abstract gart table initialization for gfxhub/mmhub
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06 16:57:09 -04:00
Linus Torvalds 2f48641cfc Use designated initializers for mtk-vcodec, powerplay, amdgpu, and sgi-xp.
Use ERR_CAST() to avoid cross-structure cast in ocf2, ntfs, and NFS.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZMHWdAAoJEIly9N/cbcAmWOYP/i45fa6JG7Aw9N59Uz4sqeUQ
 ZUlvAUek6GkaGijCPtDYjy0cVj2Cc3QZLSRq9dDw/rU66Mc0ybYWHtIIwJy4ZjVe
 D4w2Cs7K1oSOnhJnPTjQSKuMD81PF75NLChf3XSfLvtOWVIqW33EzLIu5lJ1rc1x
 wh1fEAsJXGA9xklmW+m8Vn1FoS1a1j+9zuCEmGpveOkk6UKhhp73Ke8PP4uK9ld+
 saApe/iH0JdTP6I7030A8hXwz7ZCYbMicw1kVpnsn4rM24p+k3Y2/OrFT2tY6/Y6
 fzkTuVL7omQmUWph9zX6SYPg2GACEBTLb5V1YJ6zDUUzucu7vjfsvsTHXZb1gq2j
 i8hZ6XsNOMWYJiOkOOSKM0rpjG6WSvF/sGc78ap7NJ4QPZ2/h3BTOXfk/ye/xQmL
 WidEESJ4srInpi5ju8JTWHe27aydwiUUF91Y+gFv4G6CGU6/5vjUzOsgeiMxt0JN
 lPaTjjL4lBHI2yohx2Wqy88yYWulK3LB0Hzt9XcSGMBA58H9d0CV0ZTkH3dJJkpC
 QCM+Kt1DPy5A2RPC2APrPPCJsQycX9PSDeRaWkTxHnNLftpq65h1pAKjMcqsUPgb
 HEEMLIBGqm871dr3+aPJPfG3Qil9ANBscDRbHXugCFTseFQO6M26KAxWGN+6LIQp
 6Z0GUaPgJEua9ejodq4m
 =R3qn
 -----END PGP SIGNATURE-----

Merge tag 'gcc-plugins-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull gcc-plugin prepwork from Kees Cook:
 "Use designated initializers for mtk-vcodec, powerplay, amdgpu, and
  sgi-xp. Use ERR_CAST() to avoid cross-structure cast in ocf2, ntfs,
  and NFS.

  Christoph Hellwig recommended that I send these fixes now, rather than
  waiting for the v4.13 merge window. These are all initializer and cast
  fixes needed for the future randstruct plugin that haven't been picked
  up by the respective maintainers"

* tag 'gcc-plugins-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mtk-vcodec: Use designated initializers
  drm/amd/powerplay: Use designated initializers
  drm/amdgpu: Use designated initializers
  sgi-xp: Use designated initializers
  ocfs2: Use ERR_CAST() to avoid cross-structure cast
  ntfs: Use ERR_CAST() to avoid cross-structure cast
  NFS: Use ERR_CAST() to avoid cross-structure cast
2017-06-01 16:17:42 -07:00
Leo Liu a107ebf61e drm/amdgpu: add saved_bo to save vce 4.0 context when suspend
We are using PSP to resume firmware after suspend, and it is
resumed at where it got suspended, so we'd better save the
the context.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01 16:00:22 -04:00
Leo Liu 78b3c83983 drm/amdgpu: use existing function amdgpu_bo_create_kernel
To simplify vce bo create

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01 16:00:21 -04:00
Leo Liu 91415a09ab drm/amdgpu: add vcpu_bo cpu address for vce
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01 16:00:21 -04:00
Alex Xie e59c020598 drm/amdgpu: Move compute vm bug logic to amdgpu_vm.c
In review, Christian would like to keep the logic
  inside amdgpu_vm.c with a cost of slightly slower.
  The loop is still optimized out with this patch.

v2: remove the if statement. Now it is not slower.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Christian König <christian.koeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01 16:00:20 -04:00
Andres Rodriguez 90c1130953 drm/amdgpu: use LRU mapping policy for SDMA engines
Spreading the load across multiple SDMA engines can increase memory
transfer performance.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:49:04 -04:00
Andres Rodriguez 6065343a11 drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3
Depending on usage patterns, the current LRU policy may create a
non-injective mapping between userspace ring ids and kernel rings.

This behaviour is undesired as apps that attempt to fill all HW blocks
would be unable to reach some of them.

This change forces the LRU policy to create bijective mappings only.

v2: compress ring_blacklist
v3: simplify amdgpu_ring_is_blacklisted() logic

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:49:03 -04:00
Andres Rodriguez 795f2813e6 drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4
Use an LRU policy to map usermode rings to HW compute queues.

Most compute clients use one queue, and usually the first queue
available. This results in poor pipe/queue work distribution when
multiple compute apps are running. In most cases pipe 0 queue 0 is
the only queue that gets used.

In order to better distribute work across multiple HW queues, we adopt
a policy to map the usermode ring ids to the LRU HW queue.

This fixes a large majority of multi-app compute workloads sharing the
same HW queue, even though 7 other queues are available.

v2: use ring->funcs->type instead of ring->hw_ip
v3: remove amdgpu_queue_mapper_funcs
v4: change ring_lru_list_lock to spinlock, grab only once in lru_get()

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:49:02 -04:00
Andres Rodriguez effd924d2f drm/amdgpu: untie user ring ids from kernel ring ids v6
Add amdgpu_queue_mgr, a mechanism that allows disjointing usermode's
ring ids from the kernel's ring ids.

The queue manager maintains a per-file descriptor map of user ring ids
to amdgpu_ring pointers. Once a map is created it is permanent (this is
required to maintain FIFO execution guarantees for a context's ring).

Different queue map policies can be configured for each HW IP.
Currently all HW IPs use the identity mapper, i.e. kernel ring id is
equal to the user ring id.

The purpose of this mechanism is to distribute the load across multiple
queues more effectively for HW IPs that support multiple rings.
Userspace clients are unable to check whether a specific resource is in
use by a different client. Therefore, it is up to the kernel driver to
make the optimal choice.

v2: remove amdgpu_queue_mapper_funcs
v3: made amdgpu_queue_mgr per context instead of per-fd
v4: add context_put on error paths
v5: rebase and include new IPs UVD_ENC & VCN_*
v6: drop unused amdgpu_ring_is_valid_index (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:49:01 -04:00
Andres Rodriguez ecd910eb1f drm/amdgpu: workaround tonga HW bug in HQD programming sequence
Tonga based asics may experience hangs when an HQD's EOP parameters
are modified.

Workaround this HW issue by avoiding writes to these registers for
tonga asics.

Based on the following ROCm commit:
2a0fb8 - drm/amdgpu: Synchronize KFD HQD load protocol with CP scheduler

From the ROCm git repository:
https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver.git

CC: Jay Cornwall <Jay.Cornwall@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:49:00 -04:00
Andres Rodriguez 894700f3b7 drm/amdgpu: condense mqd programming sequence
The MQD structure matches the reg layout. Take advantage of this to
simplify HQD programming.

Note that the ACTIVE field still needs to be programmed last.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:59 -04:00
Andres Rodriguez 0a281f5a2c drm/amdgpu: new queue policy, take first 2 queues of each pipe v2
Instead of taking the first pipe and giving the rest to kfd, take the
first 2 queues of each pipe.

Effectively, amdgpu and amdkfd own the same number of queues. But
because the queues are spread over multiple pipes the hardware will be
able to better handle concurrent compute workloads.

amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute threads to 4
amdkfd goes from 3 pipe to 4 pipes, i.e. from 3 compute threads to 4

v2: fix policy comment

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:59 -04:00
Andres Rodriguez 7b2124a5dd drm/amdgpu: avoid KIQ clashing with compute or KFD queues v2
Instead of picking an arbitrary queue for KIQ, search for one according
to policy. The queue must be unused.

Also report the KIQ as an unavailable resource to KFD.

In testing I ran into KCQ initialization issues when using pipes 2/3 of
MEC2 for the KIQ. Therefore the policy disallows grabbing one of these.

v2: fix (ring.me + 1) to (ring.me -1) in amdgpu_amdkfd_device_init

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:58 -04:00
Andres Rodriguez de65513af1 drm/amdgpu: remove hardcoded queue_mask in PACKET3_SET_RESOURCES
The assumption that we are only using the first pipe no longer holds.
Instead, calculate the queue_mask from the queue_bitmap.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:57 -04:00
Andres Rodriguez e33fec4835 drm/amdgpu: allocate queues horizontally across pipes
Pipes provide better concurrency than queues, therefore we want to make
sure that apps use queues from different pipes whenever possible.

Optimize for the trivial case where an app will consume rings in order,
therefore we don't want adjacent rings to belong to the same pipe.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31 16:48:56 -04:00