Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds support for SET_APPEND_CNT packet3 to the VM paths.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds support to the command parser for the set append counter
packet3, this is required to support atomic counters on
evergreen/cayman GPUs.
v2: fixup some of the hardcoded numbers with real register names
(Christian)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I hate doing this but it hurts my eyes to go over code that does not
comply with indentation rules. Only thing that is not only space change
is in atom.c all other files are space indentation issues.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There doesn't seem to be any need to have 'ib' volatile, the code is
not even consistent with it and some places already miss it. As it is
now it's just making gcc produce worse code. If there are special
requirements for that memory, then proper primitives like memory
barriers or accessor functions should be used, but it doesn't look
like that is needed here.
While at it, change the type to match the one in radeon_ib structure.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After this patch the register check loop does the same thing as before,
except that now gcc does better job optimizing it: it now sees that
end_reg was already checked against PACKET3_SET_CONTEXT_REG_END and can
optimize REG_SAFE_BM_SIZE comparison out of evergreen_is_safe_reg()
as (PACKET3_SET_CONTEXT_REG_END >> 7) < REG_SAFE_BM_SIZE.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
evergreen_cs_check_reg() is a large function and gcc doesn't want to
inline it. It has a quick check for reg_safe_bm[] to see if register
needs special handling, which often results in early exit. However
because the function is large, it has a long prologue/epilogue to
save/restore all the callee-save registers which according to perf is
taking significant amount of time. To avoid this, we can reuse
evergreen_is_safe_reg() to do the early check directly in register loop.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To avoid having to distinguish between CAYMAN or older on every register
check, place a pointer in evergreen_cs_track and use it unconditionally.
Also make use of the fact that both reg_safe_bm[] arrays are of the same
length to remove another CAYMAN check.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the necessary set of commands to support OpenGL
indirect draw calls on evergreen/cayman devices that
do not have VM.
v2: agd5f: fix warning on 32-bit
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nobody is interested at which index the chunk is. What's needed is
a pointer to the chunk. Remove unused chunk_id field as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Better match what it is actually doing.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just move all fields into radeon_cs_reloc, removing unused/duplicated fields.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This fixes a bug which was causing rejections of valid GPU commands
from userspace.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The MIP_ADDRESS state has 2 meanings. If the texture has one sample
per pixel, it's a pointer to the mipmap chain. If the texture has
multiple samples per pixel, it's a pointer to FMASK, a metadata buffer
needed for reading compressed MSAA textures. The mipmap
alignment rules do not apply to FMASK.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull drm merge from Dave Airlie:
"Highlights:
- TI LCD controller KMS driver
- TI OMAP KMS driver merged from staging
- drop gma500 stub driver
- the fbcon locking fixes
- the vgacon dirty like zebra fix.
- open firmware videomode and hdmi common code helpers
- major locking rework for kms object handling - pageflip/cursor
won't block on polling anymore!
- fbcon helper and prime helper cleanups
- i915: all over the map, haswell power well enhancements, valleyview
macro horrors cleaned up, killing lots of legacy GTT code,
- radeon: CS ioctl unification, deprecated UMS support, gpu reset
rework, VM fixes
- nouveau: reworked thermal code, external dp/tmds encoder support
(anx9805), fences sleep instead of polling,
- exynos: all over the driver fixes."
Lovely conflict in radeon/evergreen_cs.c between commit de0babd60d
("drm/radeon: enforce use of radeon_get_ib_value when reading user cmd")
and the new changes that modified that evergreen_dma_cs_parse()
function.
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (508 commits)
drm/tilcdc: only build on arm
drm/i915: Revert hdmi HDP pin checks
drm/tegra: Add list of framebuffers to debugfs
drm/tegra: Fix color expansion
drm/tegra: Split DC_CMD_STATE_CONTROL register write
drm/tegra: Implement page-flipping support
drm/tegra: Implement VBLANK support
drm/tegra: Implement .mode_set_base()
drm/tegra: Add plane support
drm/tegra: Remove bogus tegra_framebuffer structure
drm: Add consistency check for page-flipping
drm/radeon: Use generic HDMI infoframe helpers
drm/tegra: Use generic HDMI infoframe helpers
drm: Add EDID helper documentation
drm: Add HDMI infoframe helpers
video: Add generic HDMI infoframe helpers
drm: Add some missing forward declarations
drm: Move mode tables to drm_edid.c
drm: Remove duplicate drm_mode_cea_vic()
gma500: Fix n, m1 and m2 clock limits for sdvo and lvds
...
When ever parsing cmd buffer supplied by userspace we need to use
radeon_get_ib_value rather than directly accessing the ib as the user
cmd might not yet be copied into the ib thus the parser might read
value that does not correspond to what user is sending and possibly
allowing user to send malicious command undected.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This simplify and cleanup the async dma checking.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After refactoring the _cs logic, we ended up with many
macros and constants that #define the same thing.
Clean'em up.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch eliminates ASIC-specific ***_cs_packet_next_reloc
functions and hooks up the new common function.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
WAIT_REG_MEM on register does not allow the use of PFP.
Enforce this restriction when checking packets sent from
userland.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
vline packet parsing function for R600 and Evergreen+ are
the same, except that they use different registers. Factor
out the algorithm into a common function that uses register
table passed from ASIC-specific caller.
This reduces ASIC-specific function to (trivial) setup
of register table and call into the common function.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Once we factored out radeon_cs_packet_parse function,
evergreen_cs_next_is_pkt3_nop and r600_cs_next_is_pkt3_nop
functions became identical, so they can be factored out
into a common function.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We now have a common radeon_cs_packet_parse function
that is good for all ASICs. Hook it up and eliminate
ASIC-specific versions.
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To make it easier to debug some lockup from userspace add support
to MEM_WRITE packet.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows us to use the DMA ring from userspace.
DMA doesn't have a good NOP packet in which to embed the
reloc idx, so userspace has to add a reloc for each
buffer used and order them to match the command stream.
v2: fix address bounds checking
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the size computation of the htile buffer.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r1@
statement S;
position p,p1;
@@
S@p1;@p
@script:python r2@
p << r1.p;
p1 << r1.p1;
@@
if p[0].line != p1[0].line_end:
cocci.include_match(False)
@@
position r1.p;
@@
-;@p
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull drm merge (part 1) from Dave Airlie:
"So first of all my tree and uapi stuff has a conflict mess, its my
fault as the nouveau stuff didn't hit -next as were trying to rebase
regressions out of it before we merged.
Highlights:
- SH mobile modesetting driver and associated helpers
- some DRM core documentation
- i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
combined pte writing, ilk rc6 support,
- nouveau: major driver rework into a hw core driver, makes features
like SLI a lot saner to implement,
- psb: add eDP/DP support for Cedarview
- radeon: 2 layer page tables, async VM pte updates, better PLL
selection for > 2 screens, better ACPI interactions
The rest is general grab bag of fixes.
So why part 1? well I have the exynos pull req which came in a bit
late but was waiting for me to do something they shouldn't have and it
looks fairly safe, and David Howells has some more header cleanups
he'd like me to pull, that seem like a good idea, but I'd like to get
this merge out of the way so -next dosen't get blocked."
Tons of conflicts mostly due to silly include line changes, but mostly
mindless. A few other small semantic conflicts too, noted from Dave's
pre-merged branch.
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
drm/nv98/crypt: fix fuc build with latest envyas
drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
drm/nv41/vm: fix and enable use of "real" pciegart
drm/nv44/vm: fix and enable use of "real" pciegart
drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
drm/nouveau: store supported dma mask in vmmgr
drm/nvc0/ibus: initial implementation of subdev
drm/nouveau/therm: add support for fan-control modes
drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
drm/nouveau/therm: calculate the pwm divisor on nv50+
drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
drm/nouveau/therm: move thermal-related functions to the therm subdev
drm/nouveau/bios: parse the pwm divisor from the perf table
drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
drm/nouveau/therm: rework thermal table parsing
drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
drm/nouveau: fix pm initialization order
drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
drm/nouveau: log channel debug/error messages from client object rather than drm client
drm/nouveau: have drm debugging macros build on top of core macros
...
Convert #include "..." to #include <path/...> in drivers/gpu/.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
MIP_ADDRESS should point to the resolved FMASK for an MSAA texture.
Setting MIP_ADDRESS to 0 means the FMASK pointer is invalid (the GPU
won't read the memory then).
The userspace has to set MIP_ADDRESS to 0 and *not* emit any relocation
for it.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Let's allow GCC to optimize better.
This exposed some five unused functions, but this patch doesn't remove them.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Most of the checking seems to be in place already. As you can see,
log2(number of samples) resides in LAST_LEVEL.
This is required for MSAA support (namely for depth-stencil resolve and
blitting between MSAA resources).
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix regresson since the introduction of command stream checking on
evergreen (thread referenced below). Issue is cause by ddx allocating
bo with formula width*height*bpp while programming the GPU command
stream with ALIGN(height, 8). In some case (where page alignment does
not hide the extra size bo should be according to height alignment)
the kernel will reject the command stream.
This patch reprogram the command stream to slice - 1 (slice is
a derivative value from height) which avoid rejecting the command
stream while keeping the value of command stream checking from a
security point of view.
This patch also fix wrong computation of layer size for 2D tiled
surface. Which should fix issue when 2D color tiling is enabled.
This dump the radeon KMS_DRIVER_MINOR so userspace can know if
they are on a fixed kernel or not.
https://lkml.org/lkml/2012/6/3/80https://bugs.freedesktop.org/show_bug.cgi?id=50892https://bugs.freedesktop.org/show_bug.cgi?id=50857
!!! STABLE need a custom version of this patch for 3.4 !!!
v2: actually bump the minor version and add comment about stable
v3: do compute the height the ddx was trying to use
[airlied: drop left over debug]
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
No need to malloc it any more.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
For 6xx+. Required for mesa to use htile support for HiZ/HiS.
Userspace will check radeon version 2.14 with is bumped either
by tiling patch or stream out patch. This patch only add support
for htile relocation which should be enough for any userspace
to implement the hyperz (using htile buffer) feature.
v2: Jerome: Fix size checking for htile buffer.
v3: Jerome: Adapt on top of r600/evergreen cs checker changes,
also check htile surface in case only stencil is
present.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
and document the other unused ones.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There are also two fixes:
- In DRAW_INDEX_2, we read idx_value, but should have read idx+1.
- When correcting SQ_VTX_CONSTANT_WORD1_0.SIZE, we should subtract
the offset.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We store stuff in texdw[7] so this array needs to have 8 elements.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>