Merge tag 'drm-intel-gt-next-2022-04-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- GuC hwconfig support and query (John Harrison, Rodrigo Vivi, Tvrtko Ursulin)
- Sysfs support for multi-tile devices (Andi Shyti, Sujaritha Sundaresan)
- Per client GPU utilisation via fdinfo (Tvrtko Ursulin, Ashutosh Dixit)
- Add DRM_I915_QUERY_GEOMETRY_SUBSLICES (Matt Atwood)

Cross-subsystem Changes:

- Add GSC as a MEI auxiliary device (Tomas Winkler, Alexander Usyskin)

Core Changes:

- Document fdinfo format specification (Tvrtko Ursulin)

Driver Changes:

- Fix prime_mmap to work when using LMEM (Gwan-gyeong Mun)
- Fix vm open count and remove vma refcount (Thomas Hellström)
- Fixup setting screen_size (Matthew Auld)
- Opportunistically apply ALLOC_CONTIGUOUS (Matthew Auld)
- Limit where we apply TTM_PL_FLAG_CONTIGUOUS (Matthew Auld)
- Drop aux table invalidation on FlatCCS platforms (Matt Roper)
- Add missing boundary check in vm_access (Mastan Katragadda)
- Update topology dumps for Xe_HP (Matt Roper)
- Add support for steered register writes (Matt Roper)
- Add steering info to GuC register save/restore list (Daniele Ceraolo Spurio)
- Small PCI BAR enabling (Matthew Auld, Akeem G Abodunrin, CQ Tang)
- Add preemption changes for Wa_14015141709 (Akeem G Abodunrin)
- Add logical mapping for video decode engines (Matthew Brost)
- Don't evict unmappable VMAs when pinning with PIN_MAPPABLE (v2) (Vivek Kasireddy)
- GuC error capture support (Alan Previn, Daniele Ceraolo Spurio)
- avoid concurrent writes to aux_inv (Fei Yang)
- Add Wa_22014226127 (José Roberto de Souza)
- Sunset igpu legacy mmap support based on GRAPHICS_VER_FULL (Matt Roper)
- Evict and restore of compressed objects (Ramalingam C)
- Update to GuC version 70.1.1 (John Harrison)
- Add Wa_22011802037 force cs halt (Tilak Tangudu)
- Enable Wa_22011802037 for gen12 GuC based platforms (Umesh Nerlige Ramappa)
- GuC based workarounds for DG2 (Vinay Belgaumkar, John Harrison, Matthew Brost, José Roberto de Souza)
- consider min_page_size when migrating (Matthew Auld)

- Prep work for next GuC firmware release (John Harrison)
- Support platforms with CCS engines but no RCS (Matt Roper, Stuart Summers)
- Don't overallocate subslice storage (Matt Roper)
- Reduce stack usage in debugfs due to SSEU (John Harrison)
- Report steering details in debugfs (Matt Roper)
- Refactor some x86-ism out to prepare for non-x86 builds (Michael Cheng)
- add lmem_size modparam (CQ Tang)
- Refactor for non-x86 driver builds (Casey Bowman)
- Centralize computation of freq caps (Ashutosh Dixit)

- Update dma_buf_ops.unmap_dma_buf callback to use drm_gem_unmap_dma_buf() (Gwan-gyeong Mun)
- Limit the async bind to bind_async_flags (Matthew Auld)
- Stop checking for NULL vma->obj (Matthew Auld)
- Reduce overzealous alignment constraints for GGTT (Matthew Auld)
- Remove GEN12_SFC_DONE_MAX from register defs header (Matt Roper)
- Fix renamed struct field (Lucas De Marchi)
- Do not return '0' if there is nothing to return (Andi Shyti)
- fix i915_reg_t initialization (Jani Nikula)
- move the migration sanity check (Matthew Auld)
- handle more rounding in selftests (Matthew Auld)
- Perf and i915 query kerneldoc updates (Matt Roper)
- Use i915_probe_error instead of drm_err (Vinay Belgaumkar)
- sanity check object size in the buddy allocator (Matthew Auld)
- fixup selftests min_alignment usage (Matthew Auld)
- tweak selftests misaligned_case (Matthew Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/i915_vma.c
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Ymkfy8FjsG2JrodK@tursulin-mobl2
Dave Airlie, 2022-04-28 15:32:29 +10:00
Parents: 4eaf02db9c f15856d7de
Commit: 9bda072a7b
143 changed files: 8171 additions, 2340 deletions


@ -0,0 +1,112 @@
.. _drm-client-usage-stats:
======================
DRM client usage stats
======================
DRM drivers can choose to export partly standardised text output via the
`fops->show_fdinfo()` as part of the driver specific file operations registered
in the `struct drm_driver` object registered with the DRM core.
One purpose of this output is to enable writing as generic as practically
feasible `top(1)`-like userspace monitoring tools.
Given the differences between various DRM drivers the specification of the
output is split between common and driver specific parts. Having said that,
wherever possible effort should still be made to standardise as much as
possible.
File format specification
=========================
- File shall contain one key value pair per one line of text.
- Colon character (`:`) must be used to delimit keys and values.
- All keys shall be prefixed with `drm-`.
- Whitespace between the delimiter and first non-whitespace character shall be
ignored when parsing.
- Neither keys nor values are allowed to contain whitespace characters.
- Numerical key value pairs can end with optional unit string.
- Data type of the value is fixed as defined in the specification.
Key types
---------
1. Mandatory, fully standardised.
2. Optional, fully standardised.
3. Driver specific.
Data types
----------
- <uint> - Unsigned integer without defining the maximum value.
- <str> - String excluding any above defined reserved characters or whitespace.
Mandatory fully standardised keys
---------------------------------
- drm-driver: <str>
String shall contain the name this driver registered as via the respective
`struct drm_driver` data structure.
Optional fully standardised keys
--------------------------------
- drm-pdev: <aaaa:bb.cc.d>
For PCI devices this should contain the PCI slot address of the device in
question.
- drm-client-id: <uint>
Unique value relating to the open DRM file descriptor used to distinguish
duplicated and shared file descriptors. Conceptually the value should map 1:1
to the in kernel representation of `struct drm_file` instances.
Uniqueness of the value shall be either globally unique, or unique within the
scope of each device, in which case `drm-pdev` shall be present as well.
Userspace should make sure to not double account any usage statistics by using
the above described criteria in order to associate data to individual clients.
- drm-engine-<str>: <uint> ns
GPUs usually contain multiple execution engines. Each shall be given a stable
and unique name (str), with possible values documented in the driver specific
documentation.
Value shall be in specified time units which the respective GPU engine spent
busy executing workloads belonging to this client.
Values are not required to be constantly monotonic if it makes the driver
implementation easier, but are required to catch up with the previously reported
larger value within a reasonable period. Upon observing a value lower than what
was previously read, userspace is expected to stay with that larger previous
value until a monotonic update is seen.
- drm-engine-capacity-<str>: <uint>
Engine identifier string must be the same as the one specified in the
drm-engine-<str> tag and shall contain a greater than zero number in case the
exported engine corresponds to a group of identical hardware engines.
In the absence of this tag parser shall assume capacity of one. Zero capacity
is not allowed.
- drm-memory-<str>: <uint> [KiB|MiB]
Each possible memory type which can be used to store buffer objects by the
GPU in question shall be given a stable and unique name to be returned as the
string here.
Value shall reflect the amount of storage currently consumed by the buffer
objects belonging to this client, in the respective memory region.
Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
indicating kibi- or mebi-bytes.
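As an illustration of the parsing rules above, a userspace tool can split each
line on the first colon, keep only `drm-` prefixed keys and discard the
whitespace following the delimiter. This is a minimal sketch only; the file
path, buffer size and error handling are assumptions, not part of the
specification::

/*
 * Minimal fdinfo key/value parser sketch (hypothetical helper, not kernel
 * code): prints every "drm-*" pair found in the given fdinfo file.
 */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/proc/self/fdinfo/3";
	char line[256];
	FILE *f = fopen(path, "r");

	if (!f) {
		perror("fopen");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		char *colon = strchr(line, ':');
		char *value;

		/* Keys are "drm-" prefixed and delimited from values by ':'. */
		if (!colon || strncmp(line, "drm-", 4))
			continue;

		*colon = '\0';
		value = colon + 1;
		/* Whitespace after the delimiter is ignored when parsing. */
		while (*value == ' ' || *value == '\t')
			value++;
		value[strcspn(value, "\n")] = '\0';

		printf("%s = %s\n", line, value);
	}

	fclose(f);
	return 0;
}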
===============================
Driver specific implementations
===============================
:ref:`i915-usage-stats`


@ -697,3 +697,31 @@ The style guide for ``i915_reg.h``.
.. kernel-doc:: drivers/gpu/drm/i915/i915_reg.h
:doc: The i915 register macro definition style guide
.. _i915-usage-stats:
i915 DRM client usage stats implementation
==========================================
The drm/i915 driver implements the DRM client usage stats specification as
documented in :ref:`drm-client-usage-stats`.
Example of the output showing the implemented key value pairs and entirety of
the currently possible format options:
::
pos: 0
flags: 0100002
mnt_id: 21
drm-driver: i915
drm-pdev: 0000:00:02.0
drm-client-id: 7
drm-engine-render: 9288864723 ns
drm-engine-copy: 2035071108 ns
drm-engine-video: 0 ns
drm-engine-capacity-video: 2
drm-engine-video-enhance: 0 ns
Possible `drm-engine-` key names are: `render`, `copy`, `video` and
`video-enhance`.
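A `top(1)`-like tool would normally sample such a file twice and work with
deltas: the growth of a `drm-engine-<str>` value over the sampling period,
divided by the period and by `drm-engine-capacity-<str>` (one when the key is
absent), gives the client's utilisation of that engine class. A minimal sketch
of that arithmetic, with sample values borrowed from the example above rather
than read from fdinfo::

/* Sketch only: per-engine utilisation from two fdinfo samples. */
#include <stdint.h>
#include <stdio.h>

/*
 * busy_prev/busy_now: drm-engine-<str> values (ns) from two reads,
 * wall_ns:            CPU time elapsed between the two reads,
 * capacity:           drm-engine-capacity-<str>, or 1 if the key is absent.
 */
static double engine_utilisation(uint64_t busy_prev, uint64_t busy_now,
				 uint64_t wall_ns, unsigned int capacity)
{
	/* Values may transiently go backwards; clamp rather than underflow. */
	uint64_t delta = busy_now > busy_prev ? busy_now - busy_prev : 0;

	if (!wall_ns || !capacity)
		return 0.0;

	return 100.0 * delta / wall_ns / capacity;
}

int main(void)
{
	/* Assumed 1s sampling period; capacity 2 matches the video example. */
	printf("render: %.1f%%\n",
	       engine_utilisation(9288864723ull, 10238864723ull,
				  1000000000ull, 1));
	printf("video:  %.1f%%\n",
	       engine_utilisation(0ull, 500000000ull, 1000000000ull, 2));
	return 0;
}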


@ -10,6 +10,7 @@ Linux GPU Driver Developer's Guide
drm-kms
drm-kms-helpers
drm-uapi
drm-usage-stats
driver-uapi
drm-client
drivers


@ -9996,6 +9996,7 @@ S: Supported
F: Documentation/driver-api/mei/*
F: drivers/misc/mei/
F: drivers/watchdog/mei_wdt.c
F: include/linux/mei_aux.h
F: include/linux/mei_cl_bus.h
F: include/uapi/linux/mei.h
F: samples/mei/*


@ -4,7 +4,7 @@ config DRM_I915
depends on DRM
depends on X86 && PCI
depends on !PREEMPT_RT
select INTEL_GTT
select INTEL_GTT if X86
select INTERVAL_TREE
# we need shmfs for the swappable backing store, and in particular
# the shmem_readpage() which depends upon tmpfs
@ -30,6 +30,7 @@ config DRM_I915
select VMAP_PFN
select DRM_TTM
select DRM_BUDDY
select AUXILIARY_BUS
help
Choose this option if you have a system that has "Intel Graphics
Media Accelerator" or "HD Graphics" integrated graphics,


@ -33,6 +33,7 @@ subdir-ccflags-y += -I$(srctree)/$(src)
# core driver code
i915-y += i915_driver.o \
i915_drm_client.o \
i915_config.o \
i915_getparam.o \
i915_ioctl.o \
@ -106,6 +107,8 @@ gt-y += \
gt/intel_gt_pm_debugfs.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
gt/intel_gt_sysfs.o \
gt/intel_gt_sysfs_pm.o \
gt/intel_gtt.o \
gt/intel_llc.o \
gt/intel_lrc.o \
@ -125,6 +128,8 @@ gt-y += \
gt/intel_workarounds.o \
gt/shmem_utils.o \
gt/sysfs_engines.o
# x86 intel-gtt module support
gt-$(CONFIG_X86) += gt/intel_gt_gmch.o
# autogenerated null render state
gt-y += \
gt/gen6_renderstate.o \
@ -185,9 +190,11 @@ i915-y += gt/uc/intel_uc.o \
gt/uc/intel_uc_fw.o \
gt/uc/intel_guc.o \
gt/uc/intel_guc_ads.o \
gt/uc/intel_guc_capture.o \
gt/uc/intel_guc_ct.o \
gt/uc/intel_guc_debugfs.o \
gt/uc/intel_guc_fw.o \
gt/uc/intel_guc_hwconfig.o \
gt/uc/intel_guc_log.o \
gt/uc/intel_guc_log_debugfs.o \
gt/uc/intel_guc_rc.o \
@ -197,6 +204,9 @@ i915-y += gt/uc/intel_uc.o \
gt/uc/intel_huc_debugfs.o \
gt/uc/intel_huc_fw.o
# graphics system controller (GSC) support
i915-y += gt/intel_gsc.o
# modesetting core code
i915-y += \
display/hsw_ips.o \


@ -300,5 +300,5 @@ void intel_dpt_destroy(struct i915_address_space *vm)
{
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
i915_vm_close(&dpt->vm);
i915_vm_put(&dpt->vm);
}


@ -2029,7 +2029,7 @@ intel_user_framebuffer_create(struct drm_device *dev,
/* object is backed with LMEM for discrete */
i915 = to_i915(obj->base.dev);
if (HAS_LMEM(i915) && !i915_gem_object_can_migrate(obj, INTEL_REGION_LMEM)) {
if (HAS_LMEM(i915) && !i915_gem_object_can_migrate(obj, INTEL_REGION_LMEM_0)) {
/* object is "remote", not in local memory */
i915_gem_object_put(obj);
return ERR_PTR(-EREMOTE);


@ -140,7 +140,7 @@ retry:
if (!ret && phys_cursor)
ret = i915_gem_object_attach_phys(obj, alignment);
else if (!ret && HAS_LMEM(dev_priv))
ret = i915_gem_object_migrate(obj, &ww, INTEL_REGION_LMEM);
ret = i915_gem_object_migrate(obj, &ww, INTEL_REGION_LMEM_0);
/* TODO: Do we need to sync when migration becomes async? */
if (!ret)
ret = i915_gem_object_pin_pages(obj);


@ -279,7 +279,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
/* Our framebuffer is the entirety of fbdev's system memory */
info->fix.smem_start =
(unsigned long)(ggtt->gmadr.start + vma->node.start);
info->fix.smem_len = vma->node.size;
info->fix.smem_len = vma->size;
}
vaddr = i915_vma_pin_iomap(vma);
@ -290,7 +290,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
goto out_unpin;
}
info->screen_base = vaddr;
info->screen_size = vma->node.size;
info->screen_size = vma->size;
drm_fb_helper_fill_info(info, &ifbdev->helper, sizes);


@ -3,6 +3,7 @@
* Copyright © 2021 Intel Corporation
*/
#include "gem/i915_gem_region.h"
#include "i915_drv.h"
#include "intel_atomic_plane.h"
#include "intel_display.h"
@ -46,16 +47,55 @@ static struct i915_vma *
initial_plane_vma(struct drm_i915_private *i915,
struct intel_initial_plane_config *plane_config)
{
struct intel_memory_region *mem = i915->mm.stolen_region;
struct intel_memory_region *mem;
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
resource_size_t phys_base;
u32 base, size;
u64 pinctl;
if (!mem || plane_config->size == 0)
if (plane_config->size == 0)
return NULL;
base = round_down(plane_config->base, I915_GTT_MIN_ALIGNMENT);
if (IS_DGFX(i915)) {
gen8_pte_t __iomem *gte = to_gt(i915)->ggtt->gsm;
gen8_pte_t pte;
gte += base / I915_GTT_PAGE_SIZE;
pte = ioread64(gte);
if (!(pte & GEN12_GGTT_PTE_LM)) {
drm_err(&i915->drm,
"Initial plane programming missing PTE_LM bit\n");
return NULL;
}
phys_base = pte & I915_GTT_PAGE_MASK;
mem = i915->mm.regions[INTEL_REGION_LMEM_0];
/*
* We don't currently expect this to ever be placed in the
* stolen portion.
*/
if (phys_base >= resource_size(&mem->region)) {
drm_err(&i915->drm,
"Initial plane programming using invalid range, phys_base=%pa\n",
&phys_base);
return NULL;
}
drm_dbg(&i915->drm,
"Using phys_base=%pa, based on initial plane programming\n",
&phys_base);
} else {
phys_base = base;
mem = i915->mm.stolen_region;
}
if (!mem)
return NULL;
base = round_down(plane_config->base,
I915_GTT_MIN_ALIGNMENT);
size = round_up(plane_config->base + plane_config->size,
mem->min_page_size);
size -= base;
@ -66,10 +106,11 @@ initial_plane_vma(struct drm_i915_private *i915,
* features.
*/
if (IS_ENABLED(CONFIG_FRAMEBUFFER_CONSOLE) &&
mem == i915->mm.stolen_region &&
size * 2 > i915->stolen_usable_size)
return NULL;
obj = i915_gem_object_create_stolen_for_preallocated(i915, base, size);
obj = i915_gem_object_create_region_at(mem, phys_base, size, 0);
if (IS_ERR(obj))
return NULL;
@ -99,7 +140,10 @@ initial_plane_vma(struct drm_i915_private *i915,
if (IS_ERR(vma))
goto err_obj;
if (i915_ggtt_pin(vma, NULL, 0, PIN_MAPPABLE | PIN_OFFSET_FIXED | base))
pinctl = PIN_GLOBAL | PIN_OFFSET_FIXED | base;
if (HAS_GMCH(i915))
pinctl |= PIN_MAPPABLE;
if (i915_vma_pin(vma, 0, 0, pinctl))
goto err_obj;
if (i915_gem_object_is_tiled(obj) &&


@ -1031,23 +1031,44 @@ static void free_engines_rcu(struct rcu_head *rcu)
free_engines(engines);
}
static void accumulate_runtime(struct i915_drm_client *client,
struct i915_gem_engines *engines)
{
struct i915_gem_engines_iter it;
struct intel_context *ce;
if (!client)
return;
/* Transfer accumulated runtime to the parent GEM context. */
for_each_gem_engine(ce, engines, it) {
unsigned int class = ce->engine->uabi_class;
GEM_BUG_ON(class >= ARRAY_SIZE(client->past_runtime));
atomic64_add(intel_context_get_total_runtime_ns(ce),
&client->past_runtime[class]);
}
}
static int
engines_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
{
struct i915_gem_engines *engines =
container_of(fence, typeof(*engines), fence);
struct i915_gem_context *ctx = engines->ctx;
switch (state) {
case FENCE_COMPLETE:
if (!list_empty(&engines->link)) {
struct i915_gem_context *ctx = engines->ctx;
unsigned long flags;
spin_lock_irqsave(&ctx->stale.lock, flags);
list_del(&engines->link);
spin_unlock_irqrestore(&ctx->stale.lock, flags);
}
i915_gem_context_put(engines->ctx);
accumulate_runtime(ctx->client, engines);
i915_gem_context_put(ctx);
break;
case FENCE_FREE:
@ -1257,6 +1278,9 @@ static void i915_gem_context_release_work(struct work_struct *work)
if (ctx->pxp_wakeref)
intel_runtime_pm_put(&ctx->i915->runtime_pm, ctx->pxp_wakeref);
if (ctx->client)
i915_drm_client_put(ctx->client);
mutex_destroy(&ctx->engines_mutex);
mutex_destroy(&ctx->lut_mutex);
@ -1467,7 +1491,7 @@ static void set_closed_name(struct i915_gem_context *ctx)
static void context_close(struct i915_gem_context *ctx)
{
struct i915_address_space *vm;
struct i915_drm_client *client;
/* Flush any concurrent set_engines() */
mutex_lock(&ctx->engines_mutex);
@ -1480,19 +1504,6 @@ static void context_close(struct i915_gem_context *ctx)
set_closed_name(ctx);
vm = ctx->vm;
if (vm) {
/* i915_vm_close drops the final reference, which is a bit too
* early and could result in surprises with concurrent
* operations racing with thist ctx close. Keep a full reference
* until the end.
*/
i915_vm_get(vm);
i915_vm_close(vm);
}
ctx->file_priv = ERR_PTR(-EBADF);
/*
* The LUT uses the VMA as a backpointer to unref the object,
* so we need to clear the LUT before we close all the VMA (inside
@ -1500,10 +1511,19 @@ static void context_close(struct i915_gem_context *ctx)
*/
lut_close(ctx);
ctx->file_priv = ERR_PTR(-EBADF);
spin_lock(&ctx->i915->gem.contexts.lock);
list_del(&ctx->link);
spin_unlock(&ctx->i915->gem.contexts.lock);
client = ctx->client;
if (client) {
spin_lock(&client->ctx_lock);
list_del_rcu(&ctx->client_link);
spin_unlock(&client->ctx_lock);
}
mutex_unlock(&ctx->mutex);
/*
@ -1598,12 +1618,8 @@ i915_gem_create_context(struct drm_i915_private *i915,
}
vm = &ppgtt->vm;
}
if (vm) {
ctx->vm = i915_vm_open(vm);
/* i915_vm_open() takes a reference */
i915_vm_put(vm);
}
if (vm)
ctx->vm = vm;
mutex_init(&ctx->engines_mutex);
if (pc->num_user_engines >= 0) {
@ -1653,7 +1669,7 @@ err_engines:
free_engines(e);
err_vm:
if (ctx->vm)
i915_vm_close(ctx->vm);
i915_vm_put(ctx->vm);
err_ctx:
kfree(ctx);
return ERR_PTR(err);
@ -1680,6 +1696,8 @@ static void gem_context_register(struct i915_gem_context *ctx,
ctx->file_priv = fpriv;
ctx->pid = get_task_pid(current, PIDTYPE_PID);
ctx->client = i915_drm_client_get(fpriv->client);
snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
current->comm, pid_nr(ctx->pid));
@ -1687,6 +1705,10 @@ static void gem_context_register(struct i915_gem_context *ctx,
old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
WARN_ON(old);
spin_lock(&ctx->client->ctx_lock);
list_add_tail_rcu(&ctx->client_link, &ctx->client->ctx_list);
spin_unlock(&ctx->client->ctx_lock);
spin_lock(&i915->gem.contexts.lock);
list_add_tail(&ctx->link, &i915->gem.contexts.list);
spin_unlock(&i915->gem.contexts.lock);
@ -1837,7 +1859,7 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
if (err)
return err;
i915_vm_open(vm);
i915_vm_get(vm);
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->value = id;


@ -293,6 +293,12 @@ struct i915_gem_context {
/** @link: place with &drm_i915_private.context_list */
struct list_head link;
/** @client: struct i915_drm_client */
struct i915_drm_client *client;
/** @client_link: for linking onto &i915_drm_client.ctx_list */
struct list_head client_link;
/**
* @ref: reference count
*


@ -123,7 +123,7 @@ __i915_gem_object_create_user_ext(struct drm_i915_private *i915, u64 size,
*/
flags = I915_BO_ALLOC_USER;
ret = mr->ops->init_object(mr, obj, size, 0, flags);
ret = mr->ops->init_object(mr, obj, I915_BO_INVALID_OFFSET, size, 0, flags);
if (ret)
goto object_free;


@ -66,15 +66,6 @@ err:
return ERR_PTR(ret);
}
static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
struct sg_table *sg,
enum dma_data_direction dir)
{
dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);
}
static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf,
struct iosys_map *map)
{
@ -102,11 +93,15 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf,
static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
{
struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
struct drm_i915_private *i915 = to_i915(obj->base.dev);
int ret;
if (obj->base.size < vma->vm_end - vma->vm_start)
return -EINVAL;
if (HAS_LMEM(i915))
return drm_gem_prime_mmap(&obj->base, vma);
if (!obj->base.filp)
return -ENODEV;
@ -209,7 +204,7 @@ static const struct dma_buf_ops i915_dmabuf_ops = {
.attach = i915_gem_dmabuf_attach,
.detach = i915_gem_dmabuf_detach,
.map_dma_buf = i915_gem_map_dma_buf,
.unmap_dma_buf = i915_gem_unmap_dma_buf,
.unmap_dma_buf = drm_gem_unmap_dma_buf,
.release = drm_gem_dmabuf_release,
.mmap = i915_gem_dmabuf_mmap,
.vmap = i915_gem_dmabuf_vmap,


@ -1321,10 +1321,8 @@ static void *reloc_vaddr(struct i915_vma *vma,
static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
{
if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
if (flushes & CLFLUSH_BEFORE) {
clflushopt(addr);
mb();
}
if (flushes & CLFLUSH_BEFORE)
drm_clflush_virt_range(addr, sizeof(*addr));
*addr = value;
@ -1336,7 +1334,7 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
* to ensure ordering of clflush wrt to the system.
*/
if (flushes & CLFLUSH_AFTER)
clflushopt(addr);
drm_clflush_virt_range(addr, sizeof(*addr));
} else
*addr = value;
}
@ -2690,6 +2688,11 @@ eb_select_engine(struct i915_execbuffer *eb)
if (err)
goto err;
if (!i915_vm_tryget(ce->vm)) {
err = -ENOENT;
goto err;
}
eb->context = ce;
eb->gt = ce->engine->gt;
@ -2713,6 +2716,7 @@ eb_put_engine(struct i915_execbuffer *eb)
{
struct intel_context *child;
i915_vm_put(eb->context->vm);
intel_gt_pm_put(eb->gt);
for_each_child(eb->context, child)
intel_context_put(child);


@ -102,7 +102,7 @@ __i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
resource_size_t page_size,
unsigned int flags)
{
return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM_0],
size, page_size, flags);
}
@ -137,6 +137,6 @@ i915_gem_object_create_lmem(struct drm_i915_private *i915,
resource_size_t size,
unsigned int flags)
{
return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM_0],
size, 0, flags);
}


@ -70,7 +70,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
* mmap ioctl is disallowed for all discrete platforms,
* and for all platforms with GRAPHICS_VER > 12.
*/
if (IS_DGFX(i915) || GRAPHICS_VER(i915) > 12)
if (IS_DGFX(i915) || GRAPHICS_VER_FULL(i915) > IP_VER(12, 0))
return -EOPNOTSUPP;
if (args->flags & ~(I915_MMAP_WC))


@ -606,6 +606,9 @@ bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
if (!mr)
return false;
if (!IS_ALIGNED(obj->base.size, mr->min_page_size))
return false;
if (obj->mm.region == mr)
return true;


@ -631,6 +631,8 @@ struct drm_i915_gem_object {
struct drm_mm_node *stolen;
resource_size_t bo_offset;
unsigned long scratch;
u64 encode;


@ -29,11 +29,12 @@ void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj)
mutex_unlock(&mem->objects.lock);
}
struct drm_i915_gem_object *
i915_gem_object_create_region(struct intel_memory_region *mem,
resource_size_t size,
resource_size_t page_size,
unsigned int flags)
static struct drm_i915_gem_object *
__i915_gem_object_create_region(struct intel_memory_region *mem,
resource_size_t offset,
resource_size_t size,
resource_size_t page_size,
unsigned int flags)
{
struct drm_i915_gem_object *obj;
resource_size_t default_page_size;
@ -64,6 +65,9 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
size = round_up(size, default_page_size);
if (default_page_size == size)
flags |= I915_BO_ALLOC_CONTIGUOUS;
GEM_BUG_ON(!size);
GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_MIN_ALIGNMENT));
@ -85,7 +89,7 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
if (default_page_size < mem->min_page_size)
flags |= I915_BO_ALLOC_PM_EARLY;
err = mem->ops->init_object(mem, obj, size, page_size, flags);
err = mem->ops->init_object(mem, obj, offset, size, page_size, flags);
if (err)
goto err_object_free;
@ -97,6 +101,40 @@ err_object_free:
return ERR_PTR(err);
}
struct drm_i915_gem_object *
i915_gem_object_create_region(struct intel_memory_region *mem,
resource_size_t size,
resource_size_t page_size,
unsigned int flags)
{
return __i915_gem_object_create_region(mem, I915_BO_INVALID_OFFSET,
size, page_size, flags);
}
struct drm_i915_gem_object *
i915_gem_object_create_region_at(struct intel_memory_region *mem,
resource_size_t offset,
resource_size_t size,
unsigned int flags)
{
GEM_BUG_ON(offset == I915_BO_INVALID_OFFSET);
if (GEM_WARN_ON(!IS_ALIGNED(size, mem->min_page_size)) ||
GEM_WARN_ON(!IS_ALIGNED(offset, mem->min_page_size)))
return ERR_PTR(-EINVAL);
if (range_overflows(offset, size, resource_size(&mem->region)))
return ERR_PTR(-EINVAL);
if (!(flags & I915_BO_ALLOC_GPU_ONLY) &&
offset + size > mem->io_size &&
!i915_ggtt_has_aperture(to_gt(mem->i915)->ggtt))
return ERR_PTR(-ENOSPC);
return __i915_gem_object_create_region(mem, offset, size, 0,
flags | I915_BO_ALLOC_CONTIGUOUS);
}
/**
* i915_gem_process_region - Iterate over all objects of a region using ops
* to process and optionally skip objects


@ -14,6 +14,8 @@ struct sg_table;
struct i915_gem_apply_to_region;
#define I915_BO_INVALID_OFFSET ((resource_size_t)-1)
/**
* struct i915_gem_apply_to_region_ops - ops to use when iterating over all
* region objects.
@ -56,6 +58,11 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
resource_size_t size,
resource_size_t page_size,
unsigned int flags);
struct drm_i915_gem_object *
i915_gem_object_create_region_at(struct intel_memory_region *mem,
resource_size_t offset,
resource_size_t size,
unsigned int flags);
int i915_gem_process_region(struct intel_memory_region *mr,
struct i915_gem_apply_to_region *apply);


@ -553,6 +553,7 @@ static int __create_shmem(struct drm_i915_private *i915,
static int shmem_object_init(struct intel_memory_region *mem,
struct drm_i915_gem_object *obj,
resource_size_t offset,
resource_size_t size,
resource_size_t page_size,
unsigned int flags)


@ -12,6 +12,8 @@
#include "gem/i915_gem_lmem.h"
#include "gem/i915_gem_region.h"
#include "gt/intel_gt.h"
#include "gt/intel_region_lmem.h"
#include "i915_drv.h"
#include "i915_gem_stolen.h"
#include "i915_reg.h"
@ -493,7 +495,7 @@ static int i915_gem_init_stolen(struct intel_memory_region *mem)
/* Exclude the reserved region from driver use */
mem->region.end = reserved_base - 1;
mem->io_size = resource_size(&mem->region);
mem->io_size = min(mem->io_size, resource_size(&mem->region));
/* It is possible for the reserved area to end before the end of stolen
* memory, so just consider the start. */
@ -680,6 +682,7 @@ static int __i915_gem_object_create_stolen(struct intel_memory_region *mem,
static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
struct drm_i915_gem_object *obj,
resource_size_t offset,
resource_size_t size,
resource_size_t page_size,
unsigned int flags)
@ -694,12 +697,32 @@ static int _i915_gem_object_stolen_init(struct intel_memory_region *mem,
if (size == 0)
return -EINVAL;
/*
* With discrete devices, where we lack a mappable aperture there is no
* possible way to ever access this memory on the CPU side.
*/
if (mem->type == INTEL_MEMORY_STOLEN_LOCAL && !mem->io_size &&
!(flags & I915_BO_ALLOC_GPU_ONLY))
return -ENOSPC;
stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
if (!stolen)
return -ENOMEM;
ret = i915_gem_stolen_insert_node(i915, stolen, size,
mem->min_page_size);
if (offset != I915_BO_INVALID_OFFSET) {
drm_dbg(&i915->drm,
"creating preallocated stolen object: stolen_offset=%pa, size=%pa\n",
&offset, &size);
stolen->start = offset;
stolen->size = size;
mutex_lock(&i915->mm.stolen_lock);
ret = drm_mm_reserve_node(&i915->mm.stolen, stolen);
mutex_unlock(&i915->mm.stolen_lock);
} else {
ret = i915_gem_stolen_insert_node(i915, stolen, size,
mem->min_page_size);
}
if (ret)
goto err_free;
@ -751,11 +774,6 @@ static int init_stolen_lmem(struct intel_memory_region *mem)
if (GEM_WARN_ON(resource_size(&mem->region) == 0))
return -ENODEV;
if (!io_mapping_init_wc(&mem->iomap,
mem->io_start,
mem->io_size))
return -EIO;
/*
* TODO: For stolen lmem we mostly just care about populating the dsm
* related bits and setting up the drm_mm allocator for the range.
@ -763,18 +781,26 @@ static int init_stolen_lmem(struct intel_memory_region *mem)
*/
err = i915_gem_init_stolen(mem);
if (err)
goto err_fini;
return err;
if (mem->io_size && !io_mapping_init_wc(&mem->iomap,
mem->io_start,
mem->io_size)) {
err = -EIO;
goto err_cleanup;
}
return 0;
err_fini:
io_mapping_fini(&mem->iomap);
err_cleanup:
i915_gem_cleanup_stolen(mem->i915);
return err;
}
static int release_stolen_lmem(struct intel_memory_region *mem)
{
io_mapping_fini(&mem->iomap);
if (mem->io_size)
io_mapping_fini(&mem->iomap);
i915_gem_cleanup_stolen(mem->i915);
return 0;
}
@ -791,25 +817,43 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
{
struct intel_uncore *uncore = &i915->uncore;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
resource_size_t dsm_size, dsm_base, lmem_size;
struct intel_memory_region *mem;
resource_size_t io_start, io_size;
resource_size_t min_page_size;
resource_size_t io_start;
resource_size_t lmem_size;
u64 lmem_base;
lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
if (GEM_WARN_ON(lmem_base >= pci_resource_len(pdev, 2)))
if (WARN_ON_ONCE(instance))
return ERR_PTR(-ENODEV);
lmem_size = pci_resource_len(pdev, 2) - lmem_base;
io_start = pci_resource_start(pdev, 2) + lmem_base;
/* Use DSM base address instead for stolen memory */
dsm_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
if (IS_DG1(uncore->i915)) {
lmem_size = pci_resource_len(pdev, 2);
if (WARN_ON(lmem_size < dsm_base))
return ERR_PTR(-ENODEV);
} else {
resource_size_t lmem_range;
lmem_range = intel_gt_read_register(&i915->gt0, XEHPSDV_TILE0_ADDR_RANGE) & 0xFFFF;
lmem_size = lmem_range >> XEHPSDV_TILE_LMEM_RANGE_SHIFT;
lmem_size *= SZ_1G;
}
dsm_size = lmem_size - dsm_base;
if (pci_resource_len(pdev, 2) < lmem_size) {
io_start = 0;
io_size = 0;
} else {
io_start = pci_resource_start(pdev, 2) + dsm_base;
io_size = dsm_size;
}
min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K :
I915_GTT_PAGE_SIZE_4K;
mem = intel_memory_region_create(i915, lmem_base, lmem_size,
mem = intel_memory_region_create(i915, dsm_base, dsm_size,
min_page_size,
io_start, lmem_size,
io_start, io_size,
type, instance,
&i915_region_stolen_lmem_ops);
if (IS_ERR(mem))
@ -823,6 +867,7 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
drm_dbg(&i915->drm, "Stolen Local memory IO start: %pa\n",
&mem->io_start);
drm_dbg(&i915->drm, "Stolen Local DSM base: %pa\n", &dsm_base);
intel_memory_region_set_name(mem, "stolen-local");
@ -851,63 +896,6 @@ i915_gem_stolen_smem_setup(struct drm_i915_private *i915, u16 type,
return mem;
}
struct drm_i915_gem_object *
i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
resource_size_t stolen_offset,
resource_size_t size)
{
struct intel_memory_region *mem = i915->mm.stolen_region;
struct drm_i915_gem_object *obj;
struct drm_mm_node *stolen;
int ret;
if (!drm_mm_initialized(&i915->mm.stolen))
return ERR_PTR(-ENODEV);
drm_dbg(&i915->drm,
"creating preallocated stolen object: stolen_offset=%pa, size=%pa\n",
&stolen_offset, &size);
/* KISS and expect everything to be page-aligned */
if (GEM_WARN_ON(size == 0) ||
GEM_WARN_ON(!IS_ALIGNED(size, mem->min_page_size)) ||
GEM_WARN_ON(!IS_ALIGNED(stolen_offset, mem->min_page_size)))
return ERR_PTR(-EINVAL);
stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
if (!stolen)
return ERR_PTR(-ENOMEM);
stolen->start = stolen_offset;
stolen->size = size;
mutex_lock(&i915->mm.stolen_lock);
ret = drm_mm_reserve_node(&i915->mm.stolen, stolen);
mutex_unlock(&i915->mm.stolen_lock);
if (ret)
goto err_free;
obj = i915_gem_object_alloc();
if (!obj) {
ret = -ENOMEM;
goto err_stolen;
}
ret = __i915_gem_object_create_stolen(mem, obj, stolen);
if (ret)
goto err_object_free;
i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
return obj;
err_object_free:
i915_gem_object_free(obj);
err_stolen:
i915_gem_stolen_remove_node(i915, stolen);
err_free:
kfree(stolen);
return ERR_PTR(ret);
}
bool i915_gem_object_is_stolen(const struct drm_i915_gem_object *obj)
{
return obj->ops == &i915_gem_object_stolen_ops;


@ -31,10 +31,6 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
struct drm_i915_gem_object *
i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
resource_size_t size);
struct drm_i915_gem_object *
i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv,
resource_size_t stolen_offset,
resource_size_t size);
bool i915_gem_object_is_stolen(const struct drm_i915_gem_object *obj);


@ -20,6 +20,7 @@
#include "gem/i915_gem_ttm.h"
#include "gem/i915_gem_ttm_move.h"
#include "gem/i915_gem_ttm_pm.h"
#include "gt/intel_gpu_commands.h"
#define I915_TTM_PRIO_PURGE 0
#define I915_TTM_PRIO_NO_PAGES 1
@ -126,14 +127,22 @@ i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj)
static void
i915_ttm_place_from_region(const struct intel_memory_region *mr,
struct ttm_place *place,
resource_size_t offset,
resource_size_t size,
unsigned int flags)
{
memset(place, 0, sizeof(*place));
place->mem_type = intel_region_to_ttm_type(mr);
if (mr->type == INTEL_MEMORY_SYSTEM)
return;
if (flags & I915_BO_ALLOC_CONTIGUOUS)
place->flags |= TTM_PL_FLAG_CONTIGUOUS;
if (mr->io_size && mr->io_size < mr->total) {
if (offset != I915_BO_INVALID_OFFSET) {
place->fpfn = offset >> PAGE_SHIFT;
place->lpfn = place->fpfn + (size >> PAGE_SHIFT);
} else if (mr->io_size && mr->io_size < mr->total) {
if (flags & I915_BO_ALLOC_GPU_ONLY) {
place->flags |= TTM_PL_FLAG_TOPDOWN;
} else {
@ -155,12 +164,14 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,
placement->num_placement = 1;
i915_ttm_place_from_region(num_allowed ? obj->mm.placements[0] :
obj->mm.region, requested, flags);
obj->mm.region, requested, obj->bo_offset,
obj->base.size, flags);
/* Cache this on object? */
placement->num_busy_placement = num_allowed;
for (i = 0; i < placement->num_busy_placement; ++i)
i915_ttm_place_from_region(obj->mm.placements[i], busy + i, flags);
i915_ttm_place_from_region(obj->mm.placements[i], busy + i,
obj->bo_offset, obj->base.size, flags);
if (num_allowed == 0) {
*busy = *requested;
@ -255,12 +266,33 @@ static const struct i915_refct_sgt_ops tt_rsgt_ops = {
.release = i915_ttm_tt_release
};
static inline bool
i915_gem_object_needs_ccs_pages(struct drm_i915_gem_object *obj)
{
bool lmem_placement = false;
int i;
for (i = 0; i < obj->mm.n_placements; i++) {
/* Compression is not allowed for the objects with smem placement */
if (obj->mm.placements[i]->type == INTEL_MEMORY_SYSTEM)
return false;
if (!lmem_placement &&
obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
lmem_placement = true;
}
return lmem_placement;
}
static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
uint32_t page_flags)
{
struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
bdev);
struct ttm_resource_manager *man =
ttm_manager_type(bo->bdev, bo->resource->mem_type);
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
unsigned long ccs_pages = 0;
enum ttm_caching caching;
struct i915_ttm_tt *i915_tt;
int ret;
@ -283,7 +315,12 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
i915_tt->is_shmem = true;
}
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, 0);
if (HAS_FLAT_CCS(i915) && i915_gem_object_needs_ccs_pages(obj))
ccs_pages = DIV_ROUND_UP(DIV_ROUND_UP(bo->base.size,
NUM_BYTES_PER_CCS_BYTE),
PAGE_SIZE);
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching, ccs_pages);
if (ret)
goto err_free;
@ -763,6 +800,7 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
i915_sg_dma_sizes(rsgt->table.sgl));
}
GEM_BUG_ON(bo->ttm && ((obj->base.size >> PAGE_SHIFT) < bo->ttm->num_pages));
i915_ttm_adjust_lru(obj);
return ret;
}
@ -802,7 +840,8 @@ static int __i915_ttm_migrate(struct drm_i915_gem_object *obj,
struct ttm_placement placement;
int ret;
i915_ttm_place_from_region(mr, &requested, flags);
i915_ttm_place_from_region(mr, &requested, obj->bo_offset,
obj->base.size, flags);
placement.num_placement = 1;
placement.num_busy_placement = 1;
placement.placement = &requested;
@ -1142,6 +1181,7 @@ void i915_ttm_bo_destroy(struct ttm_buffer_object *bo)
*/
int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
struct drm_i915_gem_object *obj,
resource_size_t offset,
resource_size_t size,
resource_size_t page_size,
unsigned int flags)
@ -1158,6 +1198,8 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
drm_gem_private_object_init(&i915->drm, &obj->base, size);
i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
obj->bo_offset = offset;
/* Don't put on a region list until we're either locked or fully initialized. */
obj->mm.region = mem;
INIT_LIST_HEAD(&obj->mm.region_link);


@ -45,6 +45,7 @@ i915_ttm_to_gem(struct ttm_buffer_object *bo)
int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
struct drm_i915_gem_object *obj,
resource_size_t offset,
resource_size_t size,
resource_size_t page_size,
unsigned int flags);


@ -88,7 +88,7 @@ out:
static int igt_dmabuf_import_same_driver_lmem(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_memory_region *lmem = i915->mm.regions[INTEL_REGION_LMEM];
struct intel_memory_region *lmem = i915->mm.regions[INTEL_REGION_LMEM_0];
struct drm_i915_gem_object *obj;
struct drm_gem_object *import;
struct dma_buf *dmabuf;
@ -253,10 +253,10 @@ static int igt_dmabuf_import_same_driver_lmem_smem(void *arg)
struct drm_i915_private *i915 = arg;
struct intel_memory_region *regions[2];
if (!i915->mm.regions[INTEL_REGION_LMEM])
if (!i915->mm.regions[INTEL_REGION_LMEM_0])
return 0;
regions[0] = i915->mm.regions[INTEL_REGION_LMEM];
regions[0] = i915->mm.regions[INTEL_REGION_LMEM_0];
regions[1] = i915->mm.regions[INTEL_REGION_SMEM];
return igt_dmabuf_import_same_driver(i915, regions, 2);
}


@ -47,14 +47,16 @@ static int igt_create_migrate(struct intel_gt *gt, enum intel_region_id src,
{
struct drm_i915_private *i915 = gt->i915;
struct intel_memory_region *src_mr = i915->mm.regions[src];
struct intel_memory_region *dst_mr = i915->mm.regions[dst];
struct drm_i915_gem_object *obj;
struct i915_gem_ww_ctx ww;
int err = 0;
GEM_BUG_ON(!src_mr);
GEM_BUG_ON(!dst_mr);
/* Switch object backing-store on create */
obj = i915_gem_object_create_region(src_mr, PAGE_SIZE, 0, 0);
obj = i915_gem_object_create_region(src_mr, dst_mr->min_page_size, 0, 0);
if (IS_ERR(obj))
return PTR_ERR(obj);
@ -92,17 +94,17 @@ static int igt_create_migrate(struct intel_gt *gt, enum intel_region_id src,
static int igt_smem_create_migrate(void *arg)
{
return igt_create_migrate(arg, INTEL_REGION_LMEM, INTEL_REGION_SMEM);
return igt_create_migrate(arg, INTEL_REGION_LMEM_0, INTEL_REGION_SMEM);
}
static int igt_lmem_create_migrate(void *arg)
{
return igt_create_migrate(arg, INTEL_REGION_SMEM, INTEL_REGION_LMEM);
return igt_create_migrate(arg, INTEL_REGION_SMEM, INTEL_REGION_LMEM_0);
}
static int igt_same_create_migrate(void *arg)
{
return igt_create_migrate(arg, INTEL_REGION_LMEM, INTEL_REGION_LMEM);
return igt_create_migrate(arg, INTEL_REGION_LMEM_0, INTEL_REGION_LMEM_0);
}
static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
@ -152,7 +154,7 @@ static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
}
} else {
err = i915_gem_object_migrate(obj, ww, INTEL_REGION_LMEM);
err = i915_gem_object_migrate(obj, ww, INTEL_REGION_LMEM_0);
if (err) {
pr_err("Object failed migration to lmem\n");
if (err)


@ -42,8 +42,7 @@ mock_context(struct drm_i915_private *i915,
if (!ppgtt)
goto err_free;
ctx->vm = i915_vm_open(&ppgtt->vm);
i915_vm_put(&ppgtt->vm);
ctx->vm = &ppgtt->vm;
}
mutex_init(&ctx->engines_mutex);
@ -59,7 +58,7 @@ mock_context(struct drm_i915_private *i915,
err_vm:
if (ctx->vm)
i915_vm_close(ctx->vm);
i915_vm_put(ctx->vm);
err_free:
kfree(ctx);
return NULL;


@ -322,7 +322,7 @@ int gen6_ppgtt_pin(struct i915_ppgtt *base, struct i915_gem_ww_ctx *ww)
struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(base);
int err;
GEM_BUG_ON(!atomic_read(&ppgtt->base.vm.open));
GEM_BUG_ON(!kref_read(&ppgtt->base.vm.ref));
/*
* Workaround the limited maximum vma->pin_count and the aliasing_ppgtt


@ -6,7 +6,6 @@
#include "gen8_engine_cs.h"
#include "i915_drv.h"
#include "intel_gpu_commands.h"
#include "intel_gt_regs.h"
#include "intel_lrc.h"
#include "intel_ring.h"
@ -165,33 +164,9 @@ static u32 preparser_disable(bool state)
return MI_ARB_CHECK | 1 << 8 | state;
}
static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
u32 *gen12_emit_aux_table_inv(u32 *cs, const i915_reg_t inv_reg)
{
static const i915_reg_t vd[] = {
GEN12_VD0_AUX_NV,
GEN12_VD1_AUX_NV,
GEN12_VD2_AUX_NV,
GEN12_VD3_AUX_NV,
};
static const i915_reg_t ve[] = {
GEN12_VE0_AUX_NV,
GEN12_VE1_AUX_NV,
};
if (engine->class == VIDEO_DECODE_CLASS)
return vd[engine->instance];
if (engine->class == VIDEO_ENHANCEMENT_CLASS)
return ve[engine->instance];
GEM_BUG_ON("unknown aux_inv reg\n");
return INVALID_MMIO_REG;
}
static u32 *gen12_emit_aux_table_inv(const i915_reg_t inv_reg, u32 *cs)
{
*cs++ = MI_LOAD_REGISTER_IMM(1);
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg);
*cs++ = AUX_INV;
*cs++ = MI_NOOP;
@ -236,7 +211,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE) {
u32 flags = 0;
u32 *cs;
u32 *cs, count;
flags |= PIPE_CONTROL_COMMAND_CACHE_INVALIDATE;
flags |= PIPE_CONTROL_TLB_INVALIDATE;
@ -254,7 +229,12 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_FLAGS;
cs = intel_ring_begin(rq, 8 + 4);
if (!HAS_FLAT_CCS(rq->engine->i915))
count = 8 + 4;
else
count = 8;
cs = intel_ring_begin(rq, count);
if (IS_ERR(cs))
return PTR_ERR(cs);
@ -267,8 +247,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
/* hsdes: 1809175790 */
cs = gen12_emit_aux_table_inv(GEN12_GFX_CCS_AUX_NV, cs);
if (!HAS_FLAT_CCS(rq->engine->i915)) {
/* hsdes: 1809175790 */
cs = gen12_emit_aux_table_inv(cs, GEN12_GFX_CCS_AUX_NV);
}
*cs++ = preparser_disable(false);
intel_ring_advance(rq, cs);
@ -283,12 +265,17 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
u32 cmd, *cs;
cmd = 4;
if (mode & EMIT_INVALIDATE)
if (mode & EMIT_INVALIDATE) {
cmd += 2;
if (mode & EMIT_INVALIDATE)
aux_inv = rq->engine->mask & ~BIT(BCS0);
if (aux_inv)
cmd += 2 * hweight32(aux_inv) + 2;
if (!HAS_FLAT_CCS(rq->engine->i915) &&
(rq->engine->class == VIDEO_DECODE_CLASS ||
rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
aux_inv = rq->engine->mask & ~BIT(BCS0);
if (aux_inv)
cmd += 4;
}
}
cs = intel_ring_begin(rq, cmd);
if (IS_ERR(cs))
@ -319,15 +306,10 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
*cs++ = 0; /* value */
if (aux_inv) { /* hsdes: 1809175790 */
struct intel_engine_cs *engine;
unsigned int tmp;
*cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
*cs++ = AUX_INV;
}
*cs++ = MI_NOOP;
if (rq->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(cs, GEN12_VD0_AUX_NV);
else
cs = gen12_emit_aux_table_inv(cs, GEN12_VE0_AUX_NV);
}
if (mode & EMIT_INVALIDATE)
@ -601,6 +583,43 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request *rq, u32 *cs)
return cs;
}
/* Wa_14014475959:dg2 */
#define CCS_SEMAPHORE_PPHWSP_OFFSET 0x540
static u32 ccs_semaphore_offset(struct i915_request *rq)
{
return i915_ggtt_offset(rq->context->state) +
(LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET;
}
/* Wa_14014475959:dg2 */
static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)
{
int i;
*cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | MI_ATOMIC_CS_STALL |
MI_ATOMIC_MOVE;
*cs++ = ccs_semaphore_offset(rq);
*cs++ = 0;
*cs++ = 1;
/*
* When MI_ATOMIC_INLINE_DATA set this command must be 11 DW + (1 NOP)
* to align. 4 DWs above + 8 filler DWs here.
*/
for (i = 0; i < 8; ++i)
*cs++ = 0;
*cs++ = MI_SEMAPHORE_WAIT |
MI_SEMAPHORE_GLOBAL_GTT |
MI_SEMAPHORE_POLL |
MI_SEMAPHORE_SAD_EQ_SDD;
*cs++ = 0;
*cs++ = ccs_semaphore_offset(rq);
*cs++ = 0;
return cs;
}
static __always_inline u32*
gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
{
@ -611,6 +630,10 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
!intel_uc_uses_guc_submission(&rq->engine->gt->uc))
cs = gen12_emit_preempt_busywait(rq, cs);
/* Wa_14014475959:dg2 */
if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine))
cs = ccs_emit_wa_busywait(rq, cs);
rq->tail = intel_ring_offset(rq, cs);
assert_ring_tail_valid(rq->ring, rq->tail);


@ -10,7 +10,7 @@
#include <linux/types.h>
#include "i915_gem.h" /* GEM_BUG_ON */
#include "intel_gt_regs.h"
#include "intel_gpu_commands.h"
struct i915_request;
@ -38,6 +38,8 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
u32 *gen12_emit_aux_table_inv(u32 *cs, const i915_reg_t inv_reg);
static inline u32 *
__gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
{


@ -454,11 +454,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
pd = pdp->entry[gen8_pd_index(idx, 2)];
}
clflush_cache_range(vaddr, PAGE_SIZE);
drm_clflush_virt_range(vaddr, PAGE_SIZE);
vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
}
} while (1);
clflush_cache_range(vaddr, PAGE_SIZE);
drm_clflush_virt_range(vaddr, PAGE_SIZE);
return idx;
}
@ -631,7 +631,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
}
} while (rem >= page_size && index < I915_PDES);
clflush_cache_range(vaddr, PAGE_SIZE);
drm_clflush_virt_range(vaddr, PAGE_SIZE);
/*
* Is it safe to mark the 2M block as 64K? -- Either we have
@ -647,7 +647,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
I915_GTT_PAGE_SIZE_2M)))) {
vaddr = px_vaddr(pd);
vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
clflush_cache_range(vaddr, PAGE_SIZE);
drm_clflush_virt_range(vaddr, PAGE_SIZE);
page_size = I915_GTT_PAGE_SIZE_64K;
/*
@ -668,7 +668,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
for (i = 1; i < index; i += 16)
memset64(vaddr + i, encode, 15);
clflush_cache_range(vaddr, PAGE_SIZE);
drm_clflush_virt_range(vaddr, PAGE_SIZE);
}
}
@ -722,7 +722,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
vaddr = px_vaddr(pt);
vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
}
static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,


@ -386,7 +386,7 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
ce->ring = NULL;
ce->ring_size = SZ_4K;
ewma_runtime_init(&ce->runtime.avg);
ewma_runtime_init(&ce->stats.runtime.avg);
ce->vm = i915_vm_get(engine->gt->vm);
@ -400,7 +400,7 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
INIT_LIST_HEAD(&ce->guc_state.fences);
INIT_LIST_HEAD(&ce->guc_state.requests);
ce->guc_id.id = GUC_INVALID_LRC_ID;
ce->guc_id.id = GUC_INVALID_CONTEXT_ID;
INIT_LIST_HEAD(&ce->guc_id.link);
INIT_LIST_HEAD(&ce->destroyed_link);
@ -576,6 +576,31 @@ void intel_context_bind_parent_child(struct intel_context *parent,
child->parallel.parent = parent;
}
u64 intel_context_get_total_runtime_ns(const struct intel_context *ce)
{
u64 total, active;
total = ce->stats.runtime.total;
if (ce->ops->flags & COPS_RUNTIME_CYCLES)
total *= ce->engine->gt->clock_period_ns;
active = READ_ONCE(ce->stats.active);
if (active)
active = intel_context_clock() - active;
return total + active;
}
u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
{
u64 avg = ewma_runtime_read(&ce->stats.runtime.avg);
if (ce->ops->flags & COPS_RUNTIME_CYCLES)
avg *= ce->engine->gt->clock_period_ns;
return avg;
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_context.c"
#endif


@ -351,18 +351,13 @@ intel_context_clear_nopreempt(struct intel_context *ce)
clear_bit(CONTEXT_NOPREEMPT, &ce->flags);
}
static inline u64 intel_context_get_total_runtime_ns(struct intel_context *ce)
u64 intel_context_get_total_runtime_ns(const struct intel_context *ce);
u64 intel_context_get_avg_runtime_ns(struct intel_context *ce);
static inline u64 intel_context_clock(void)
{
const u32 period = ce->engine->gt->clock_period_ns;
return READ_ONCE(ce->runtime.total) * period;
}
static inline u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
{
const u32 period = ce->engine->gt->clock_period_ns;
return mul_u32_u32(ewma_runtime_read(&ce->runtime.avg), period);
/* As we mix CS cycles with CPU clocks, use the raw monotonic clock. */
return ktime_get_raw_fast_ns();
}
#endif /* __INTEL_CONTEXT_H__ */


@ -35,6 +35,9 @@ struct intel_context_ops {
#define COPS_HAS_INFLIGHT_BIT 0
#define COPS_HAS_INFLIGHT BIT(COPS_HAS_INFLIGHT_BIT)
#define COPS_RUNTIME_CYCLES_BIT 1
#define COPS_RUNTIME_CYCLES BIT(COPS_RUNTIME_CYCLES_BIT)
int (*alloc)(struct intel_context *ce);
void (*ban)(struct intel_context *ce, struct i915_request *rq);
@ -134,14 +137,19 @@ struct intel_context {
} lrc;
u32 tag; /* cookie passed to HW to track this context on submission */
/* Time on GPU as tracked by the hw. */
struct {
struct ewma_runtime avg;
u64 total;
u32 last;
I915_SELFTEST_DECLARE(u32 num_underflow);
I915_SELFTEST_DECLARE(u32 max_underflow);
} runtime;
/** stats: Context GPU engine busyness tracking. */
struct intel_context_stats {
u64 active;
/* Time on GPU as tracked by the hw. */
struct {
struct ewma_runtime avg;
u64 total;
u32 last;
I915_SELFTEST_DECLARE(u32 num_underflow);
I915_SELFTEST_DECLARE(u32 max_underflow);
} runtime;
} stats;
unsigned int active_count; /* protected by timeline->mutex */


@ -4,6 +4,7 @@
#include <asm/cacheflush.h>
#include <drm/drm_util.h>
#include <drm/drm_cache.h>
#include <linux/hashtable.h>
#include <linux/irq_work.h>
@ -143,15 +144,9 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
* of extra paranoia to try and ensure that the HWS takes the value
* we give and that it doesn't end up trapped inside the CPU!
*/
if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
mb();
clflush(&engine->status_page.addr[reg]);
engine->status_page.addr[reg] = value;
clflush(&engine->status_page.addr[reg]);
mb();
} else {
WRITE_ONCE(engine->status_page.addr[reg], value);
}
drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
WRITE_ONCE(engine->status_page.addr[reg], value);
drm_clflush_virt_range(&engine->status_page.addr[reg], sizeof(value));
}
/*


@ -436,6 +436,11 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
engine->props.preempt_timeout_ms = 0;
if ((engine->class == COMPUTE_CLASS && !RCS_MASK(engine->gt) &&
__ffs(CCS_MASK(engine->gt)) == engine->instance) ||
engine->class == RENDER_CLASS)
engine->flags |= I915_ENGINE_FIRST_RENDER_COMPUTE;
/* features common between engines sharing EUs */
if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
@ -726,12 +731,24 @@ static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
{
int i;
u8 map[MAX_ENGINE_INSTANCE + 1];
/*
* Logical to physical mapping is needed for proper support
* to split-frame feature.
*/
if (MEDIA_VER(gt->i915) >= 11 && class == VIDEO_DECODE_CLASS) {
const u8 map[] = { 0, 2, 4, 6, 1, 3, 5, 7 };
for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
map[i] = i;
populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
populate_logical_ids(gt, logical_ids, class,
map, ARRAY_SIZE(map));
} else {
int i;
u8 map[MAX_ENGINE_INSTANCE + 1];
for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
map[i] = i;
populate_logical_ids(gt, logical_ids, class,
map, ARRAY_SIZE(map));
}
}
/**
@ -1263,6 +1280,15 @@ static int __intel_engine_stop_cs(struct intel_engine_cs *engine,
int err;
intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
/*
* Wa_22011802037 : gen12, Prior to doing a reset, ensure CS is
* stopped, set ring stop bit and prefetch disable bit to halt CS
*/
if (GRAPHICS_VER(engine->i915) == 12)
intel_uncore_write_fw(uncore, RING_MODE_GEN7(engine->mmio_base),
_MASKED_BIT_ENABLE(GEN12_GFX_PREFETCH_DISABLE));
err = __intel_wait_for_register_fw(engine->uncore, mode,
MODE_IDLE, MODE_IDLE,
fast_timeout_us,
@ -1697,9 +1723,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
drm_printf(m, "\tIPEHR: 0x%08x\n", ENGINE_READ(engine, IPEHR));
}
if (intel_engine_uses_guc(engine)) {
/* nothing to print yet */
} else if (HAS_EXECLISTS(dev_priv)) {
if (HAS_EXECLISTS(dev_priv) && !intel_engine_uses_guc(engine)) {
struct i915_request * const *port, *rq;
const u32 *hws =
&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];


@ -181,6 +181,7 @@
#define GFX_SURFACE_FAULT_ENABLE (1 << 12)
#define GFX_REPLAY_MODE (1 << 11)
#define GFX_PSMI_GRANULARITY (1 << 10)
#define GEN12_GFX_PREFETCH_DISABLE REG_BIT(10)
#define GFX_PPGTT_ENABLE (1 << 9)
#define GEN8_GFX_PPGTT_48B (1 << 7)
#define GFX_FORWARD_VBLANK_MASK (3 << 5)


@ -96,7 +96,9 @@ struct i915_ctx_workarounds {
#define I915_MAX_VCS 8
#define I915_MAX_VECS 4
#define I915_MAX_SFC (I915_MAX_VCS / 2)
#define I915_MAX_CCS 4
#define I915_MAX_RCS 1
/*
* Engine IDs definitions.
@ -526,6 +528,8 @@ struct intel_engine_cs {
#define I915_ENGINE_WANT_FORCED_PREEMPTION BIT(8)
#define I915_ENGINE_HAS_RCS_REG_STATE BIT(9)
#define I915_ENGINE_HAS_EU_PRIORITY BIT(10)
#define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12)
unsigned int flags;
/*
@ -626,6 +630,13 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine)
return engine->flags & I915_ENGINE_HAS_RELATIVE_MMIO;
}
/* Wa_14014475959:dg2 */
static inline bool
intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine)
{
return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT;
}
#define instdone_has_slice(dev_priv___, sseu___, slice___) \
((GRAPHICS_VER(dev_priv___) == 7 ? 1 : ((sseu___)->slice_mask)) & BIT(slice___))
@ -643,7 +654,7 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine)
#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, dss_) \
for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
(iter_) < GEN_MAX_SUBSLICES; \
(iter_) < GEN_SS_MASK_SIZE; \
(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))


@ -193,7 +193,6 @@ static void add_legacy_ring(struct legacy_ring *ring,
void intel_engines_driver_register(struct drm_i915_private *i915)
{
struct legacy_ring ring = {};
u8 uabi_instances[5] = {};
struct list_head *it, *next;
struct rb_node **p, *prev;
LIST_HEAD(engines);
@ -214,8 +213,10 @@ void intel_engines_driver_register(struct drm_i915_private *i915)
GEM_BUG_ON(engine->class >= ARRAY_SIZE(uabi_classes));
engine->uabi_class = uabi_classes[engine->class];
GEM_BUG_ON(engine->uabi_class >= ARRAY_SIZE(uabi_instances));
engine->uabi_instance = uabi_instances[engine->uabi_class]++;
GEM_BUG_ON(engine->uabi_class >=
ARRAY_SIZE(i915->engine_uabi_class_count));
engine->uabi_instance =
i915->engine_uabi_class_count[engine->uabi_class]++;
/* Replace the internal name with the final user facing name */
memcpy(old, engine->name, sizeof(engine->name));
@ -245,8 +246,8 @@ void intel_engines_driver_register(struct drm_i915_private *i915)
int class, inst;
int errors = 0;
for (class = 0; class < ARRAY_SIZE(uabi_instances); class++) {
for (inst = 0; inst < uabi_instances[class]; inst++) {
for (class = 0; class < ARRAY_SIZE(i915->engine_uabi_class_count); class++) {
for (inst = 0; inst < i915->engine_uabi_class_count[class]; inst++) {
engine = intel_engine_lookup_user(i915,
class, inst);
if (!engine) {


@ -625,8 +625,6 @@ static void __execlists_schedule_out(struct i915_request * const rq,
GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
__set_bit(ccid - 1, &engine->context_tag);
}
lrc_update_runtime(ce);
intel_engine_context_out(engine);
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
if (engine->fw_domain && !--engine->fw_active)
@ -1651,12 +1649,6 @@ cancel_port_requests(struct intel_engine_execlists * const execlists,
return inactive;
}
static void invalidate_csb_entries(const u64 *first, const u64 *last)
{
clflush((void *)first);
clflush((void *)last);
}
/*
* Starting with Gen12, the status has a new format:
*
@ -2004,15 +1996,30 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
* the wash as hardware, working or not, will need to do the
* invalidation before.
*/
invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0]));
/*
* We assume that any event reflects a change in context flow
* and merits a fresh timeslice. We reinstall the timer after
* inspecting the queue to see if we need to resubmit.
*/
if (*prev != *execlists->active) /* elide lite-restores */
if (*prev != *execlists->active) { /* elide lite-restores */
/*
* Note the inherent discrepancy between the HW runtime,
* recorded as part of the context switch, and the CPU
* adjustment for active contexts. We have to hope that
* the delay in processing the CS event is very small
* and consistent. It works to our advantage to have
* the CPU adjustment _undershoot_ (i.e. start later than)
* the CS timestamp so we never overreport the runtime
* and correct ourselves later when updating from HW.
*/
if (*prev)
lrc_runtime_stop((*prev)->context);
if (*execlists->active)
lrc_runtime_start((*execlists->active)->context);
new_timeslice(execlists);
}
return inactive;
}
@ -2236,11 +2243,11 @@ static struct execlists_capture *capture_regs(struct intel_engine_cs *engine)
if (!cap->error)
goto err_cap;
cap->error->gt = intel_gt_coredump_alloc(engine->gt, gfp);
cap->error->gt = intel_gt_coredump_alloc(engine->gt, gfp, CORE_DUMP_FLAG_NONE);
if (!cap->error->gt)
goto err_gpu;
cap->error->gt->engine = intel_engine_coredump_alloc(engine, gfp);
cap->error->gt->engine = intel_engine_coredump_alloc(engine, gfp, CORE_DUMP_FLAG_NONE);
if (!cap->error->gt->engine)
goto err_gt;
@ -2644,7 +2651,7 @@ unwind:
}
static const struct intel_context_ops execlists_context_ops = {
.flags = COPS_HAS_INFLIGHT,
.flags = COPS_HAS_INFLIGHT | COPS_RUNTIME_CYCLES,
.alloc = execlists_context_alloc,
@ -2788,8 +2795,9 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
/* Check that the GPU does indeed update the CSB entries! */
memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
invalidate_csb_entries(&execlists->csb_status[0],
&execlists->csb_status[reset_value]);
drm_clflush_virt_range(execlists->csb_status,
execlists->csb_size *
sizeof(execlists->csb_status));
/* Once more for luck and our trusty paranoia */
ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
@ -2833,7 +2841,7 @@ static void execlists_sanitize(struct intel_engine_cs *engine)
sanitize_hwsp(engine);
/* And scrub the dirty cachelines for the HWSP */
clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
intel_engine_reset_pinned_contexts(engine);
}
@ -2912,7 +2920,7 @@ static int execlists_resume(struct intel_engine_cs *engine)
enable_execlists(engine);
if (engine->class == RENDER_CLASS)
if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
xehp_enable_ccs_engines(engine);
return 0;
@ -2958,9 +2966,8 @@ reset_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
{
struct intel_engine_execlists * const execlists = &engine->execlists;
mb(); /* paranoia: read the CSB pointers from after the reset */
clflush(execlists->csb_write);
mb();
drm_clflush_virt_range(execlists->csb_write,
sizeof(execlists->csb_write[0]));
inactive = process_csb(engine, inactive); /* drain preemption events */
@ -3702,7 +3709,7 @@ virtual_get_sibling(struct intel_engine_cs *engine, unsigned int sibling)
}
static const struct intel_context_ops virtual_context_ops = {
.flags = COPS_HAS_INFLIGHT,
.flags = COPS_HAS_INFLIGHT | COPS_RUNTIME_CYCLES,
.alloc = virtual_context_alloc,


@ -3,18 +3,16 @@
* Copyright © 2020 Intel Corporation
*/
#include <linux/agp_backend.h>
#include <linux/stop_machine.h>
#include <linux/types.h>
#include <asm/set_memory.h>
#include <asm/smp.h>
#include <drm/i915_drm.h>
#include <drm/intel-gtt.h>
#include "gem/i915_gem_lmem.h"
#include "intel_gt.h"
#include "intel_gt_gmch.h"
#include "intel_gt_regs.h"
#include "i915_drv.h"
#include "i915_scatterlist.h"
@ -95,28 +93,6 @@ int i915_ggtt_init_hw(struct drm_i915_private *i915)
return 0;
}
/*
* Certain Gen5 chipsets require idling the GPU before
* unmapping anything from the GTT when VT-d is enabled.
*/
static bool needs_idle_maps(struct drm_i915_private *i915)
{
/*
* Query intel_iommu to see if we need the workaround. Presumably that
* was loaded first.
*/
if (!i915_vtd_active(i915))
return false;
if (GRAPHICS_VER(i915) == 5 && IS_MOBILE(i915))
return true;
if (GRAPHICS_VER(i915) == 12)
return true; /* XXX DMAR fault reason 7 */
return false;
}
/**
* i915_ggtt_suspend_vm - Suspend the memory mappings for a GGTT or DPT VM
* @vm: The VM to suspend the mappings for
@ -127,7 +103,7 @@ static bool needs_idle_maps(struct drm_i915_private *i915)
void i915_ggtt_suspend_vm(struct i915_address_space *vm)
{
struct i915_vma *vma, *vn;
int open;
int save_skip_rewrite;
drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt);
@ -136,8 +112,12 @@ retry:
mutex_lock(&vm->mutex);
/* Skip rewriting PTE on VMA unbind. */
open = atomic_xchg(&vm->open, 0);
/*
* Skip rewriting PTE on VMA unbind.
* FIXME: Use an argument to i915_vma_unbind() instead?
*/
save_skip_rewrite = vm->skip_pte_rewrite;
vm->skip_pte_rewrite = true;
list_for_each_entry_safe(vma, vn, &vm->bound_list, vm_link) {
struct drm_i915_gem_object *obj = vma->obj;
@ -155,16 +135,14 @@ retry:
*/
i915_gem_object_get(obj);
atomic_set(&vm->open, open);
mutex_unlock(&vm->mutex);
i915_gem_object_lock(obj, NULL);
open = i915_vma_unbind(vma);
GEM_WARN_ON(i915_vma_unbind(vma));
i915_gem_object_unlock(obj);
GEM_WARN_ON(open);
i915_gem_object_put(obj);
vm->skip_pte_rewrite = save_skip_rewrite;
goto retry;
}
@ -180,7 +158,7 @@ retry:
vm->clear_range(vm, 0, vm->total);
atomic_set(&vm->open, open);
vm->skip_pte_rewrite = save_skip_rewrite;
mutex_unlock(&vm->mutex);
}
@ -203,7 +181,7 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
spin_unlock_irq(&uncore->lock);
}
static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
{
struct intel_uncore *uncore = ggtt->vm.gt->uncore;
@ -228,11 +206,6 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
intel_uncore_write_fw(uncore, GEN8_GTCR, GEN8_GTCR_INVALIDATE);
}
static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
{
intel_gtt_chipset_flush();
}
u64 gen8_ggtt_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
@ -245,258 +218,7 @@ u64 gen8_ggtt_pte_encode(dma_addr_t addr,
return pte;
}
static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
{
writeq(pte, addr);
}
static void gen8_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level level,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen8_pte_t __iomem *pte =
(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
ggtt->invalidate(ggtt);
}
static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level level,
u32 flags)
{
const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen8_pte_t __iomem *gte;
gen8_pte_t __iomem *end;
struct sgt_iter iter;
dma_addr_t addr;
/*
* Note that we ignore PTE_READ_ONLY here. The caller must be careful
* not to allow the user to override access to a read only page.
*/
gte = (gen8_pte_t __iomem *)ggtt->gsm;
gte += vma_res->start / I915_GTT_PAGE_SIZE;
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
gen8_set_pte(gte++, pte_encode | addr);
GEM_BUG_ON(gte > end);
/* Fill the allocated but "unused" space beyond the end of the buffer */
while (gte < end)
gen8_set_pte(gte++, vm->scratch[0]->encode);
/*
* We want to flush the TLBs only after we're certain all the PTE
* updates have finished.
*/
ggtt->invalidate(ggtt);
}
static void gen6_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level level,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen6_pte_t __iomem *pte =
(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
iowrite32(vm->pte_encode(addr, level, flags), pte);
ggtt->invalidate(ggtt);
}
/*
* Binds an object into the global gtt with the specified cache level.
* The object will be accessible to the GPU via commands whose operands
* reference offsets within the global GTT as well as accessible by the GPU
* through the GMADR mapped BAR (i915->mm.gtt->gtt).
*/
static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level level,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen6_pte_t __iomem *gte;
gen6_pte_t __iomem *end;
struct sgt_iter iter;
dma_addr_t addr;
gte = (gen6_pte_t __iomem *)ggtt->gsm;
gte += vma_res->start / I915_GTT_PAGE_SIZE;
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
iowrite32(vm->pte_encode(addr, level, flags), gte++);
GEM_BUG_ON(gte > end);
/* Fill the allocated but "unused" space beyond the end of the buffer */
while (gte < end)
iowrite32(vm->scratch[0]->encode, gte++);
/*
* We want to flush the TLBs only after we're certain all the PTE
* updates have finished.
*/
ggtt->invalidate(ggtt);
}
static void nop_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
}
static void gen8_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
const gen8_pte_t scratch_pte = vm->scratch[0]->encode;
gen8_pte_t __iomem *gtt_base =
(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
int i;
if (WARN(num_entries > max_entries,
"First entry = %d; Num entries = %d (max=%d)\n",
first_entry, num_entries, max_entries))
num_entries = max_entries;
for (i = 0; i < num_entries; i++)
gen8_set_pte(&gtt_base[i], scratch_pte);
}
static void bxt_vtd_ggtt_wa(struct i915_address_space *vm)
{
/*
* Make sure the internal GAM fifo has been cleared of all GTT
* writes before exiting stop_machine(). This guarantees that
* any aperture accesses waiting to start in another process
* cannot back up behind the GTT writes causing a hang.
* The register can be any arbitrary GAM register.
*/
intel_uncore_posting_read_fw(vm->gt->uncore, GFX_FLSH_CNTL_GEN6);
}
struct insert_page {
struct i915_address_space *vm;
dma_addr_t addr;
u64 offset;
enum i915_cache_level level;
};
static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
{
struct insert_page *arg = _arg;
gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
bxt_vtd_ggtt_wa(arg->vm);
return 0;
}
static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level level,
u32 unused)
{
struct insert_page arg = { vm, addr, offset, level };
stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
}
struct insert_entries {
struct i915_address_space *vm;
struct i915_vma_resource *vma_res;
enum i915_cache_level level;
u32 flags;
};
static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
{
struct insert_entries *arg = _arg;
gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
bxt_vtd_ggtt_wa(arg->vm);
return 0;
}
static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level level,
u32 flags)
{
struct insert_entries arg = { vm, vma_res, level, flags };
stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
}
static void gen6_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
gen6_pte_t scratch_pte, __iomem *gtt_base =
(gen6_pte_t __iomem *)ggtt->gsm + first_entry;
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
int i;
if (WARN(num_entries > max_entries,
"First entry = %d; Num entries = %d (max=%d)\n",
first_entry, num_entries, max_entries))
num_entries = max_entries;
scratch_pte = vm->scratch[0]->encode;
for (i = 0; i < num_entries; i++)
iowrite32(scratch_pte, &gtt_base[i]);
}
static void i915_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level cache_level,
u32 unused)
{
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
intel_gtt_insert_page(addr, offset >> PAGE_SHIFT, flags);
}
static void i915_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 unused)
{
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
flags);
}
static void i915_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
intel_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
}
static void ggtt_bind_vma(struct i915_address_space *vm,
void intel_ggtt_bind_vma(struct i915_address_space *vm,
struct i915_vm_pt_stash *stash,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
@ -520,7 +242,7 @@ static void ggtt_bind_vma(struct i915_address_space *vm,
vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
}
static void ggtt_unbind_vma(struct i915_address_space *vm,
void intel_ggtt_unbind_vma(struct i915_address_space *vm,
struct i915_vma_resource *vma_res)
{
vm->clear_range(vm, vma_res->start, vma_res->vma_size);
@ -723,10 +445,10 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
ggtt->alias = ppgtt;
ggtt->vm.bind_async_flags |= ppgtt->vm.bind_async_flags;
GEM_BUG_ON(ggtt->vm.vma_ops.bind_vma != ggtt_bind_vma);
GEM_BUG_ON(ggtt->vm.vma_ops.bind_vma != intel_ggtt_bind_vma);
ggtt->vm.vma_ops.bind_vma = aliasing_gtt_bind_vma;
GEM_BUG_ON(ggtt->vm.vma_ops.unbind_vma != ggtt_unbind_vma);
GEM_BUG_ON(ggtt->vm.vma_ops.unbind_vma != intel_ggtt_unbind_vma);
ggtt->vm.vma_ops.unbind_vma = aliasing_gtt_unbind_vma;
i915_vm_free_pt_stash(&ppgtt->vm, &stash);
@ -749,8 +471,8 @@ static void fini_aliasing_ppgtt(struct i915_ggtt *ggtt)
i915_vm_put(&ppgtt->vm);
ggtt->vm.vma_ops.bind_vma = ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = ggtt_unbind_vma;
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
}
int i915_init_ggtt(struct drm_i915_private *i915)
@ -774,13 +496,13 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
{
struct i915_vma *vma, *vn;
atomic_set(&ggtt->vm.open, 0);
flush_workqueue(ggtt->vm.i915->wq);
i915_gem_drain_freed_objects(ggtt->vm.i915);
mutex_lock(&ggtt->vm.mutex);
ggtt->vm.skip_pte_rewrite = true;
list_for_each_entry_safe(vma, vn, &ggtt->vm.bound_list, vm_link) {
struct drm_i915_gem_object *obj = vma->obj;
bool trylock;
@ -838,364 +560,12 @@ void i915_ggtt_driver_late_release(struct drm_i915_private *i915)
dma_resv_fini(&ggtt->vm._resv);
}
static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
{
snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
snb_gmch_ctl &= SNB_GMCH_GGMS_MASK;
return snb_gmch_ctl << 20;
}
static unsigned int gen8_get_total_gtt_size(u16 bdw_gmch_ctl)
{
bdw_gmch_ctl >>= BDW_GMCH_GGMS_SHIFT;
bdw_gmch_ctl &= BDW_GMCH_GGMS_MASK;
if (bdw_gmch_ctl)
bdw_gmch_ctl = 1 << bdw_gmch_ctl;
#ifdef CONFIG_X86_32
/* Limit 32b platforms to a 2GB GGTT: 4 << 20 / pte size * I915_GTT_PAGE_SIZE */
if (bdw_gmch_ctl > 4)
bdw_gmch_ctl = 4;
#endif
return bdw_gmch_ctl << 20;
}
static unsigned int chv_get_total_gtt_size(u16 gmch_ctrl)
{
gmch_ctrl >>= SNB_GMCH_GGMS_SHIFT;
gmch_ctrl &= SNB_GMCH_GGMS_MASK;
if (gmch_ctrl)
return 1 << (20 + gmch_ctrl);
return 0;
}
static unsigned int gen6_gttmmadr_size(struct drm_i915_private *i915)
{
/*
* GEN6: GTTMMADR size is 4MB and GTTADR starts at 2MB offset
* GEN8: GTTMMADR size is 16MB and GTTADR starts at 8MB offset
*/
GEM_BUG_ON(GRAPHICS_VER(i915) < 6);
return (GRAPHICS_VER(i915) < 8) ? SZ_4M : SZ_16M;
}
static unsigned int gen6_gttadr_offset(struct drm_i915_private *i915)
{
return gen6_gttmmadr_size(i915) / 2;
}
static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
phys_addr_t phys_addr;
u32 pte_flags;
int ret;
GEM_WARN_ON(pci_resource_len(pdev, 0) != gen6_gttmmadr_size(i915));
phys_addr = pci_resource_start(pdev, 0) + gen6_gttadr_offset(i915);
/*
* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
* will be dropped. For WC mappings in general we have 64 byte burst
* writes when the WC buffer is flushed, so we can't use it, but have to
* resort to an uncached mapping. The WC issue is easily caught by the
* readback check when writing GTT PTE entries.
*/
if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
ggtt->gsm = ioremap(phys_addr, size);
else
ggtt->gsm = ioremap_wc(phys_addr, size);
if (!ggtt->gsm) {
drm_err(&i915->drm, "Failed to map the ggtt page table\n");
return -ENOMEM;
}
kref_init(&ggtt->vm.resv_ref);
ret = setup_scratch_page(&ggtt->vm);
if (ret) {
drm_err(&i915->drm, "Scratch setup failed\n");
/* iounmap will also get called at remove, but meh */
iounmap(ggtt->gsm);
return ret;
}
pte_flags = 0;
if (i915_gem_object_is_lmem(ggtt->vm.scratch[0]))
pte_flags |= PTE_LM;
ggtt->vm.scratch[0]->encode =
ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
I915_CACHE_NONE, pte_flags);
return 0;
}
static void gen6_gmch_remove(struct i915_address_space *vm)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
iounmap(ggtt->gsm);
free_scratch(vm);
}
static struct resource pci_resource(struct pci_dev *pdev, int bar)
struct resource intel_pci_resource(struct pci_dev *pdev, int bar)
{
return (struct resource)DEFINE_RES_MEM(pci_resource_start(pdev, bar),
pci_resource_len(pdev, bar));
}
static int gen8_gmch_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
unsigned int size;
u16 snb_gmch_ctl;
/* TODO: We're not aware of mappable constraints on gen8 yet */
if (!HAS_LMEM(i915)) {
ggtt->gmadr = pci_resource(pdev, 2);
ggtt->mappable_end = resource_size(&ggtt->gmadr);
}
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
if (IS_CHERRYVIEW(i915))
size = chv_get_total_gtt_size(snb_gmch_ctl);
else
size = gen8_get_total_gtt_size(snb_gmch_ctl);
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
ggtt->vm.lmem_pt_obj_flags = I915_BO_ALLOC_PM_EARLY;
ggtt->vm.total = (size / sizeof(gen8_pte_t)) * I915_GTT_PAGE_SIZE;
ggtt->vm.cleanup = gen6_gmch_remove;
ggtt->vm.insert_page = gen8_ggtt_insert_page;
ggtt->vm.clear_range = nop_clear_range;
if (intel_scanout_needs_vtd_wa(i915))
ggtt->vm.clear_range = gen8_ggtt_clear_range;
ggtt->vm.insert_entries = gen8_ggtt_insert_entries;
/*
* Serialize GTT updates with aperture access on BXT if VT-d is on,
* and always on CHV.
*/
if (intel_vm_no_concurrent_access_wa(i915)) {
ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL;
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
ggtt->vm.bind_async_flags =
I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
}
ggtt->invalidate = gen8_ggtt_invalidate;
ggtt->vm.vma_ops.bind_vma = ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = ggtt_unbind_vma;
ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
setup_private_pat(ggtt->vm.gt->uncore);
return ggtt_probe_common(ggtt, size);
}
static u64 snb_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
switch (level) {
case I915_CACHE_L3_LLC:
case I915_CACHE_LLC:
pte |= GEN6_PTE_CACHE_LLC;
break;
case I915_CACHE_NONE:
pte |= GEN6_PTE_UNCACHED;
break;
default:
MISSING_CASE(level);
}
return pte;
}
static u64 ivb_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
switch (level) {
case I915_CACHE_L3_LLC:
pte |= GEN7_PTE_CACHE_L3_LLC;
break;
case I915_CACHE_LLC:
pte |= GEN6_PTE_CACHE_LLC;
break;
case I915_CACHE_NONE:
pte |= GEN6_PTE_UNCACHED;
break;
default:
MISSING_CASE(level);
}
return pte;
}
static u64 byt_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
if (!(flags & PTE_READ_ONLY))
pte |= BYT_PTE_WRITEABLE;
if (level != I915_CACHE_NONE)
pte |= BYT_PTE_SNOOPED_BY_CPU_CACHES;
return pte;
}
static u64 hsw_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
if (level != I915_CACHE_NONE)
pte |= HSW_WB_LLC_AGE3;
return pte;
}
static u64 iris_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
switch (level) {
case I915_CACHE_NONE:
break;
case I915_CACHE_WT:
pte |= HSW_WT_ELLC_LLC_AGE3;
break;
default:
pte |= HSW_WB_ELLC_LLC_AGE3;
break;
}
return pte;
}
static int gen6_gmch_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
unsigned int size;
u16 snb_gmch_ctl;
ggtt->gmadr = pci_resource(pdev, 2);
ggtt->mappable_end = resource_size(&ggtt->gmadr);
/*
* 64/512MB is the current min/max we actually know of, but this is
* just a coarse sanity check.
*/
if (ggtt->mappable_end < (64<<20) || ggtt->mappable_end > (512<<20)) {
drm_err(&i915->drm, "Unknown GMADR size (%pa)\n",
&ggtt->mappable_end);
return -ENXIO;
}
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
size = gen6_get_total_gtt_size(snb_gmch_ctl);
ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE;
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
ggtt->vm.clear_range = nop_clear_range;
if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
ggtt->vm.clear_range = gen6_ggtt_clear_range;
ggtt->vm.insert_page = gen6_ggtt_insert_page;
ggtt->vm.insert_entries = gen6_ggtt_insert_entries;
ggtt->vm.cleanup = gen6_gmch_remove;
ggtt->invalidate = gen6_ggtt_invalidate;
if (HAS_EDRAM(i915))
ggtt->vm.pte_encode = iris_pte_encode;
else if (IS_HASWELL(i915))
ggtt->vm.pte_encode = hsw_pte_encode;
else if (IS_VALLEYVIEW(i915))
ggtt->vm.pte_encode = byt_pte_encode;
else if (GRAPHICS_VER(i915) >= 7)
ggtt->vm.pte_encode = ivb_pte_encode;
else
ggtt->vm.pte_encode = snb_pte_encode;
ggtt->vm.vma_ops.bind_vma = ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = ggtt_unbind_vma;
return ggtt_probe_common(ggtt, size);
}
static void i915_gmch_remove(struct i915_address_space *vm)
{
intel_gmch_remove();
}
static int i915_gmch_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
phys_addr_t gmadr_base;
int ret;
ret = intel_gmch_probe(i915->bridge_dev, to_pci_dev(i915->drm.dev), NULL);
if (!ret) {
drm_err(&i915->drm, "failed to set up gmch\n");
return -EIO;
}
intel_gtt_get(&ggtt->vm.total, &gmadr_base, &ggtt->mappable_end);
ggtt->gmadr =
(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
if (needs_idle_maps(i915)) {
drm_notice(&i915->drm,
"Flushing DMA requests before IOMMU unmaps; performance may be degraded\n");
ggtt->do_idle_maps = true;
}
ggtt->vm.insert_page = i915_ggtt_insert_page;
ggtt->vm.insert_entries = i915_ggtt_insert_entries;
ggtt->vm.clear_range = i915_ggtt_clear_range;
ggtt->vm.cleanup = i915_gmch_remove;
ggtt->invalidate = gmch_ggtt_invalidate;
ggtt->vm.vma_ops.bind_vma = ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = ggtt_unbind_vma;
if (unlikely(ggtt->do_idle_maps))
drm_notice(&i915->drm,
"Applying Ironlake quirks for intel_iommu\n");
return 0;
}
static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt)
{
struct drm_i915_private *i915 = gt->i915;
@ -1207,11 +577,11 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt)
dma_resv_init(&ggtt->vm._resv);
if (GRAPHICS_VER(i915) <= 5)
ret = i915_gmch_probe(ggtt);
ret = intel_gt_gmch_gen5_probe(ggtt);
else if (GRAPHICS_VER(i915) < 8)
ret = gen6_gmch_probe(ggtt);
ret = intel_gt_gmch_gen6_probe(ggtt);
else
ret = gen8_gmch_probe(ggtt);
ret = intel_gt_gmch_gen8_probe(ggtt);
if (ret) {
dma_resv_fini(&ggtt->vm._resv);
return ret;
@ -1265,10 +635,7 @@ int i915_ggtt_probe_hw(struct drm_i915_private *i915)
int i915_ggtt_enable_hw(struct drm_i915_private *i915)
{
if (GRAPHICS_VER(i915) < 6 && !intel_enable_gtt())
return -EIO;
return 0;
return intel_gt_gmch_gen5_enable_hw(i915);
}
void i915_ggtt_enable_guc(struct i915_ggtt *ggtt)
@ -1308,16 +675,12 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
{
struct i915_vma *vma;
bool write_domain_objs = false;
int open;
drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt);
/* First fill our portion of the GTT with scratch pages */
vm->clear_range(vm, 0, vm->total);
/* Skip rewriting PTE on VMA unbind. */
open = atomic_xchg(&vm->open, 0);
/* clflush objects bound into the GGTT and rebind them. */
list_for_each_entry(vma, &vm->bound_list, vm_link) {
struct drm_i915_gem_object *obj = vma->obj;
@ -1334,8 +697,6 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
}
}
atomic_set(&vm->open, open);
return write_domain_objs;
}


@ -134,6 +134,13 @@
#define MI_MEM_VIRTUAL (1 << 22) /* 945,g33,965 */
#define MI_USE_GGTT (1 << 22) /* g4x+ */
#define MI_STORE_DWORD_INDEX MI_INSTR(0x21, 1)
#define MI_ATOMIC MI_INSTR(0x2f, 1)
#define MI_ATOMIC_INLINE (MI_INSTR(0x2f, 9) | MI_ATOMIC_INLINE_DATA)
#define MI_ATOMIC_GLOBAL_GTT (1 << 22)
#define MI_ATOMIC_INLINE_DATA (1 << 18)
#define MI_ATOMIC_CS_STALL (1 << 17)
#define MI_ATOMIC_MOVE (0x4 << 8)
/*
* Official intel docs are somewhat sloppy concerning MI_LOAD_REGISTER_IMM:
* - Always issue a MI_NOOP _before_ the MI_LOAD_REGISTER_IMM - otherwise hw
@ -144,6 +151,7 @@
#define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1)
/* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */
#define MI_LRI_LRM_CS_MMIO REG_BIT(19)
#define MI_LRI_MMIO_REMAP_EN REG_BIT(17)
#define MI_LRI_FORCE_POSTED (1<<12)
#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
#define MI_STORE_REGISTER_MEM MI_INSTR(0x24, 1)
@ -153,8 +161,10 @@
#define MI_FLUSH_DW_PROTECTED_MEM_EN (1 << 22)
#define MI_FLUSH_DW_STORE_INDEX (1<<21)
#define MI_INVALIDATE_TLB (1<<18)
#define MI_FLUSH_DW_CCS (1<<16)
#define MI_FLUSH_DW_OP_STOREDW (1<<14)
#define MI_FLUSH_DW_OP_MASK (3<<14)
#define MI_FLUSH_DW_LLC (1<<9)
#define MI_FLUSH_DW_NOTIFY (1<<8)
#define MI_INVALIDATE_BSD (1<<7)
#define MI_FLUSH_DW_USE_GTT (1<<2)
@ -203,8 +213,27 @@
#define GFX_OP_DRAWRECT_INFO ((0x3<<29)|(0x1d<<24)|(0x80<<16)|(0x3))
#define GFX_OP_DRAWRECT_INFO_I965 ((0x7900<<16)|0x2)
#define XY_CTRL_SURF_INSTR_SIZE 5
#define MI_FLUSH_DW_SIZE 3
#define XY_CTRL_SURF_COPY_BLT ((2 << 29) | (0x48 << 22) | 3)
#define SRC_ACCESS_TYPE_SHIFT 21
#define DST_ACCESS_TYPE_SHIFT 20
#define CCS_SIZE_MASK 0x3FF
#define CCS_SIZE_SHIFT 8
#define XY_CTRL_SURF_MOCS_MASK GENMASK(31, 25)
#define NUM_CCS_BYTES_PER_BLOCK 256
#define NUM_BYTES_PER_CCS_BYTE 256
#define NUM_CCS_BLKS_PER_XFER 1024
#define INDIRECT_ACCESS 0
#define DIRECT_ACCESS 1
#define COLOR_BLT_CMD (2 << 29 | 0x40 << 22 | (5 - 2))
#define XY_COLOR_BLT_CMD (2 << 29 | 0x50 << 22)
#define XY_FAST_COLOR_BLT_CMD (2 << 29 | 0x44 << 22)
#define XY_FAST_COLOR_BLT_DEPTH_32 (2 << 19)
#define XY_FAST_COLOR_BLT_DW 16
#define XY_FAST_COLOR_BLT_MOCS_MASK GENMASK(27, 21)
#define XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT 31
#define SRC_COPY_BLT_CMD (2 << 29 | 0x43 << 22)
#define GEN9_XY_FAST_COPY_BLT_CMD (2 << 29 | 0x42 << 22)
#define XY_SRC_COPY_BLT_CMD (2 << 29 | 0x53 << 22)
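NUM_BYTES_PER_CCS_BYTE above captures the 1:256 ratio between main-surface data and its compression control (CCS) metadata on flat-CCS platforms, which the blitter-based evict/restore path uses to size its CCS transfers. A rough sketch of what that ratio works out to for a few example object sizes (the sizes themselves are made up):

#include <stdio.h>
#include <stdint.h>

#define NUM_BYTES_PER_CCS_BYTE 256	/* ratio taken from the defines above */

int main(void)
{
	const uint64_t sizes[] = { 4096, 64ull << 20, 1ull << 30 };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%llu bytes of surface -> %llu bytes of CCS metadata\n",
		       (unsigned long long)sizes[i],
		       (unsigned long long)(sizes[i] / NUM_BYTES_PER_CCS_BYTE));
	return 0;
}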


@ -0,0 +1,224 @@
// SPDX-License-Identifier: MIT
/*
* Copyright(c) 2019-2022, Intel Corporation. All rights reserved.
*/
#include <linux/irq.h>
#include <linux/mei_aux.h>
#include "i915_drv.h"
#include "i915_reg.h"
#include "gt/intel_gsc.h"
#include "gt/intel_gt.h"
#define GSC_BAR_LENGTH 0x00000FFC
static void gsc_irq_mask(struct irq_data *d)
{
/* generic irq handling */
}
static void gsc_irq_unmask(struct irq_data *d)
{
/* generic irq handling */
}
static struct irq_chip gsc_irq_chip = {
.name = "gsc_irq_chip",
.irq_mask = gsc_irq_mask,
.irq_unmask = gsc_irq_unmask,
};
static int gsc_irq_init(int irq)
{
irq_set_chip_and_handler_name(irq, &gsc_irq_chip,
handle_simple_irq, "gsc_irq_handler");
return irq_set_chip_data(irq, NULL);
}
struct gsc_def {
const char *name;
unsigned long bar;
size_t bar_size;
};
/* gsc resources and definitions (HECI1 and HECI2) */
static const struct gsc_def gsc_def_dg1[] = {
{
/* HECI1 not yet implemented. */
},
{
.name = "mei-gscfi",
.bar = DG1_GSC_HECI2_BASE,
.bar_size = GSC_BAR_LENGTH,
}
};
static const struct gsc_def gsc_def_dg2[] = {
{
.name = "mei-gsc",
.bar = DG2_GSC_HECI1_BASE,
.bar_size = GSC_BAR_LENGTH,
},
{
.name = "mei-gscfi",
.bar = DG2_GSC_HECI2_BASE,
.bar_size = GSC_BAR_LENGTH,
}
};
static void gsc_release_dev(struct device *dev)
{
struct auxiliary_device *aux_dev = to_auxiliary_dev(dev);
struct mei_aux_device *adev = auxiliary_dev_to_mei_aux_dev(aux_dev);
kfree(adev);
}
static void gsc_destroy_one(struct intel_gsc_intf *intf)
{
if (intf->adev) {
auxiliary_device_delete(&intf->adev->aux_dev);
auxiliary_device_uninit(&intf->adev->aux_dev);
intf->adev = NULL;
}
if (intf->irq >= 0)
irq_free_desc(intf->irq);
intf->irq = -1;
}
static void gsc_init_one(struct drm_i915_private *i915,
struct intel_gsc_intf *intf,
unsigned int intf_id)
{
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct mei_aux_device *adev;
struct auxiliary_device *aux_dev;
const struct gsc_def *def;
int ret;
intf->irq = -1;
intf->id = intf_id;
if (intf_id == 0 && !HAS_HECI_PXP(i915))
return;
if (IS_DG1(i915)) {
def = &gsc_def_dg1[intf_id];
} else if (IS_DG2(i915)) {
def = &gsc_def_dg2[intf_id];
} else {
drm_warn_once(&i915->drm, "Unknown platform\n");
return;
}
if (!def->name) {
drm_warn_once(&i915->drm, "HECI%d is not implemented!\n", intf_id + 1);
return;
}
intf->irq = irq_alloc_desc(0);
if (intf->irq < 0) {
drm_err(&i915->drm, "gsc irq error %d\n", intf->irq);
return;
}
ret = gsc_irq_init(intf->irq);
if (ret < 0) {
drm_err(&i915->drm, "gsc irq init failed %d\n", ret);
goto fail;
}
adev = kzalloc(sizeof(*adev), GFP_KERNEL);
if (!adev)
goto fail;
adev->irq = intf->irq;
adev->bar.parent = &pdev->resource[0];
adev->bar.start = def->bar + pdev->resource[0].start;
adev->bar.end = adev->bar.start + def->bar_size - 1;
adev->bar.flags = IORESOURCE_MEM;
adev->bar.desc = IORES_DESC_NONE;
aux_dev = &adev->aux_dev;
aux_dev->name = def->name;
aux_dev->id = (pci_domain_nr(pdev->bus) << 16) |
PCI_DEVID(pdev->bus->number, pdev->devfn);
aux_dev->dev.parent = &pdev->dev;
aux_dev->dev.release = gsc_release_dev;
ret = auxiliary_device_init(aux_dev);
if (ret < 0) {
drm_err(&i915->drm, "gsc aux init failed %d\n", ret);
kfree(adev);
goto fail;
}
ret = auxiliary_device_add(aux_dev);
if (ret < 0) {
drm_err(&i915->drm, "gsc aux add failed %d\n", ret);
/* adev will be freed with the put_device() and .release sequence */
auxiliary_device_uninit(aux_dev);
goto fail;
}
intf->adev = adev;
return;
fail:
gsc_destroy_one(intf);
}
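gsc_init_one() above builds the auxiliary device id from the PCI domain, bus and devfn so each GSC interface gets a stable, unique identity on the auxiliary bus. A user-space sketch of that composition, mirroring the kernel's PCI_DEVID() layout (the 0000:03:00.0 address is hypothetical):

#include <stdio.h>
#include <stdint.h>

/* bus in bits 15:8, devfn in bits 7:0, as PCI_DEVID() does */
#define PCI_DEVID(bus, devfn)	((uint32_t)(((bus) & 0xff) << 8 | ((devfn) & 0xff)))

int main(void)
{
	uint32_t domain = 0, bus = 0x03, devfn = 0x00;	/* hypothetical 0000:03:00.0 */
	uint32_t id = (domain << 16) | PCI_DEVID(bus, devfn);

	printf("aux_dev id = 0x%08x\n", id);
	return 0;
}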
static void gsc_irq_handler(struct intel_gt *gt, unsigned int intf_id)
{
int ret;
if (intf_id >= INTEL_GSC_NUM_INTERFACES) {
drm_warn_once(&gt->i915->drm, "GSC irq: intf_id %d is out of range", intf_id);
return;
}
if (!HAS_HECI_GSC(gt->i915)) {
drm_warn_once(&gt->i915->drm, "GSC irq: not supported");
return;
}
if (gt->gsc.intf[intf_id].irq < 0) {
drm_err_ratelimited(&gt->i915->drm, "GSC irq: irq not set");
return;
}
ret = generic_handle_irq(gt->gsc.intf[intf_id].irq);
if (ret)
drm_err_ratelimited(&gt->i915->drm, "error handling GSC irq: %d\n", ret);
}
void intel_gsc_irq_handler(struct intel_gt *gt, u32 iir)
{
if (iir & GSC_IRQ_INTF(0))
gsc_irq_handler(gt, 0);
if (iir & GSC_IRQ_INTF(1))
gsc_irq_handler(gt, 1);
}
void intel_gsc_init(struct intel_gsc *gsc, struct drm_i915_private *i915)
{
unsigned int i;
if (!HAS_HECI_GSC(i915))
return;
for (i = 0; i < INTEL_GSC_NUM_INTERFACES; i++)
gsc_init_one(i915, &gsc->intf[i], i);
}
void intel_gsc_fini(struct intel_gsc *gsc)
{
struct intel_gt *gt = gsc_to_gt(gsc);
unsigned int i;
if (!HAS_HECI_GSC(gt->i915))
return;
for (i = 0; i < INTEL_GSC_NUM_INTERFACES; i++)
gsc_destroy_one(&gsc->intf[i]);
}


@ -0,0 +1,37 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright(c) 2019-2022, Intel Corporation. All rights reserved.
*/
#ifndef __INTEL_GSC_DEV_H__
#define __INTEL_GSC_DEV_H__
#include <linux/types.h>
struct drm_i915_private;
struct intel_gt;
struct mei_aux_device;
#define INTEL_GSC_NUM_INTERFACES 2
/*
* The HECI1 bit corresponds to bit15 and HECI2 to bit14.
* The reason for this is to allow growth for more interfaces in the future.
*/
#define GSC_IRQ_INTF(_x) BIT(15 - (_x))
/**
* struct intel_gsc - graphics security controller
* @intf: gsc interface
*/
struct intel_gsc {
struct intel_gsc_intf {
struct mei_aux_device *adev;
int irq;
unsigned int id;
} intf[INTEL_GSC_NUM_INTERFACES];
};
void intel_gsc_init(struct intel_gsc *gsc, struct drm_i915_private *dev_priv);
void intel_gsc_fini(struct intel_gsc *gsc);
void intel_gsc_irq_handler(struct intel_gt *gt, u32 iir);
#endif /* __INTEL_GSC_DEV_H__ */
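The GSC_IRQ_INTF() macro above maps interface 0 (HECI1) to bit 15 and interface 1 (HECI2) to bit 14, counting downwards so further interfaces can slot in later. A tiny self-contained check of that mapping (BIT() is redefined locally so the sketch builds on its own):

#include <stdio.h>

#define BIT(n)			(1u << (n))
#define GSC_IRQ_INTF(_x)	BIT(15 - (_x))

int main(void)
{
	printf("HECI1 (intf 0): mask 0x%04x\n", GSC_IRQ_INTF(0));	/* 0x8000 */
	printf("HECI2 (intf 1): mask 0x%04x\n", GSC_IRQ_INTF(1));	/* 0x4000 */
	return 0;
}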


@ -4,7 +4,6 @@
*/
#include <drm/drm_managed.h>
#include <drm/intel-gtt.h>
#include "gem/i915_gem_internal.h"
#include "gem/i915_gem_lmem.h"
@ -17,6 +16,7 @@
#include "intel_gt_buffer_pool.h"
#include "intel_gt_clock_utils.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_gmch.h"
#include "intel_gt_pm.h"
#include "intel_gt_regs.h"
#include "intel_gt_requests.h"
@ -26,10 +26,11 @@
#include "intel_rc6.h"
#include "intel_renderstate.h"
#include "intel_rps.h"
#include "intel_gt_sysfs.h"
#include "intel_uncore.h"
#include "shmem_utils.h"
void __intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
static void __intel_gt_init_early(struct intel_gt *gt)
{
spin_lock_init(&gt->irq_lock);
@ -51,17 +52,23 @@ void __intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
intel_rps_init_early(&gt->rps);
}
void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
/* Preliminary initialization of Tile 0 */
void intel_root_gt_init_early(struct drm_i915_private *i915)
{
struct intel_gt *gt = to_gt(i915);
gt->i915 = i915;
gt->uncore = &i915->uncore;
__intel_gt_init_early(gt);
}
int intel_gt_probe_lmem(struct intel_gt *gt)
static int intel_gt_probe_lmem(struct intel_gt *gt)
{
struct drm_i915_private *i915 = gt->i915;
unsigned int instance = gt->info.id;
int id = INTEL_REGION_LMEM_0 + instance;
struct intel_memory_region *mem;
int id;
int err;
mem = intel_gt_setup_lmem(gt);
@ -76,9 +83,8 @@ int intel_gt_probe_lmem(struct intel_gt *gt)
return err;
}
id = INTEL_REGION_LMEM;
mem->id = id;
mem->instance = instance;
intel_memory_region_set_name(mem, "local%u", mem->instance);
@ -96,6 +102,12 @@ int intel_gt_assign_ggtt(struct intel_gt *gt)
return gt->ggtt ? 0 : -ENOMEM;
}
static const char * const intel_steering_types[] = {
"L3BANK",
"MSLICE",
"LNCF",
};
static const struct intel_mmio_range icl_l3bank_steering_table[] = {
{ 0x00B100, 0x00B3FF },
{},
@ -439,14 +451,17 @@ void intel_gt_chipset_flush(struct intel_gt *gt)
{
wmb();
if (GRAPHICS_VER(gt->i915) < 6)
intel_gtt_chipset_flush();
intel_gt_gmch_gen5_chipset_flush(gt);
}
void intel_gt_driver_register(struct intel_gt *gt)
{
intel_gsc_init(&gt->gsc, gt->i915);
intel_rps_driver_register(&gt->rps);
intel_gt_debugfs_register(gt);
intel_gt_sysfs_register(gt);
}
static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)
@ -712,6 +727,11 @@ int intel_gt_init(struct intel_gt *gt)
if (err)
goto err_uc_init;
err = intel_gt_init_hwconfig(gt);
if (err)
drm_err(&gt->i915->drm, "Failed to retrieve hwconfig table: %pe\n",
ERR_PTR(err));
err = __engines_record_defaults(gt);
if (err)
goto err_gt;
@ -766,6 +786,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
intel_wakeref_t wakeref;
intel_rps_driver_unregister(&gt->rps);
intel_gsc_fini(&gt->gsc);
intel_pxp_fini(&gt->pxp);
@ -793,18 +814,24 @@ void intel_gt_driver_release(struct intel_gt *gt)
intel_gt_pm_fini(gt);
intel_gt_fini_scratch(gt);
intel_gt_fini_buffer_pool(gt);
intel_gt_fini_hwconfig(gt);
}
void intel_gt_driver_late_release(struct intel_gt *gt)
void intel_gt_driver_late_release_all(struct drm_i915_private *i915)
{
struct intel_gt *gt;
unsigned int id;
/* We need to wait for inflight RCU frees to release their grip */
rcu_barrier();
intel_uc_driver_late_release(&gt->uc);
intel_gt_fini_requests(gt);
intel_gt_fini_reset(gt);
intel_gt_fini_timelines(gt);
intel_engines_free(gt);
for_each_gt(gt, i915, id) {
intel_uc_driver_late_release(&gt->uc);
intel_gt_fini_requests(gt);
intel_gt_fini_reset(gt);
intel_gt_fini_timelines(gt);
intel_engines_free(gt);
}
}
/**
@ -913,6 +940,35 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
return intel_uncore_read_fw(gt->uncore, reg);
}
/**
* intel_gt_get_valid_steering_for_reg - get a valid steering for a register
* @gt: GT structure
* @reg: register for which the steering is required
* @sliceid: return variable for slice steering
* @subsliceid: return variable for subslice steering
*
* This function returns a slice/subslice pair that is guaranteed to work for
* read steering of the given register. Note that a value will be returned even
* if the register is not replicated and therefore does not actually require
* steering.
*/
void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
u8 *sliceid, u8 *subsliceid)
{
int type;
for (type = 0; type < NUM_STEERING_TYPES; type++) {
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
intel_gt_get_valid_steering(gt, type, sliceid,
subsliceid);
return;
}
}
*sliceid = gt->default_steering.groupid;
*subsliceid = gt->default_steering.instanceid;
}
u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
{
int type;
@ -932,6 +988,145 @@ u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
return intel_uncore_read(gt->uncore, reg);
}
static void report_steering_type(struct drm_printer *p,
struct intel_gt *gt,
enum intel_steering_type type,
bool dump_table)
{
const struct intel_mmio_range *entry;
u8 slice, subslice;
BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
if (!gt->steering_table[type]) {
drm_printf(p, "%s steering: uses default steering\n",
intel_steering_types[type]);
return;
}
intel_gt_get_valid_steering(gt, type, &slice, &subslice);
drm_printf(p, "%s steering: sliceid=0x%x, subsliceid=0x%x\n",
intel_steering_types[type], slice, subslice);
if (!dump_table)
return;
for (entry = gt->steering_table[type]; entry->end; entry++)
drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
}
void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
bool dump_table)
{
drm_printf(p, "Default steering: sliceid=0x%x, subsliceid=0x%x\n",
gt->default_steering.groupid,
gt->default_steering.instanceid);
if (HAS_MSLICES(gt->i915)) {
report_steering_type(p, gt, MSLICE, dump_table);
report_steering_type(p, gt, LNCF, dump_table);
}
}
static int intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
{
int ret;
if (!gt_is_root(gt)) {
struct intel_uncore_mmio_debug *mmio_debug;
struct intel_uncore *uncore;
uncore = kzalloc(sizeof(*uncore), GFP_KERNEL);
if (!uncore)
return -ENOMEM;
mmio_debug = kzalloc(sizeof(*mmio_debug), GFP_KERNEL);
if (!mmio_debug) {
kfree(uncore);
return -ENOMEM;
}
gt->uncore = uncore;
gt->uncore->debug = mmio_debug;
__intel_gt_init_early(gt);
}
intel_uncore_init_early(gt->uncore, gt);
ret = intel_uncore_setup_mmio(gt->uncore, phys_addr);
if (ret)
return ret;
gt->phys_addr = phys_addr;
return 0;
}
static void
intel_gt_tile_cleanup(struct intel_gt *gt)
{
intel_uncore_cleanup_mmio(gt->uncore);
if (!gt_is_root(gt)) {
kfree(gt->uncore->debug);
kfree(gt->uncore);
kfree(gt);
}
}
int intel_gt_probe_all(struct drm_i915_private *i915)
{
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct intel_gt *gt = &i915->gt0;
phys_addr_t phys_addr;
unsigned int mmio_bar;
int ret;
mmio_bar = GRAPHICS_VER(i915) == 2 ? 1 : 0;
phys_addr = pci_resource_start(pdev, mmio_bar);
/*
* We always have at least one primary GT on any device
* and it has already been initialized early during probe
* in i915_driver_probe()
*/
ret = intel_gt_tile_setup(gt, phys_addr);
if (ret)
return ret;
i915->gt[0] = gt;
/* TODO: add more tiles */
return 0;
}
int intel_gt_tiles_init(struct drm_i915_private *i915)
{
struct intel_gt *gt;
unsigned int id;
int ret;
for_each_gt(gt, i915, id) {
ret = intel_gt_probe_lmem(gt);
if (ret)
return ret;
}
return 0;
}
void intel_gt_release_all(struct drm_i915_private *i915)
{
struct intel_gt *gt;
unsigned int id;
for_each_gt(gt, i915, id) {
intel_gt_tile_cleanup(gt);
i915->gt[id] = NULL;
}
}
void intel_gt_info_print(const struct intel_gt_info *info,
struct drm_printer *p)
{


@ -13,12 +13,24 @@
struct drm_i915_private;
struct drm_printer;
struct insert_entries {
struct i915_address_space *vm;
struct i915_vma_resource *vma_res;
enum i915_cache_level level;
u32 flags;
};
#define GT_TRACE(gt, fmt, ...) do { \
const struct intel_gt *gt__ __maybe_unused = (gt); \
GEM_TRACE("%s " fmt, dev_name(gt__->i915->drm.dev), \
##__VA_ARGS__); \
} while (0)
static inline bool gt_is_root(struct intel_gt *gt)
{
return !gt->info.id;
}
static inline struct intel_gt *uc_to_gt(struct intel_uc *uc)
{
return container_of(uc, struct intel_gt, uc);
@ -34,10 +46,13 @@ static inline struct intel_gt *huc_to_gt(struct intel_huc *huc)
return container_of(huc, struct intel_gt, uc.huc);
}
void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915);
void __intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915);
static inline struct intel_gt *gsc_to_gt(struct intel_gsc *gsc)
{
return container_of(gsc, struct intel_gt, gsc);
}
void intel_root_gt_init_early(struct drm_i915_private *i915);
int intel_gt_assign_ggtt(struct intel_gt *gt);
int intel_gt_probe_lmem(struct intel_gt *gt);
int intel_gt_init_mmio(struct intel_gt *gt);
int __must_check intel_gt_init_hw(struct intel_gt *gt);
int intel_gt_init(struct intel_gt *gt);
@ -47,7 +62,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt);
void intel_gt_driver_remove(struct intel_gt *gt);
void intel_gt_driver_release(struct intel_gt *gt);
void intel_gt_driver_late_release(struct intel_gt *gt);
void intel_gt_driver_late_release_all(struct drm_i915_private *i915);
int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
@ -84,9 +99,25 @@ static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
return gt->steering_table[type];
}
void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
u8 *sliceid, u8 *subsliceid);
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
bool dump_table);
int intel_gt_probe_all(struct drm_i915_private *i915);
int intel_gt_tiles_init(struct drm_i915_private *i915);
void intel_gt_release_all(struct drm_i915_private *i915);
#define for_each_gt(gt__, i915__, id__) \
for ((id__) = 0; \
(id__) < I915_MAX_GT; \
(id__)++) \
for_each_if(((gt__) = (i915__)->gt[(id__)]))
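for_each_gt() above walks the fixed-size gt[] array and silently skips slots that were never populated, which is what lets single-tile parts and future multi-tile parts share the same loops. A stand-alone approximation of the pattern (for_each_if() is open-coded and the tile layout is invented):

#include <stdio.h>
#include <stddef.h>

#define MAX_GT 4	/* stand-in for I915_MAX_GT in this sketch */

struct fake_gt { unsigned int id; };

/* walk every slot, run the body only for the ones that are populated */
#define for_each_fake_gt(gt, slots, id) \
	for ((id) = 0; (id) < MAX_GT; (id)++) \
		if (((gt) = (slots)[(id)]) == NULL) { } else

int main(void)
{
	struct fake_gt root = { .id = 0 }, remote = { .id = 2 };
	struct fake_gt *slots[MAX_GT] = { &root, NULL, &remote, NULL };
	struct fake_gt *gt;
	unsigned int id;

	for_each_fake_gt(gt, slots, id)
		printf("visiting gt%u in slot %u\n", gt->id, id);
	return 0;
}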
void intel_gt_info_print(const struct intel_gt_info *info,
struct drm_printer *p);
@ -94,4 +125,6 @@ void intel_gt_watchdog_work(struct work_struct *work);
void intel_gt_invalidate_tlbs(struct intel_gt *gt);
struct resource intel_pci_resource(struct pci_dev *pdev, int bar);
#endif /* __INTEL_GT_H__ */


@ -161,6 +161,10 @@ void intel_gt_init_clock_frequency(struct intel_gt *gt)
if (gt->clock_frequency)
gt->clock_period_ns = intel_gt_clock_interval_to_ns(gt, 1);
/* Icelake appears to use another fixed frequency for CTX_TIMESTAMP */
if (GRAPHICS_VER(gt->i915) == 11)
gt->clock_period_ns = NSEC_PER_SEC / 13750000;
GT_TRACE(gt,
"Using clock frequency: %dkHz, period: %dns, wrap: %lldms\n",
gt->clock_frequency / 1000,
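The Icelake special case above fixes clock_period_ns from a 13.75 MHz CTX_TIMESTAMP clock instead of the command streamer timestamp frequency. For reference, the resulting period (integer division truncates, just as in the driver):

#include <stdio.h>

#define NSEC_PER_SEC 1000000000L

int main(void)
{
	long freq_hz = 13750000;	/* 13.75 MHz, from the hunk above */

	printf("CTX_TIMESTAMP period: %ld ns (exact: %.3f ns)\n",
	       NSEC_PER_SEC / freq_hz, (double)NSEC_PER_SEC / freq_hz);
	return 0;
}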


@ -6,6 +6,7 @@
#include <linux/debugfs.h>
#include "i915_drv.h"
#include "intel_gt.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_engines_debugfs.h"
#include "intel_gt_pm_debugfs.h"
@ -29,7 +30,7 @@ int intel_gt_debugfs_reset_show(struct intel_gt *gt, u64 *val)
}
}
int intel_gt_debugfs_reset_store(struct intel_gt *gt, u64 val)
void intel_gt_debugfs_reset_store(struct intel_gt *gt, u64 val)
{
/* Flush any previous reset before applying for a new one */
wait_event(gt->reset.queue,
@ -37,7 +38,6 @@ int intel_gt_debugfs_reset_store(struct intel_gt *gt, u64 val)
intel_gt_handle_error(gt, val, I915_ERROR_CAPTURE,
"Manually reset engine mask to %llx", val);
return 0;
}
/*
@ -51,16 +51,30 @@ static int __intel_gt_debugfs_reset_show(void *data, u64 *val)
static int __intel_gt_debugfs_reset_store(void *data, u64 val)
{
return intel_gt_debugfs_reset_store(data, val);
intel_gt_debugfs_reset_store(data, val);
return 0;
}
DEFINE_SIMPLE_ATTRIBUTE(reset_fops, __intel_gt_debugfs_reset_show,
__intel_gt_debugfs_reset_store, "%llu\n");
static int steering_show(struct seq_file *m, void *data)
{
struct drm_printer p = drm_seq_file_printer(m);
struct intel_gt *gt = m->private;
intel_gt_report_steering(&p, gt, true);
return 0;
}
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(steering);
static void gt_debugfs_register(struct intel_gt *gt, struct dentry *root)
{
static const struct intel_gt_debugfs_file files[] = {
{ "reset", &reset_fops, NULL },
{ "steering", &steering_fops },
};
intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), gt);


@ -48,6 +48,6 @@ void intel_gt_debugfs_register_files(struct dentry *root,
/* functions that need to be accessed by the upper level non-gt interfaces */
int intel_gt_debugfs_reset_show(struct intel_gt *gt, u64 *val);
int intel_gt_debugfs_reset_store(struct intel_gt *gt, u64 val);
void intel_gt_debugfs_reset_store(struct intel_gt *gt, u64 val);
#endif /* INTEL_GT_DEBUGFS_H */


@ -0,0 +1,654 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/intel-gtt.h>
#include <drm/i915_drm.h>
#include <linux/agp_backend.h>
#include <linux/stop_machine.h>
#include "i915_drv.h"
#include "intel_gt_gmch.h"
#include "intel_gt_regs.h"
#include "intel_gt.h"
#include "i915_utils.h"
#include "gen8_ppgtt.h"
struct insert_page {
struct i915_address_space *vm;
dma_addr_t addr;
u64 offset;
enum i915_cache_level level;
};
static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
{
writeq(pte, addr);
}
static void nop_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
}
static u64 snb_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
switch (level) {
case I915_CACHE_L3_LLC:
case I915_CACHE_LLC:
pte |= GEN6_PTE_CACHE_LLC;
break;
case I915_CACHE_NONE:
pte |= GEN6_PTE_UNCACHED;
break;
default:
MISSING_CASE(level);
}
return pte;
}
static u64 ivb_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
switch (level) {
case I915_CACHE_L3_LLC:
pte |= GEN7_PTE_CACHE_L3_LLC;
break;
case I915_CACHE_LLC:
pte |= GEN6_PTE_CACHE_LLC;
break;
case I915_CACHE_NONE:
pte |= GEN6_PTE_UNCACHED;
break;
default:
MISSING_CASE(level);
}
return pte;
}
static u64 byt_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
if (!(flags & PTE_READ_ONLY))
pte |= BYT_PTE_WRITEABLE;
if (level != I915_CACHE_NONE)
pte |= BYT_PTE_SNOOPED_BY_CPU_CACHES;
return pte;
}
static u64 hsw_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
if (level != I915_CACHE_NONE)
pte |= HSW_WB_LLC_AGE3;
return pte;
}
static u64 iris_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags)
{
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
switch (level) {
case I915_CACHE_NONE:
break;
case I915_CACHE_WT:
pte |= HSW_WT_ELLC_LLC_AGE3;
break;
default:
pte |= HSW_WB_ELLC_LLC_AGE3;
break;
}
return pte;
}
static void gen5_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level cache_level,
u32 unused)
{
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
intel_gtt_insert_page(addr, offset >> PAGE_SHIFT, flags);
}
static void gen6_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level level,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen6_pte_t __iomem *pte =
(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
iowrite32(vm->pte_encode(addr, level, flags), pte);
ggtt->invalidate(ggtt);
}
static void gen8_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level level,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen8_pte_t __iomem *pte =
(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
ggtt->invalidate(ggtt);
}
static void gen5_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 unused)
{
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
flags);
}
/*
* Binds an object into the global gtt with the specified cache level.
* The object will be accessible to the GPU via commands whose operands
* reference offsets within the global GTT as well as accessible by the GPU
* through the GMADR mapped BAR (i915->mm.gtt->gtt).
*/
static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level level,
u32 flags)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen6_pte_t __iomem *gte;
gen6_pte_t __iomem *end;
struct sgt_iter iter;
dma_addr_t addr;
gte = (gen6_pte_t __iomem *)ggtt->gsm;
gte += vma_res->start / I915_GTT_PAGE_SIZE;
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
iowrite32(vm->pte_encode(addr, level, flags), gte++);
GEM_BUG_ON(gte > end);
/* Fill the allocated but "unused" space beyond the end of the buffer */
while (gte < end)
iowrite32(vm->scratch[0]->encode, gte++);
/*
* We want to flush the TLBs only after we're certain all the PTE
* updates have finished.
*/
ggtt->invalidate(ggtt);
}
static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level level,
u32 flags)
{
const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen8_pte_t __iomem *gte;
gen8_pte_t __iomem *end;
struct sgt_iter iter;
dma_addr_t addr;
/*
* Note that we ignore PTE_READ_ONLY here. The caller must be careful
* not to allow the user to override access to a read only page.
*/
gte = (gen8_pte_t __iomem *)ggtt->gsm;
gte += vma_res->start / I915_GTT_PAGE_SIZE;
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
gen8_set_pte(gte++, pte_encode | addr);
GEM_BUG_ON(gte > end);
/* Fill the allocated but "unused" space beyond the end of the buffer */
while (gte < end)
gen8_set_pte(gte++, vm->scratch[0]->encode);
/*
* We want to flush the TLBs only after we're certain all the PTE
* updates have finished.
*/
ggtt->invalidate(ggtt);
}
static void bxt_vtd_ggtt_wa(struct i915_address_space *vm)
{
/*
* Make sure the internal GAM fifo has been cleared of all GTT
* writes before exiting stop_machine(). This guarantees that
* any aperture accesses waiting to start in another process
* cannot back up behind the GTT writes causing a hang.
* The register can be any arbitrary GAM register.
*/
intel_uncore_posting_read_fw(vm->gt->uncore, GFX_FLSH_CNTL_GEN6);
}
static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
{
struct insert_page *arg = _arg;
gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
bxt_vtd_ggtt_wa(arg->vm);
return 0;
}
static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level level,
u32 unused)
{
struct insert_page arg = { vm, addr, offset, level };
stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
}
static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
{
struct insert_entries *arg = _arg;
gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
bxt_vtd_ggtt_wa(arg->vm);
return 0;
}
static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level level,
u32 flags)
{
struct insert_entries arg = { vm, vma_res, level, flags };
stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
}
void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt)
{
intel_gtt_chipset_flush();
}
static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
{
intel_gtt_chipset_flush();
}
static void gen5_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
intel_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
}
static void gen6_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
gen6_pte_t scratch_pte, __iomem *gtt_base =
(gen6_pte_t __iomem *)ggtt->gsm + first_entry;
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
int i;
if (WARN(num_entries > max_entries,
"First entry = %d; Num entries = %d (max=%d)\n",
first_entry, num_entries, max_entries))
num_entries = max_entries;
scratch_pte = vm->scratch[0]->encode;
for (i = 0; i < num_entries; i++)
iowrite32(scratch_pte, &gtt_base[i]);
}
static void gen8_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
const gen8_pte_t scratch_pte = vm->scratch[0]->encode;
gen8_pte_t __iomem *gtt_base =
(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
int i;
if (WARN(num_entries > max_entries,
"First entry = %d; Num entries = %d (max=%d)\n",
first_entry, num_entries, max_entries))
num_entries = max_entries;
for (i = 0; i < num_entries; i++)
gen8_set_pte(&gtt_base[i], scratch_pte);
}
static void gen5_gmch_remove(struct i915_address_space *vm)
{
intel_gmch_remove();
}
static void gen6_gmch_remove(struct i915_address_space *vm)
{
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
iounmap(ggtt->gsm);
free_scratch(vm);
}
/*
* Certain Gen5 chipsets require idling the GPU before
* unmapping anything from the GTT when VT-d is enabled.
*/
static bool needs_idle_maps(struct drm_i915_private *i915)
{
/*
* Query intel_iommu to see if we need the workaround. Presumably that
* was loaded first.
*/
if (!i915_vtd_active(i915))
return false;
if (GRAPHICS_VER(i915) == 5 && IS_MOBILE(i915))
return true;
if (GRAPHICS_VER(i915) == 12)
return true; /* XXX DMAR fault reason 7 */
return false;
}
static unsigned int gen6_gttmmadr_size(struct drm_i915_private *i915)
{
/*
* GEN6: GTTMMADR size is 4MB and GTTADR starts at 2MB offset
* GEN8: GTTMMADR size is 16MB and GTTADR starts at 8MB offset
*/
GEM_BUG_ON(GRAPHICS_VER(i915) < 6);
return (GRAPHICS_VER(i915) < 8) ? SZ_4M : SZ_16M;
}
static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
{
snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
snb_gmch_ctl &= SNB_GMCH_GGMS_MASK;
return snb_gmch_ctl << 20;
}
static unsigned int gen8_get_total_gtt_size(u16 bdw_gmch_ctl)
{
bdw_gmch_ctl >>= BDW_GMCH_GGMS_SHIFT;
bdw_gmch_ctl &= BDW_GMCH_GGMS_MASK;
if (bdw_gmch_ctl)
bdw_gmch_ctl = 1 << bdw_gmch_ctl;
#ifdef CONFIG_X86_32
/* Limit 32b platforms to a 2GB GGTT: 4 << 20 / pte size * I915_GTT_PAGE_SIZE */
if (bdw_gmch_ctl > 4)
bdw_gmch_ctl = 4;
#endif
return bdw_gmch_ctl << 20;
}
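gen8_get_total_gtt_size() above turns the GGMS field of the GMCH control word into the size of the GTT itself (2^GGMS MiB of PTEs); the probe code then converts that into addressable GGTT space at one 8-byte PTE per 4 KiB page. A user-space sketch of the arithmetic for a few GGMS values:

#include <stdio.h>
#include <stdint.h>

#define GTT_PAGE_SIZE 4096ull	/* I915_GTT_PAGE_SIZE */

/* same shape as gen8_get_total_gtt_size(): GGMS encodes 2^n MiB of PTEs */
static uint64_t gen8_gtt_size_bytes(unsigned int ggms)
{
	return ggms ? (1ull << ggms) << 20 : 0;
}

int main(void)
{
	unsigned int ggms;

	for (ggms = 1; ggms <= 3; ggms++) {
		uint64_t gtt_bytes = gen8_gtt_size_bytes(ggms);
		uint64_t addressable = gtt_bytes / sizeof(uint64_t) * GTT_PAGE_SIZE;

		printf("GGMS=%u: %llu MiB of PTEs -> %llu GiB of GGTT\n",
		       ggms,
		       (unsigned long long)(gtt_bytes >> 20),
		       (unsigned long long)(addressable >> 30));
	}
	return 0;
}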
static unsigned int gen6_gttadr_offset(struct drm_i915_private *i915)
{
return gen6_gttmmadr_size(i915) / 2;
}
static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
phys_addr_t phys_addr;
u32 pte_flags;
int ret;
GEM_WARN_ON(pci_resource_len(pdev, 0) != gen6_gttmmadr_size(i915));
phys_addr = pci_resource_start(pdev, 0) + gen6_gttadr_offset(i915);
/*
* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
* will be dropped. For WC mappings in general we have 64 byte burst
* writes when the WC buffer is flushed, so we can't use it, but have to
* resort to an uncached mapping. The WC issue is easily caught by the
* readback check when writing GTT PTE entries.
*/
if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
ggtt->gsm = ioremap(phys_addr, size);
else
ggtt->gsm = ioremap_wc(phys_addr, size);
if (!ggtt->gsm) {
drm_err(&i915->drm, "Failed to map the ggtt page table\n");
return -ENOMEM;
}
kref_init(&ggtt->vm.resv_ref);
ret = setup_scratch_page(&ggtt->vm);
if (ret) {
drm_err(&i915->drm, "Scratch setup failed\n");
/* iounmap will also get called at remove, but meh */
iounmap(ggtt->gsm);
return ret;
}
pte_flags = 0;
if (i915_gem_object_is_lmem(ggtt->vm.scratch[0]))
pte_flags |= PTE_LM;
ggtt->vm.scratch[0]->encode =
ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
I915_CACHE_NONE, pte_flags);
return 0;
}
int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
phys_addr_t gmadr_base;
int ret;
ret = intel_gmch_probe(i915->bridge_dev, to_pci_dev(i915->drm.dev), NULL);
if (!ret) {
drm_err(&i915->drm, "failed to set up gmch\n");
return -EIO;
}
intel_gtt_get(&ggtt->vm.total, &gmadr_base, &ggtt->mappable_end);
ggtt->gmadr =
(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
if (needs_idle_maps(i915)) {
drm_notice(&i915->drm,
"Flushing DMA requests before IOMMU unmaps; performance may be degraded\n");
ggtt->do_idle_maps = true;
}
ggtt->vm.insert_page = gen5_ggtt_insert_page;
ggtt->vm.insert_entries = gen5_ggtt_insert_entries;
ggtt->vm.clear_range = gen5_ggtt_clear_range;
ggtt->vm.cleanup = gen5_gmch_remove;
ggtt->invalidate = gmch_ggtt_invalidate;
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
if (unlikely(ggtt->do_idle_maps))
drm_notice(&i915->drm,
"Applying Ironlake quirks for intel_iommu\n");
return 0;
}
int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
unsigned int size;
u16 snb_gmch_ctl;
ggtt->gmadr = intel_pci_resource(pdev, 2);
ggtt->mappable_end = resource_size(&ggtt->gmadr);
/*
* 64/512MB is the current min/max we actually know of, but this is
* just a coarse sanity check.
*/
if (ggtt->mappable_end < (64<<20) || ggtt->mappable_end > (512<<20)) {
drm_err(&i915->drm, "Unknown GMADR size (%pa)\n",
&ggtt->mappable_end);
return -ENXIO;
}
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
size = gen6_get_total_gtt_size(snb_gmch_ctl);
ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE;
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
ggtt->vm.clear_range = nop_clear_range;
if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
ggtt->vm.clear_range = gen6_ggtt_clear_range;
ggtt->vm.insert_page = gen6_ggtt_insert_page;
ggtt->vm.insert_entries = gen6_ggtt_insert_entries;
ggtt->vm.cleanup = gen6_gmch_remove;
ggtt->invalidate = gen6_ggtt_invalidate;
if (HAS_EDRAM(i915))
ggtt->vm.pte_encode = iris_pte_encode;
else if (IS_HASWELL(i915))
ggtt->vm.pte_encode = hsw_pte_encode;
else if (IS_VALLEYVIEW(i915))
ggtt->vm.pte_encode = byt_pte_encode;
else if (GRAPHICS_VER(i915) >= 7)
ggtt->vm.pte_encode = ivb_pte_encode;
else
ggtt->vm.pte_encode = snb_pte_encode;
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
return ggtt_probe_common(ggtt, size);
}
static unsigned int chv_get_total_gtt_size(u16 gmch_ctrl)
{
gmch_ctrl >>= SNB_GMCH_GGMS_SHIFT;
gmch_ctrl &= SNB_GMCH_GGMS_MASK;
if (gmch_ctrl)
return 1 << (20 + gmch_ctrl);
return 0;
}
int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
unsigned int size;
u16 snb_gmch_ctl;
/* TODO: We're not aware of mappable constraints on gen8 yet */
if (!HAS_LMEM(i915)) {
ggtt->gmadr = intel_pci_resource(pdev, 2);
ggtt->mappable_end = resource_size(&ggtt->gmadr);
}
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
if (IS_CHERRYVIEW(i915))
size = chv_get_total_gtt_size(snb_gmch_ctl);
else
size = gen8_get_total_gtt_size(snb_gmch_ctl);
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
ggtt->vm.lmem_pt_obj_flags = I915_BO_ALLOC_PM_EARLY;
ggtt->vm.total = (size / sizeof(gen8_pte_t)) * I915_GTT_PAGE_SIZE;
ggtt->vm.cleanup = gen6_gmch_remove;
ggtt->vm.insert_page = gen8_ggtt_insert_page;
ggtt->vm.clear_range = nop_clear_range;
if (intel_scanout_needs_vtd_wa(i915))
ggtt->vm.clear_range = gen8_ggtt_clear_range;
ggtt->vm.insert_entries = gen8_ggtt_insert_entries;
/*
* Serialize GTT updates with aperture access on BXT if VT-d is on,
* and always on CHV.
*/
if (intel_vm_no_concurrent_access_wa(i915)) {
ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL;
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
ggtt->vm.bind_async_flags =
I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
}
ggtt->invalidate = gen8_ggtt_invalidate;
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
setup_private_pat(ggtt->vm.gt->uncore);
return ggtt_probe_common(ggtt, size);
}
int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915)
{
if (GRAPHICS_VER(i915) < 6 && !intel_enable_gtt())
return -EIO;
return 0;
}


@ -0,0 +1,46 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __INTEL_GT_GMCH_H__
#define __INTEL_GT_GMCH_H__
#include "intel_gtt.h"
/* For x86 platforms */
#if IS_ENABLED(CONFIG_X86)
void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt);
int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt);
int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt);
int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt);
int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915);
/* Stubs for non-x86 platforms */
#else
static inline void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt)
{
}
static inline int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt)
{
/* No HW should be probed for this case yet, return fail */
return -ENODEV;
}
static inline int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt)
{
/* No HW should be probed for this case yet, return fail */
return -ENODEV;
}
static inline int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt)
{
/* No HW should be probed for this case yet, return fail */
return -ENODEV;
}
static inline int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915)
{
/* No HW should be enabled for this case yet, return fail */
return -ENODEV;
}
#endif
#endif /* __INTEL_GT_GMCH_H__ */


@ -68,6 +68,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
if (instance == OTHER_KCR_INSTANCE)
return intel_pxp_irq_handler(&gt->pxp, iir);
if (instance == OTHER_GSC_INSTANCE)
return intel_gsc_irq_handler(gt, iir);
WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
instance, iir);
}
@ -184,6 +187,8 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_VCS_VECS_INTR_ENABLE, 0);
if (CCS_MASK(gt))
intel_uncore_write(uncore, GEN12_CCS_RSVD_INTR_ENABLE, 0);
if (HAS_HECI_GSC(gt->i915))
intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_ENABLE, 0);
/* Restore masks irqs on RCS, BCS, VCS and VECS engines. */
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~0);
@ -201,6 +206,8 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN12_CCS0_CCS1_INTR_MASK, ~0);
if (HAS_ENGINE(gt, CCS2) || HAS_ENGINE(gt, CCS3))
intel_uncore_write(uncore, GEN12_CCS2_CCS3_INTR_MASK, ~0);
if (HAS_HECI_GSC(gt->i915))
intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
@ -215,6 +222,7 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
{
struct intel_uncore *uncore = gt->uncore;
u32 irqs = GT_RENDER_USER_INTERRUPT;
const u32 gsc_mask = GSC_IRQ_INTF(0) | GSC_IRQ_INTF(1);
u32 dmask;
u32 smask;
@ -233,6 +241,9 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_VCS_VECS_INTR_ENABLE, dmask);
if (CCS_MASK(gt))
intel_uncore_write(uncore, GEN12_CCS_RSVD_INTR_ENABLE, smask);
if (HAS_HECI_GSC(gt->i915))
intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_ENABLE,
gsc_mask);
/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~smask);
@ -250,6 +261,8 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
intel_uncore_write(uncore, GEN12_CCS0_CCS1_INTR_MASK, ~dmask);
if (HAS_ENGINE(gt, CCS2) || HAS_ENGINE(gt, CCS3))
intel_uncore_write(uncore, GEN12_CCS2_CCS3_INTR_MASK, ~dmask);
if (HAS_HECI_GSC(gt->i915))
intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_MASK, ~gsc_mask);
/*
* RPS interrupts will get enabled/disabled on demand when RPS itself


@ -129,7 +129,14 @@ static const struct intel_wakeref_ops wf_ops = {
void intel_gt_pm_init_early(struct intel_gt *gt)
{
intel_wakeref_init(&gt->wakeref, gt->uncore->rpm, &wf_ops);
/*
* We access the runtime_pm structure via gt->i915 here rather than
* gt->uncore as we do elsewhere in the file because gt->uncore is not
* yet initialized for all tiles at this point in the driver startup.
* runtime_pm is per-device rather than per-tile, so this is still the
* correct structure.
*/
intel_wakeref_init(&gt->wakeref, &gt->i915->runtime_pm, &wf_ops);
seqcount_mutex_init(&gt->stats.lock, &gt->wakeref.mutex);
}
@ -175,15 +182,16 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
if (intel_gt_is_wedged(gt))
intel_gt_unset_wedged(gt);
for_each_engine(engine, gt, id)
/* For GuC mode, ensure submission is disabled before stopping ring */
intel_uc_reset_prepare(&gt->uc);
for_each_engine(engine, gt, id) {
if (engine->reset.prepare)
engine->reset.prepare(engine);
intel_uc_reset_prepare(&gt->uc);
for_each_engine(engine, gt, id)
if (engine->sanitize)
engine->sanitize(engine);
}
if (reset_engines(gt) || force) {
for_each_engine(engine, gt, id)


@ -24,38 +24,38 @@
#include "intel_uncore.h"
#include "vlv_sideband.h"
int intel_gt_pm_debugfs_forcewake_user_open(struct intel_gt *gt)
void intel_gt_pm_debugfs_forcewake_user_open(struct intel_gt *gt)
{
atomic_inc(&gt->user_wakeref);
intel_gt_pm_get(gt);
if (GRAPHICS_VER(gt->i915) >= 6)
intel_uncore_forcewake_user_get(gt->uncore);
return 0;
}
int intel_gt_pm_debugfs_forcewake_user_release(struct intel_gt *gt)
void intel_gt_pm_debugfs_forcewake_user_release(struct intel_gt *gt)
{
if (GRAPHICS_VER(gt->i915) >= 6)
intel_uncore_forcewake_user_put(gt->uncore);
intel_gt_pm_put(gt);
atomic_dec(&gt->user_wakeref);
return 0;
}
static int forcewake_user_open(struct inode *inode, struct file *file)
{
struct intel_gt *gt = inode->i_private;
return intel_gt_pm_debugfs_forcewake_user_open(gt);
intel_gt_pm_debugfs_forcewake_user_open(gt);
return 0;
}
static int forcewake_user_release(struct inode *inode, struct file *file)
{
struct intel_gt *gt = inode->i_private;
return intel_gt_pm_debugfs_forcewake_user_release(gt);
intel_gt_pm_debugfs_forcewake_user_release(gt);
return 0;
}
static const struct file_operations forcewake_user_fops = {
@ -342,17 +342,16 @@ void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *p)
} else if (GRAPHICS_VER(i915) >= 6) {
u32 rp_state_limits;
u32 gt_perf_status;
u32 rp_state_cap;
struct intel_rps_freq_caps caps;
u32 rpmodectl, rpinclimit, rpdeclimit;
u32 rpstat, cagf, reqf;
u32 rpcurupei, rpcurup, rpprevup;
u32 rpcurdownei, rpcurdown, rpprevdown;
u32 rpupei, rpupt, rpdownei, rpdownt;
u32 pm_ier, pm_imr, pm_isr, pm_iir, pm_mask;
int max_freq;
rp_state_limits = intel_uncore_read(uncore, GEN6_RP_STATE_LIMITS);
rp_state_cap = intel_rps_read_state_cap(rps);
gen6_rps_get_freq_caps(rps, &caps);
if (IS_GEN9_LP(i915))
gt_perf_status = intel_uncore_read(uncore, BXT_GT_PERF_STATUS);
else
@ -474,25 +473,12 @@ void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *p)
drm_printf(p, "RP DOWN THRESHOLD: %d (%lldns)\n",
rpdownt, intel_gt_pm_interval_to_ns(gt, rpdownt));
max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 0 :
rp_state_cap >> 16) & 0xff;
max_freq *= (IS_GEN9_BC(i915) ||
GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
drm_printf(p, "Lowest (RPN) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
max_freq = (rp_state_cap & 0xff00) >> 8;
max_freq *= (IS_GEN9_BC(i915) ||
GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
intel_gpu_freq(rps, caps.min_freq));
drm_printf(p, "Nominal (RP1) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 16 :
rp_state_cap >> 0) & 0xff;
max_freq *= (IS_GEN9_BC(i915) ||
GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
intel_gpu_freq(rps, caps.rp1_freq));
drm_printf(p, "Max non-overclocked (RP0) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
intel_gpu_freq(rps, caps.rp0_freq));
drm_printf(p, "Max overclocked frequency: %dMHz\n",
intel_gpu_freq(rps, rps->max_freq));


@ -14,7 +14,7 @@ void intel_gt_pm_debugfs_register(struct intel_gt *gt, struct dentry *root);
void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *m);
/* functions that need to be accessed by the upper level non-gt interfaces */
int intel_gt_pm_debugfs_forcewake_user_open(struct intel_gt *gt);
int intel_gt_pm_debugfs_forcewake_user_release(struct intel_gt *gt);
void intel_gt_pm_debugfs_forcewake_user_open(struct intel_gt *gt);
void intel_gt_pm_debugfs_forcewake_user_release(struct intel_gt *gt);
#endif /* INTEL_GT_PM_DEBUGFS_H */


@ -46,6 +46,7 @@
#define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
#define GEN8_MCR_SUBSLICE(subslice) (((subslice) & 3) << 24)
#define GEN8_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(3)
#define GEN11_MCR_MULTICAST REG_BIT(31)
#define GEN11_MCR_SLICE(slice) (((slice) & 0xf) << 27)
#define GEN11_MCR_SLICE_MASK GEN11_MCR_SLICE(0xf)
#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
@ -840,6 +841,24 @@
#define CTC_SHIFT_PARAMETER_SHIFT 1
#define CTC_SHIFT_PARAMETER_MASK (0x3 << CTC_SHIFT_PARAMETER_SHIFT)
/* GPM MSG_IDLE */
#define MSG_IDLE_CS _MMIO(0x8000)
#define MSG_IDLE_VCS0 _MMIO(0x8004)
#define MSG_IDLE_VCS1 _MMIO(0x8008)
#define MSG_IDLE_BCS _MMIO(0x800C)
#define MSG_IDLE_VECS0 _MMIO(0x8010)
#define MSG_IDLE_VCS2 _MMIO(0x80C0)
#define MSG_IDLE_VCS3 _MMIO(0x80C4)
#define MSG_IDLE_VCS4 _MMIO(0x80C8)
#define MSG_IDLE_VCS5 _MMIO(0x80CC)
#define MSG_IDLE_VCS6 _MMIO(0x80D0)
#define MSG_IDLE_VCS7 _MMIO(0x80D4)
#define MSG_IDLE_VECS1 _MMIO(0x80D8)
#define MSG_IDLE_VECS2 _MMIO(0x80DC)
#define MSG_IDLE_VECS3 _MMIO(0x80E0)
#define MSG_IDLE_FW_MASK REG_GENMASK(13, 9)
#define MSG_IDLE_FW_SHIFT 9
#define FORCEWAKE_MEDIA_GEN9 _MMIO(0xa270)
#define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
@ -1087,6 +1106,7 @@
#define EU_PERF_CNTL3 _MMIO(0xe758)
#define LSC_CHICKEN_BIT_0 _MMIO(0xe7c8)
#define DISABLE_D8_D16_COASLESCE REG_BIT(30)
#define FORCE_1_SUB_MESSAGE_PER_FRAGMENT REG_BIT(15)
#define LSC_CHICKEN_BIT_0_UDW _MMIO(0xe7c8 + 4)
#define DIS_CHAIN_2XSIMD8 REG_BIT(55 - 32)
@ -1482,6 +1502,7 @@
#define OTHER_GUC_INSTANCE 0
#define OTHER_GTPM_INSTANCE 1
#define OTHER_KCR_INSTANCE 4
#define OTHER_GSC_INSTANCE 6
#define GEN11_IIR_REG_SELECTOR(x) _MMIO(0x190070 + ((x) * 4))


@ -0,0 +1,122 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_device.h>
#include <linux/device.h>
#include <linux/kobject.h>
#include <linux/printk.h>
#include <linux/sysfs.h>
#include "i915_drv.h"
#include "i915_sysfs.h"
#include "intel_gt.h"
#include "intel_gt_sysfs.h"
#include "intel_gt_sysfs_pm.h"
#include "intel_gt_types.h"
#include "intel_rc6.h"
bool is_object_gt(struct kobject *kobj)
{
return !strncmp(kobj->name, "gt", 2);
}
static struct intel_gt *kobj_to_gt(struct kobject *kobj)
{
return container_of(kobj, struct kobj_gt, base)->gt;
}
struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name)
{
struct kobject *kobj = &dev->kobj;
/*
* We are interested in knowing from where the interface
* has been called, whether it is called from gt/ or from
* the parent directory.
* The type of the private data depends on where the
* interface sits: if it is called from gt/ then the private
* data is of the "struct intel_gt *" type, otherwise it is
* of the "struct drm_i915_private *" type.
*/
if (!is_object_gt(kobj)) {
struct drm_i915_private *i915 = kdev_minor_to_i915(dev);
return to_gt(i915);
}
return kobj_to_gt(kobj);
}
static struct kobject *gt_get_parent_obj(struct intel_gt *gt)
{
return &gt->i915->drm.primary->kdev->kobj;
}
static ssize_t id_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
return sysfs_emit(buf, "%u\n", gt->info.id);
}
static DEVICE_ATTR_RO(id);
static struct attribute *id_attrs[] = {
&dev_attr_id.attr,
NULL,
};
ATTRIBUTE_GROUPS(id);
static void kobj_gt_release(struct kobject *kobj)
{
kfree(kobj);
}
static struct kobj_type kobj_gt_type = {
.release = kobj_gt_release,
.sysfs_ops = &kobj_sysfs_ops,
.default_groups = id_groups,
};
void intel_gt_sysfs_register(struct intel_gt *gt)
{
struct kobj_gt *kg;
/*
* We need to stay compatible with the previous ABI: the
* files were originally generated under the parent
* directory.
*
* We generate the files only for gt 0
* to avoid duplicates.
*/
if (gt_is_root(gt))
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
kg = kzalloc(sizeof(*kg), GFP_KERNEL);
if (!kg)
goto exit_fail;
kobject_init(&kg->base, &kobj_gt_type);
kg->gt = gt;
/* xfer ownership to sysfs tree */
if (kobject_add(&kg->base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_kobj_put;
intel_gt_sysfs_pm_init(gt, &kg->base);
return;
exit_kobj_put:
kobject_put(&kg->base);
exit_fail:
drm_warn(&gt->i915->drm,
"failed to initialize gt%d sysfs root\n", gt->info.id);
}


@ -0,0 +1,34 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __SYSFS_GT_H__
#define __SYSFS_GT_H__
#include <linux/ctype.h>
#include <linux/kobject.h>
#include "i915_gem.h" /* GEM_BUG_ON() */
struct intel_gt;
struct kobj_gt {
struct kobject base;
struct intel_gt *gt;
};
bool is_object_gt(struct kobject *kobj);
struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
struct kobject *
intel_gt_create_kobj(struct intel_gt *gt,
struct kobject *dir,
const char *name);
void intel_gt_sysfs_register(struct intel_gt *gt);
struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
#endif /* __SYSFS_GT_H__ */


@ -0,0 +1,601 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_device.h>
#include <linux/sysfs.h>
#include <linux/printk.h>
#include "i915_drv.h"
#include "i915_reg.h"
#include "i915_sysfs.h"
#include "intel_gt.h"
#include "intel_gt_regs.h"
#include "intel_gt_sysfs.h"
#include "intel_gt_sysfs_pm.h"
#include "intel_rc6.h"
#include "intel_rps.h"
#ifdef CONFIG_PM
enum intel_gt_sysfs_op {
INTEL_GT_SYSFS_MIN = 0,
INTEL_GT_SYSFS_MAX,
};
static int
sysfs_gt_attribute_w_func(struct device *dev, struct device_attribute *attr,
int (func)(struct intel_gt *gt, u32 val), u32 val)
{
struct intel_gt *gt;
int ret;
if (!is_object_gt(&dev->kobj)) {
int i;
struct drm_i915_private *i915 = kdev_minor_to_i915(dev);
for_each_gt(gt, i915, i) {
ret = func(gt, val);
if (ret)
break;
}
} else {
gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
ret = func(gt, val);
}
return ret;
}
static u32
sysfs_gt_attribute_r_func(struct device *dev, struct device_attribute *attr,
u32 (func)(struct intel_gt *gt),
enum intel_gt_sysfs_op op)
{
struct intel_gt *gt;
u32 ret;
ret = (op == INTEL_GT_SYSFS_MAX) ? 0 : (u32) -1;
if (!is_object_gt(&dev->kobj)) {
int i;
struct drm_i915_private *i915 = kdev_minor_to_i915(dev);
for_each_gt(gt, i915, i) {
u32 val = func(gt);
switch (op) {
case INTEL_GT_SYSFS_MIN:
if (val < ret)
ret = val;
break;
case INTEL_GT_SYSFS_MAX:
if (val > ret)
ret = val;
break;
}
}
} else {
gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
ret = func(gt);
}
return ret;
}
/* RC6 interfaces will show the minimum RC6 residency value */
#define sysfs_gt_attribute_r_min_func(d, a, f) \
sysfs_gt_attribute_r_func(d, a, f, INTEL_GT_SYSFS_MIN)
/* Frequency interfaces will show the maximum frequency value */
#define sysfs_gt_attribute_r_max_func(d, a, f) \
sysfs_gt_attribute_r_func(d, a, f, INTEL_GT_SYSFS_MAX)
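
The two wrappers above reduce a per-gt value to the single number reported by the legacy files under the parent device: RC6 residencies take the minimum across tiles, frequencies the maximum. A standalone sketch of that reduction, with a made-up value array standing in for the for_each_gt() walk (not driver code):

#include <stdint.h>
#include <stdio.h>

enum gt_op { GT_OP_MIN, GT_OP_MAX };

static uint32_t gt_reduce(const uint32_t *vals, int n, enum gt_op op)
{
	uint32_t ret = (op == GT_OP_MAX) ? 0 : UINT32_MAX;
	int i;

	for (i = 0; i < n; i++) {
		if (op == GT_OP_MIN && vals[i] < ret)
			ret = vals[i];
		if (op == GT_OP_MAX && vals[i] > ret)
			ret = vals[i];
	}
	return ret;
}

int main(void)
{
	uint32_t rc6_ms[] = { 4200, 3900 };	/* hypothetical per-tile RC6 residency */
	uint32_t act_mhz[] = { 350, 1100 };	/* hypothetical per-tile actual frequency */

	printf("rc6_residency_ms: %u\n", gt_reduce(rc6_ms, 2, GT_OP_MIN));	/* 3900 */
	printf("act_freq_mhz: %u\n", gt_reduce(act_mhz, 2, GT_OP_MAX));		/* 1100 */
	return 0;
}
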
static u32 get_residency(struct intel_gt *gt, i915_reg_t reg)
{
intel_wakeref_t wakeref;
u64 res = 0;
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
res = intel_rc6_residency_us(&gt->rc6, reg);
return DIV_ROUND_CLOSEST_ULL(res, 1000);
}
static ssize_t rc6_enable_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
u8 mask = 0;
if (HAS_RC6(gt->i915))
mask |= BIT(0);
if (HAS_RC6p(gt->i915))
mask |= BIT(1);
if (HAS_RC6pp(gt->i915))
mask |= BIT(2);
return sysfs_emit(buff, "%x\n", mask);
}
static u32 __rc6_residency_ms_show(struct intel_gt *gt)
{
return get_residency(gt, GEN6_GT_GFX_RC6);
}
static ssize_t rc6_residency_ms_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
u32 rc6_residency = sysfs_gt_attribute_r_min_func(dev, attr,
__rc6_residency_ms_show);
return sysfs_emit(buff, "%u\n", rc6_residency);
}
static u32 __rc6p_residency_ms_show(struct intel_gt *gt)
{
return get_residency(gt, GEN6_GT_GFX_RC6p);
}
static ssize_t rc6p_residency_ms_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
u32 rc6p_residency = sysfs_gt_attribute_r_min_func(dev, attr,
__rc6p_residency_ms_show);
return sysfs_emit(buff, "%u\n", rc6p_residency);
}
static u32 __rc6pp_residency_ms_show(struct intel_gt *gt)
{
return get_residency(gt, GEN6_GT_GFX_RC6pp);
}
static ssize_t rc6pp_residency_ms_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
u32 rc6pp_residency = sysfs_gt_attribute_r_min_func(dev, attr,
__rc6pp_residency_ms_show);
return sysfs_emit(buff, "%u\n", rc6pp_residency);
}
static u32 __media_rc6_residency_ms_show(struct intel_gt *gt)
{
return get_residency(gt, VLV_GT_MEDIA_RC6);
}
static ssize_t media_rc6_residency_ms_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
u32 rc6_residency = sysfs_gt_attribute_r_min_func(dev, attr,
__media_rc6_residency_ms_show);
return sysfs_emit(buff, "%u\n", rc6_residency);
}
static DEVICE_ATTR_RO(rc6_enable);
static DEVICE_ATTR_RO(rc6_residency_ms);
static DEVICE_ATTR_RO(rc6p_residency_ms);
static DEVICE_ATTR_RO(rc6pp_residency_ms);
static DEVICE_ATTR_RO(media_rc6_residency_ms);
static struct attribute *rc6_attrs[] = {
&dev_attr_rc6_enable.attr,
&dev_attr_rc6_residency_ms.attr,
NULL
};
static struct attribute *rc6p_attrs[] = {
&dev_attr_rc6p_residency_ms.attr,
&dev_attr_rc6pp_residency_ms.attr,
NULL
};
static struct attribute *media_rc6_attrs[] = {
&dev_attr_media_rc6_residency_ms.attr,
NULL
};
static const struct attribute_group rc6_attr_group[] = {
{ .attrs = rc6_attrs, },
{ .name = power_group_name, .attrs = rc6_attrs, },
};
static const struct attribute_group rc6p_attr_group[] = {
{ .attrs = rc6p_attrs, },
{ .name = power_group_name, .attrs = rc6p_attrs, },
};
static const struct attribute_group media_rc6_attr_group[] = {
{ .attrs = media_rc6_attrs, },
{ .name = power_group_name, .attrs = media_rc6_attrs, },
};
static int __intel_gt_sysfs_create_group(struct kobject *kobj,
const struct attribute_group *grp)
{
return is_object_gt(kobj) ?
sysfs_create_group(kobj, &grp[0]) :
sysfs_merge_group(kobj, &grp[1]);
}
static void intel_sysfs_rc6_init(struct intel_gt *gt, struct kobject *kobj)
{
int ret;
if (!HAS_RC6(gt->i915))
return;
ret = __intel_gt_sysfs_create_group(kobj, rc6_attr_group);
if (ret)
drm_warn(&gt->i915->drm,
"failed to create gt%u RC6 sysfs files (%pe)\n",
gt->info.id, ERR_PTR(ret));
/*
* cannot use the is_visible() attribute because
* the upper object inherits from the parent group.
*/
if (HAS_RC6p(gt->i915)) {
ret = __intel_gt_sysfs_create_group(kobj, rc6p_attr_group);
if (ret)
drm_warn(&gt->i915->drm,
"failed to create gt%u RC6p sysfs files (%pe)\n",
gt->info.id, ERR_PTR(ret));
}
if (IS_VALLEYVIEW(gt->i915) || IS_CHERRYVIEW(gt->i915)) {
ret = __intel_gt_sysfs_create_group(kobj, media_rc6_attr_group);
if (ret)
drm_warn(&gt->i915->drm,
"failed to create media %u RC6 sysfs files (%pe)\n",
gt->info.id, ERR_PTR(ret));
}
}
#else
static void intel_sysfs_rc6_init(struct intel_gt *gt, struct kobject *kobj)
{
}
#endif /* CONFIG_PM */
static u32 __act_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_read_actual_frequency(&gt->rps);
}
static ssize_t act_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 actual_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__act_freq_mhz_show);
return sysfs_emit(buff, "%u\n", actual_freq);
}
static u32 __cur_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_requested_frequency(&gt->rps);
}
static ssize_t cur_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 cur_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__cur_freq_mhz_show);
return sysfs_emit(buff, "%u\n", cur_freq);
}
static u32 __boost_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_boost_frequency(&gt->rps);
}
static ssize_t boost_freq_mhz_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
u32 boost_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__boost_freq_mhz_show);
return sysfs_emit(buff, "%u\n", boost_freq);
}
static int __boost_freq_mhz_store(struct intel_gt *gt, u32 val)
{
return intel_rps_set_boost_frequency(&gt->rps, val);
}
static ssize_t boost_freq_mhz_store(struct device *dev,
struct device_attribute *attr,
const char *buff, size_t count)
{
ssize_t ret;
u32 val;
ret = kstrtou32(buff, 0, &val);
if (ret)
return ret;
return sysfs_gt_attribute_w_func(dev, attr,
__boost_freq_mhz_store, val) ?: count;
}
static u32 __rp0_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_rp0_frequency(&gt->rps);
}
static ssize_t RP0_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 rp0_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__rp0_freq_mhz_show);
return sysfs_emit(buff, "%u\n", rp0_freq);
}
static u32 __rp1_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_rp1_frequency(&gt->rps);
}
static ssize_t RP1_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 rp1_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__rp1_freq_mhz_show);
return sysfs_emit(buff, "%u\n", rp1_freq);
}
static u32 __rpn_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_rpn_frequency(&gt->rps);
}
static ssize_t RPn_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 rpn_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__rpn_freq_mhz_show);
return sysfs_emit(buff, "%u\n", rpn_freq);
}
static u32 __max_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_max_frequency(&gt->rps);
}
static ssize_t max_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 max_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__max_freq_mhz_show);
return sysfs_emit(buff, "%u\n", max_freq);
}
static int __set_max_freq(struct intel_gt *gt, u32 val)
{
return intel_rps_set_max_frequency(&gt->rps, val);
}
static ssize_t max_freq_mhz_store(struct device *dev,
struct device_attribute *attr,
const char *buff, size_t count)
{
int ret;
u32 val;
ret = kstrtou32(buff, 0, &val);
if (ret)
return ret;
ret = sysfs_gt_attribute_w_func(dev, attr, __set_max_freq, val);
return ret ?: count;
}
static u32 __min_freq_mhz_show(struct intel_gt *gt)
{
return intel_rps_get_min_frequency(&gt->rps);
}
static ssize_t min_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 min_freq = sysfs_gt_attribute_r_min_func(dev, attr,
__min_freq_mhz_show);
return sysfs_emit(buff, "%u\n", min_freq);
}
static int __set_min_freq(struct intel_gt *gt, u32 val)
{
return intel_rps_set_min_frequency(&gt->rps, val);
}
static ssize_t min_freq_mhz_store(struct device *dev,
struct device_attribute *attr,
const char *buff, size_t count)
{
int ret;
u32 val;
ret = kstrtou32(buff, 0, &val);
if (ret)
return ret;
ret = sysfs_gt_attribute_w_func(dev, attr, __set_min_freq, val);
return ret ?: count;
}
static u32 __vlv_rpe_freq_mhz_show(struct intel_gt *gt)
{
struct intel_rps *rps = &gt->rps;
return intel_gpu_freq(rps, rps->efficient_freq);
}
static ssize_t vlv_rpe_freq_mhz_show(struct device *dev,
struct device_attribute *attr, char *buff)
{
u32 rpe_freq = sysfs_gt_attribute_r_max_func(dev, attr,
__vlv_rpe_freq_mhz_show);
return sysfs_emit(buff, "%u\n", rpe_freq);
}
#define INTEL_GT_RPS_SYSFS_ATTR(_name, _mode, _show, _store) \
struct device_attribute dev_attr_gt_##_name = __ATTR(gt_##_name, _mode, _show, _store); \
struct device_attribute dev_attr_rps_##_name = __ATTR(rps_##_name, _mode, _show, _store)
#define INTEL_GT_RPS_SYSFS_ATTR_RO(_name) \
INTEL_GT_RPS_SYSFS_ATTR(_name, 0444, _name##_show, NULL)
#define INTEL_GT_RPS_SYSFS_ATTR_RW(_name) \
INTEL_GT_RPS_SYSFS_ATTR(_name, 0644, _name##_show, _name##_store)
static INTEL_GT_RPS_SYSFS_ATTR_RO(act_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RO(cur_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RW(boost_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RO(RP0_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RO(RP1_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RO(RPn_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RW(max_freq_mhz);
static INTEL_GT_RPS_SYSFS_ATTR_RW(min_freq_mhz);
static DEVICE_ATTR_RO(vlv_rpe_freq_mhz);
#define GEN6_ATTR(s) { \
&dev_attr_##s##_act_freq_mhz.attr, \
&dev_attr_##s##_cur_freq_mhz.attr, \
&dev_attr_##s##_boost_freq_mhz.attr, \
&dev_attr_##s##_max_freq_mhz.attr, \
&dev_attr_##s##_min_freq_mhz.attr, \
&dev_attr_##s##_RP0_freq_mhz.attr, \
&dev_attr_##s##_RP1_freq_mhz.attr, \
&dev_attr_##s##_RPn_freq_mhz.attr, \
NULL, \
}
#define GEN6_RPS_ATTR GEN6_ATTR(rps)
#define GEN6_GT_ATTR GEN6_ATTR(gt)
static const struct attribute * const gen6_rps_attrs[] = GEN6_RPS_ATTR;
static const struct attribute * const gen6_gt_attrs[] = GEN6_GT_ATTR;
static ssize_t punit_req_freq_mhz_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
u32 preq = intel_rps_read_punit_req_frequency(&gt->rps);
return sysfs_emit(buff, "%u\n", preq);
}
struct intel_gt_bool_throttle_attr {
struct attribute attr;
ssize_t (*show)(struct device *dev, struct device_attribute *attr,
char *buf);
i915_reg_t reg32;
u32 mask;
};
static ssize_t throttle_reason_bool_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
struct intel_gt_bool_throttle_attr *t_attr =
(struct intel_gt_bool_throttle_attr *) attr;
bool val = rps_read_mask_mmio(&gt->rps, t_attr->reg32, t_attr->mask);
return sysfs_emit(buff, "%u\n", val);
}
#define INTEL_GT_RPS_BOOL_ATTR_RO(sysfs_func__, mask__) \
struct intel_gt_bool_throttle_attr attr_##sysfs_func__ = { \
.attr = { .name = __stringify(sysfs_func__), .mode = 0444 }, \
.show = throttle_reason_bool_show, \
.reg32 = GT0_PERF_LIMIT_REASONS, \
.mask = mask__, \
}
static DEVICE_ATTR_RO(punit_req_freq_mhz);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_status, GT0_PERF_LIMIT_REASONS_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_pl1, POWER_LIMIT_1_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_pl2, POWER_LIMIT_2_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_pl4, POWER_LIMIT_4_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_thermal, THERMAL_LIMIT_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_prochot, PROCHOT_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_ratl, RATL_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_vr_thermalert, VR_THERMALERT_MASK);
static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_vr_tdc, VR_TDC_MASK);
static const struct attribute *freq_attrs[] = {
&dev_attr_punit_req_freq_mhz.attr,
&attr_throttle_reason_status.attr,
&attr_throttle_reason_pl1.attr,
&attr_throttle_reason_pl2.attr,
&attr_throttle_reason_pl4.attr,
&attr_throttle_reason_thermal.attr,
&attr_throttle_reason_prochot.attr,
&attr_throttle_reason_ratl.attr,
&attr_throttle_reason_vr_thermalert.attr,
&attr_throttle_reason_vr_tdc.attr,
NULL
};
static int intel_sysfs_rps_init(struct intel_gt *gt, struct kobject *kobj,
const struct attribute * const *attrs)
{
int ret;
if (GRAPHICS_VER(gt->i915) < 6)
return 0;
ret = sysfs_create_files(kobj, attrs);
if (ret)
return ret;
if (IS_VALLEYVIEW(gt->i915) || IS_CHERRYVIEW(gt->i915))
ret = sysfs_create_file(kobj, &dev_attr_vlv_rpe_freq_mhz.attr);
return ret;
}
void intel_gt_sysfs_pm_init(struct intel_gt *gt, struct kobject *kobj)
{
int ret;
intel_sysfs_rc6_init(gt, kobj);
ret = is_object_gt(kobj) ?
intel_sysfs_rps_init(gt, kobj, gen6_rps_attrs) :
intel_sysfs_rps_init(gt, kobj, gen6_gt_attrs);
if (ret)
drm_warn(&gt->i915->drm,
"failed to create gt%u RPS sysfs files (%pe)",
gt->info.id, ERR_PTR(ret));
/* end of the legacy interfaces */
if (!is_object_gt(kobj))
return;
ret = sysfs_create_files(kobj, freq_attrs);
if (ret)
drm_warn(&gt->i915->drm,
"failed to create gt%u throttle sysfs files (%pe)",
gt->info.id, ERR_PTR(ret));
}


@ -0,0 +1,15 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __SYSFS_GT_PM_H__
#define __SYSFS_GT_PM_H__
#include <linux/kobject.h>
#include "intel_gt_types.h"
void intel_gt_sysfs_pm_init(struct intel_gt *gt, struct kobject *kobj);
#endif /* __SYSFS_GT_PM_H__ */


@ -16,10 +16,12 @@
#include <linux/workqueue.h>
#include "uc/intel_uc.h"
#include "intel_gsc.h"
#include "i915_vma.h"
#include "intel_engine_types.h"
#include "intel_gt_buffer_pool_types.h"
#include "intel_hwconfig.h"
#include "intel_llc_types.h"
#include "intel_reset_types.h"
#include "intel_rc6_types.h"
@ -72,6 +74,7 @@ struct intel_gt {
struct i915_ggtt *ggtt;
struct intel_uc uc;
struct intel_gsc gsc;
struct mutex tlb_invalidate_lock;
@ -182,7 +185,19 @@ struct intel_gt {
const struct intel_mmio_range *steering_table[NUM_STEERING_TYPES];
struct {
u8 groupid;
u8 instanceid;
} default_steering;
/*
* Base of per-tile GTTMMADR where we can derive the MMIO and the GGTT.
*/
phys_addr_t phys_addr;
struct intel_gt_info {
unsigned int id;
intel_engine_mask_t engine_mask;
u32 l3bank_mask;
@ -199,6 +214,9 @@ struct intel_gt {
struct sseu_dev_info sseu;
unsigned long mslice_mask;
/** @hwconfig: hardware configuration data */
struct intel_hwconfig hwconfig;
} info;
struct {


@ -109,32 +109,52 @@ int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object
return 0;
}
void __i915_vm_close(struct i915_address_space *vm)
static void clear_vm_list(struct list_head *list)
{
struct i915_vma *vma, *vn;
if (!atomic_dec_and_mutex_lock(&vm->open, &vm->mutex))
return;
list_for_each_entry_safe(vma, vn, &vm->bound_list, vm_link) {
list_for_each_entry_safe(vma, vn, list, vm_link) {
struct drm_i915_gem_object *obj = vma->obj;
if (!kref_get_unless_zero(&obj->base.refcount)) {
if (!i915_gem_object_get_rcu(obj)) {
/*
* Unbind the dying vma to ensure the bound_list
* Object is dying, but has not yet cleared its
* vma list.
* Unbind the dying vma to ensure our list
* is completely drained. We leave the destruction to
* the object destructor.
* the object destructor to avoid the vma
* disappearing under it.
*/
atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
WARN_ON(__i915_vma_unbind(vma));
continue;
/* Remove from the unbound list */
list_del_init(&vma->vm_link);
/*
* Delay the vm and vm mutex freeing until the
* object is done with destruction.
*/
i915_vm_resv_get(vma->vm);
vma->vm_ddestroy = true;
} else {
i915_vma_destroy_locked(vma);
i915_gem_object_put(obj);
}
/* Keep the obj (and hence the vma) alive as _we_ destroy it */
i915_vma_destroy_locked(vma);
i915_gem_object_put(obj);
}
}
static void __i915_vm_close(struct i915_address_space *vm)
{
mutex_lock(&vm->mutex);
clear_vm_list(&vm->bound_list);
clear_vm_list(&vm->unbound_list);
/* Check for must-fix unanticipated side-effects */
GEM_BUG_ON(!list_empty(&vm->bound_list));
GEM_BUG_ON(!list_empty(&vm->unbound_list));
mutex_unlock(&vm->mutex);
}
@ -156,7 +176,6 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
void i915_address_space_fini(struct i915_address_space *vm)
{
drm_mm_takedown(&vm->mm);
mutex_destroy(&vm->mutex);
}
/**
@ -164,7 +183,8 @@ void i915_address_space_fini(struct i915_address_space *vm)
* @kref: Pointer to the &i915_address_space.resv_ref member.
*
* This function is called when the last lock sharer no longer shares the
* &i915_address_space._resv lock.
* &i915_address_space._resv lock, and also if we raced when
* destroying a vma by the vma destruction
*/
void i915_vm_resv_release(struct kref *kref)
{
@ -172,6 +192,8 @@ void i915_vm_resv_release(struct kref *kref)
container_of(kref, typeof(*vm), resv_ref);
dma_resv_fini(&vm->_resv);
mutex_destroy(&vm->mutex);
kfree(vm);
}
@ -180,6 +202,8 @@ static void __i915_vm_release(struct work_struct *work)
struct i915_address_space *vm =
container_of(work, struct i915_address_space, release_work);
__i915_vm_close(vm);
/* Synchronize async unbinds. */
i915_vma_resource_bind_dep_sync_all(vm);
@ -213,7 +237,6 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
vm->pending_unbind = RB_ROOT_CACHED;
INIT_WORK(&vm->release_work, __i915_vm_release);
atomic_set(&vm->open, 1);
/*
* The vm->mutex must be reclaim safe (for use in the shrinker).
@ -258,6 +281,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
vm->mm.head_node.color = I915_COLOR_UNEVICTABLE;
INIT_LIST_HEAD(&vm->bound_list);
INIT_LIST_HEAD(&vm->unbound_list);
}
void *__px_vaddr(struct drm_i915_gem_object *p)
@ -286,7 +310,7 @@ fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
void *vaddr = __px_vaddr(p);
memset64(vaddr, val, count);
clflush_cache_range(vaddr, PAGE_SIZE);
drm_clflush_virt_range(vaddr, PAGE_SIZE);
}
static void poison_scratch_page(struct drm_i915_gem_object *scratch)


@ -240,15 +240,6 @@ struct i915_address_space {
unsigned int bind_async_flags;
/*
* Each active user context has its own address space (in full-ppgtt).
* Since the vm may be shared between multiple contexts, we count how
* many contexts keep us "open". Once open hits zero, we are closed
* and do not allow any new attachments, and proceed to shutdown our
* vma and page directories.
*/
atomic_t open;
struct mutex mutex; /* protects vma and our lists */
struct kref resv_ref; /* kref to keep the reservation lock alive. */
@ -263,6 +254,11 @@ struct i915_address_space {
*/
struct list_head bound_list;
/**
* List of vmas not yet bound or evicted.
*/
struct list_head unbound_list;
/* Global GTT */
bool is_ggtt:1;
@ -272,6 +268,9 @@ struct i915_address_space {
/* Some systems support read-only mappings for GGTT and/or PPGTT */
bool has_read_only:1;
/* Skip pte rewrite on unbind for suspend. Protected by @mutex */
bool skip_pte_rewrite:1;
u8 top;
u8 pd_shift;
u8 scratch_order;
@ -448,6 +447,17 @@ i915_vm_get(struct i915_address_space *vm)
return vm;
}
static inline struct i915_address_space *
i915_vm_tryget(struct i915_address_space *vm)
{
return kref_get_unless_zero(&vm->ref) ? vm : NULL;
}
static inline void assert_vm_alive(struct i915_address_space *vm)
{
GEM_BUG_ON(!kref_read(&vm->ref));
}
/**
* i915_vm_resv_get - Obtain a reference on the vm's reservation lock
* @vm: The vm whose reservation lock we want to share.
@ -478,34 +488,6 @@ static inline void i915_vm_resv_put(struct i915_address_space *vm)
kref_put(&vm->resv_ref, i915_vm_resv_release);
}
static inline struct i915_address_space *
i915_vm_open(struct i915_address_space *vm)
{
GEM_BUG_ON(!atomic_read(&vm->open));
atomic_inc(&vm->open);
return i915_vm_get(vm);
}
static inline bool
i915_vm_tryopen(struct i915_address_space *vm)
{
if (atomic_add_unless(&vm->open, 1, 0))
return i915_vm_get(vm);
return false;
}
void __i915_vm_close(struct i915_address_space *vm);
static inline void
i915_vm_close(struct i915_address_space *vm)
{
GEM_BUG_ON(!atomic_read(&vm->open));
__i915_vm_close(vm);
i915_vm_put(vm);
}
void i915_address_space_init(struct i915_address_space *vm, int subclass);
void i915_address_space_fini(struct i915_address_space *vm);
@ -567,6 +549,14 @@ i915_page_dir_dma_addr(const struct i915_ppgtt *ppgtt, const unsigned int n)
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
unsigned long lmem_pt_obj_flags);
void intel_ggtt_bind_vma(struct i915_address_space *vm,
struct i915_vm_pt_stash *stash,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 flags);
void intel_ggtt_unbind_vma(struct i915_address_space *vm,
struct i915_vma_resource *vma_res);
int i915_ggtt_probe_hw(struct drm_i915_private *i915);
int i915_ggtt_init_hw(struct drm_i915_private *i915);
int i915_ggtt_enable_hw(struct drm_i915_private *i915);
@ -637,6 +627,7 @@ release_pd_entry(struct i915_page_directory * const pd,
struct i915_page_table * const pt,
const struct drm_i915_gem_object * const scratch);
void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
void gen8_ggtt_invalidate(struct i915_ggtt *ggtt);
void ppgtt_bind_vma(struct i915_address_space *vm,
struct i915_vm_pt_stash *stash,


@ -0,0 +1,21 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _INTEL_HWCONFIG_H_
#define _INTEL_HWCONFIG_H_
#include <linux/types.h>
struct intel_gt;
struct intel_hwconfig {
u32 size;
void *ptr;
};
int intel_gt_init_hwconfig(struct intel_gt *gt);
void intel_gt_fini_hwconfig(struct intel_gt *gt);
#endif /* _INTEL_HWCONFIG_H_ */


@ -778,7 +778,7 @@ static void init_common_regs(u32 * const regs,
CTX_CTRL_RS_CTX_ENABLE);
regs[CTX_CONTEXT_CONTROL] = ctl;
regs[CTX_TIMESTAMP] = ce->runtime.last;
regs[CTX_TIMESTAMP] = ce->stats.runtime.last;
}
static void init_wa_bb_regs(u32 * const regs,
@ -1208,6 +1208,10 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
IS_DG2_G11(ce->engine->i915))
cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
/* hsdes: 1809175790 */
if (!HAS_FLAT_CCS(ce->engine->i915))
cs = gen12_emit_aux_table_inv(cs, GEN12_GFX_CCS_AUX_NV);
return cs;
}
@ -1225,6 +1229,14 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE,
0);
/* hsdes: 1809175790 */
if (!HAS_FLAT_CCS(ce->engine->i915)) {
if (ce->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(cs, GEN12_VD0_AUX_NV);
else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
cs = gen12_emit_aux_table_inv(cs, GEN12_VE0_AUX_NV);
}
return cs;
}
@ -1722,11 +1734,12 @@ err:
}
}
static void st_update_runtime_underflow(struct intel_context *ce, s32 dt)
static void st_runtime_underflow(struct intel_context_stats *stats, s32 dt)
{
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
ce->runtime.num_underflow++;
ce->runtime.max_underflow = max_t(u32, ce->runtime.max_underflow, -dt);
stats->runtime.num_underflow++;
stats->runtime.max_underflow =
max_t(u32, stats->runtime.max_underflow, -dt);
#endif
}
@ -1743,25 +1756,25 @@ static u32 lrc_get_runtime(const struct intel_context *ce)
void lrc_update_runtime(struct intel_context *ce)
{
struct intel_context_stats *stats = &ce->stats;
u32 old;
s32 dt;
if (intel_context_is_barrier(ce))
old = stats->runtime.last;
stats->runtime.last = lrc_get_runtime(ce);
dt = stats->runtime.last - old;
if (!dt)
return;
old = ce->runtime.last;
ce->runtime.last = lrc_get_runtime(ce);
dt = ce->runtime.last - old;
if (unlikely(dt < 0)) {
CE_TRACE(ce, "runtime underflow: last=%u, new=%u, delta=%d\n",
old, ce->runtime.last, dt);
st_update_runtime_underflow(ce, dt);
old, stats->runtime.last, dt);
st_runtime_underflow(stats, dt);
return;
}
ewma_runtime_add(&ce->runtime.avg, dt);
ce->runtime.total += dt;
ewma_runtime_add(&stats->runtime.avg, dt);
stats->runtime.total += dt;
}
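
The delta computation above deliberately relies on unsigned 32-bit wraparound of the context timestamp: subtracting in u32 and interpreting the result as s32 still yields the correct small positive delta across a wrap, while a genuinely negative value is the underflow case reported above. A tiny standalone illustration with made-up counter values (not driver code):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t old = 0xfffffff0u;		/* timestamp just before the counter wraps */
	uint32_t last = 0x00000010u;		/* timestamp read after the wrap */
	int32_t dt = (int32_t)(last - old);	/* modular subtraction, then signed view */

	printf("dt = %d\n", dt);		/* prints 32: the delta survives the wrap */
	return 0;
}
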
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)


@ -11,9 +11,10 @@
#include <linux/bitfield.h>
#include <linux/types.h>
#include "intel_context.h"
struct drm_i915_gem_object;
struct i915_gem_ww_ctx;
struct intel_context;
struct intel_engine_cs;
struct intel_ring;
struct kref;
@ -120,4 +121,28 @@ static inline u32 lrc_desc_priority(int prio)
return GEN12_CTX_PRIORITY_NORMAL;
}
static inline void lrc_runtime_start(struct intel_context *ce)
{
struct intel_context_stats *stats = &ce->stats;
if (intel_context_is_barrier(ce))
return;
if (stats->active)
return;
WRITE_ONCE(stats->active, intel_context_clock());
}
static inline void lrc_runtime_stop(struct intel_context *ce)
{
struct intel_context_stats *stats = &ce->stats;
if (!stats->active)
return;
lrc_update_runtime(ce);
WRITE_ONCE(stats->active, 0);
}
#endif /* __INTEL_LRC_H__ */


@ -17,6 +17,8 @@ struct insert_pte_data {
#define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
#define GET_CCS_BYTES(i915, size) (HAS_FLAT_CCS(i915) ? \
DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0)
static bool engine_supports_migration(struct intel_engine_cs *engine)
{
if (!engine)
@ -467,6 +469,123 @@ static bool wa_1209644611_applies(int ver, u32 size)
return height % 4 == 3 && height <= 8;
}
/**
* DOC: Flat-CCS - Memory compression for Local memory
*
* On Xe-HP and later devices, we use dedicated compression control state (CCS)
* stored in local memory for each surface, to support the 3D and media
* compression formats.
*
* The memory required for the CCS of the entire local memory is 1/256 of the
* local memory size. So before the kernel boot, the required memory is reserved
* for the CCS data and a secure register will be programmed with the CCS base
* address.
*
* Flat CCS data needs to be cleared when an lmem object is allocated.
* CCS data can be copied in and out of the CCS region through
* XY_CTRL_SURF_COPY_BLT. The CPU can't access the CCS data directly.
*
* When we exhaust lmem, if the object's placements support smem, then we can
* directly decompress the compressed lmem object into smem and start using it
* from smem itself.
*
* But when we need to swap out the compressed lmem object into a smem region
* even though the object's placement doesn't support smem, we copy the lmem
* content as it is into the smem region, along with the CCS data (using
* XY_CTRL_SURF_COPY_BLT). When the object is referenced again, the lmem
* content is swapped back in, along with restoration of the CCS data (using
* XY_CTRL_SURF_COPY_BLT) at the corresponding location.
*/
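
Putting the numbers from this description together (a back-of-the-envelope sketch, not driver code; it hard-codes the 1/256 CCS ratio, the 256-byte CCS block and the 1024-blocks-per-command limit quoted in the comments here and in emit_copy_ccs() below):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t obj = 64ull << 20;			/* a 64 MiB lmem object */
	uint64_t ccs = obj / 256;			/* 1/256 of the surface -> 256 KiB */
	uint64_t blocks = (ccs + 255) / 256;		/* 256-byte CCS blocks -> 1024 */
	uint64_t cmds = (blocks + 1023) / 1024;		/* XY_CTRL_SURF_COPY_BLT cmds -> 1 */

	printf("%llu KiB of CCS, %llu blocks, %llu command(s)\n",
	       (unsigned long long)(ccs >> 10),
	       (unsigned long long)blocks,
	       (unsigned long long)cmds);
	return 0;
}
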
static inline u32 *i915_flush_dw(u32 *cmd, u32 flags)
{
*cmd++ = MI_FLUSH_DW | flags;
*cmd++ = 0;
*cmd++ = 0;
return cmd;
}
static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size)
{
u32 num_cmds, num_blks, total_size;
if (!GET_CCS_BYTES(i915, size))
return 0;
/*
* XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte
* blocks. One XY_CTRL_SURF_COPY_BLT command can
* transfer up to 1024 blocks.
*/
num_blks = DIV_ROUND_UP(GET_CCS_BYTES(i915, size),
NUM_CCS_BYTES_PER_BLOCK);
num_cmds = DIV_ROUND_UP(num_blks, NUM_CCS_BLKS_PER_XFER);
total_size = XY_CTRL_SURF_INSTR_SIZE * num_cmds;
/*
* Adding a flush before and after XY_CTRL_SURF_COPY_BLT
*/
total_size += 2 * MI_FLUSH_DW_SIZE;
return total_size;
}
static int emit_copy_ccs(struct i915_request *rq,
u32 dst_offset, u8 dst_access,
u32 src_offset, u8 src_access, int size)
{
struct drm_i915_private *i915 = rq->engine->i915;
int mocs = rq->engine->gt->mocs.uc_index << 1;
u32 num_ccs_blks, ccs_ring_size;
u32 *cs;
ccs_ring_size = calc_ctrl_surf_instr_size(i915, size);
WARN_ON(!ccs_ring_size);
cs = intel_ring_begin(rq, round_up(ccs_ring_size, 2));
if (IS_ERR(cs))
return PTR_ERR(cs);
num_ccs_blks = DIV_ROUND_UP(GET_CCS_BYTES(i915, size),
NUM_CCS_BYTES_PER_BLOCK);
GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER);
cs = i915_flush_dw(cs, MI_FLUSH_DW_LLC | MI_FLUSH_DW_CCS);
/*
* The XY_CTRL_SURF_COPY_BLT instruction is used to copy the CCS
* data in and out of the CCS region.
*
* We can copy at most 1024 blocks of 256 bytes using one
* XY_CTRL_SURF_COPY_BLT instruction.
*
* In case we need to copy more than 1024 blocks, we need to add
* another instruction to the same batch buffer.
*
* 1024 blocks of 256 bytes of CCS represent a total 256KB of CCS.
*
* 256 KB of CCS represents 256 * 256 KB = 64 MB of LMEM.
*/
*cs++ = XY_CTRL_SURF_COPY_BLT |
src_access << SRC_ACCESS_TYPE_SHIFT |
dst_access << DST_ACCESS_TYPE_SHIFT |
((num_ccs_blks - 1) & CCS_SIZE_MASK) << CCS_SIZE_SHIFT;
*cs++ = src_offset;
*cs++ = rq->engine->instance |
FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, mocs);
*cs++ = dst_offset;
*cs++ = rq->engine->instance |
FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, mocs);
cs = i915_flush_dw(cs, MI_FLUSH_DW_LLC | MI_FLUSH_DW_CCS);
if (ccs_ring_size & 1)
*cs++ = MI_NOOP;
intel_ring_advance(rq, cs);
return 0;
}
static int emit_copy(struct i915_request *rq,
u32 dst_offset, u32 src_offset, int size)
{
@ -514,6 +633,65 @@ static int emit_copy(struct i915_request *rq,
return 0;
}
static int scatter_list_length(struct scatterlist *sg)
{
int len = 0;
while (sg && sg_dma_len(sg)) {
len += sg_dma_len(sg);
sg = sg_next(sg);
}
return len;
}
static void
calculate_chunk_sz(struct drm_i915_private *i915, bool src_is_lmem,
int *src_sz, int *ccs_sz, u32 bytes_to_cpy,
u32 ccs_bytes_to_cpy)
{
if (ccs_bytes_to_cpy) {
/*
* We can only copy the CCS data corresponding to
* the CHUNK_SZ of lmem, which is
* GET_CCS_BYTES(i915, CHUNK_SZ)
*/
*ccs_sz = min_t(int, ccs_bytes_to_cpy, GET_CCS_BYTES(i915, CHUNK_SZ));
if (!src_is_lmem)
/*
* When CHUNK_SZ is passed, all the pages up to CHUNK_SZ
* will be taken for the blt. On a Flat-CCS capable
* platform the smem object will have more pages than
* required for main memory, hence limit it to the
* required size for main memory.
*/
*src_sz = min_t(int, bytes_to_cpy, CHUNK_SZ);
} else { /* ccs handling is not required */
*src_sz = CHUNK_SZ;
}
}
static void get_ccs_sg_sgt(struct sgt_dma *it, u32 bytes_to_cpy)
{
u32 len;
do {
GEM_BUG_ON(!it->sg || !sg_dma_len(it->sg));
len = it->max - it->dma;
if (len > bytes_to_cpy) {
it->dma += bytes_to_cpy;
break;
}
bytes_to_cpy -= len;
it->sg = __sg_next(it->sg);
it->dma = sg_dma_address(it->sg);
it->max = it->dma + sg_dma_len(it->sg);
} while (bytes_to_cpy);
}
int
intel_context_migrate_copy(struct intel_context *ce,
const struct i915_deps *deps,
@ -525,17 +703,67 @@ intel_context_migrate_copy(struct intel_context *ce,
bool dst_is_lmem,
struct i915_request **out)
{
struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst);
struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs;
struct drm_i915_private *i915 = ce->engine->i915;
u32 ccs_bytes_to_cpy = 0, bytes_to_cpy;
enum i915_cache_level ccs_cache_level;
int src_sz, dst_sz, ccs_sz;
u32 src_offset, dst_offset;
u8 src_access, dst_access;
struct i915_request *rq;
bool ccs_is_src;
int err;
GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
GEM_BUG_ON(IS_DGFX(ce->engine->i915) && (!src_is_lmem && !dst_is_lmem));
*out = NULL;
GEM_BUG_ON(ce->ring->size < SZ_64K);
src_sz = scatter_list_length(src);
bytes_to_cpy = src_sz;
if (HAS_FLAT_CCS(i915) && src_is_lmem ^ dst_is_lmem) {
src_access = !src_is_lmem && dst_is_lmem;
dst_access = !src_access;
dst_sz = scatter_list_length(dst);
if (src_is_lmem) {
it_ccs = it_dst;
ccs_cache_level = dst_cache_level;
ccs_is_src = false;
} else if (dst_is_lmem) {
bytes_to_cpy = dst_sz;
it_ccs = it_src;
ccs_cache_level = src_cache_level;
ccs_is_src = true;
}
/*
* When an eviction of the CCS data is needed, smem will
* have the extra pages for the CCS data.
*
* TO-DO: We want to move the size mismatch check to a WARN_ON,
* but we still have some smem->lmem requests of the same size.
* Need to fix it.
*/
ccs_bytes_to_cpy = src_sz != dst_sz ? GET_CCS_BYTES(i915, bytes_to_cpy) : 0;
if (ccs_bytes_to_cpy)
get_ccs_sg_sgt(&it_ccs, bytes_to_cpy);
}
src_offset = 0;
dst_offset = CHUNK_SZ;
if (HAS_64K_PAGES(ce->engine->i915)) {
src_offset = 0;
dst_offset = 0;
if (src_is_lmem)
src_offset = CHUNK_SZ;
if (dst_is_lmem)
dst_offset = 2 * CHUNK_SZ;
}
do {
u32 src_offset, dst_offset;
int len;
rq = i915_request_create(ce);
@ -563,22 +791,16 @@ intel_context_migrate_copy(struct intel_context *ce,
if (err)
goto out_rq;
src_offset = 0;
dst_offset = CHUNK_SZ;
if (HAS_64K_PAGES(ce->engine->i915)) {
GEM_BUG_ON(!src_is_lmem && !dst_is_lmem);
src_offset = 0;
dst_offset = 0;
if (src_is_lmem)
src_offset = CHUNK_SZ;
if (dst_is_lmem)
dst_offset = 2 * CHUNK_SZ;
}
calculate_chunk_sz(i915, src_is_lmem, &src_sz, &ccs_sz,
bytes_to_cpy, ccs_bytes_to_cpy);
len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem,
src_offset, CHUNK_SZ);
if (len <= 0) {
src_offset, src_sz);
if (!len) {
err = -EINVAL;
goto out_rq;
}
if (len < 0) {
err = len;
goto out_rq;
}
@ -596,7 +818,46 @@ intel_context_migrate_copy(struct intel_context *ce,
if (err)
goto out_rq;
err = emit_copy(rq, dst_offset, src_offset, len);
err = emit_copy(rq, dst_offset, src_offset, len);
if (err)
goto out_rq;
bytes_to_cpy -= len;
if (ccs_bytes_to_cpy) {
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
if (err)
goto out_rq;
err = emit_pte(rq, &it_ccs, ccs_cache_level, false,
ccs_is_src ? src_offset : dst_offset,
ccs_sz);
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
if (err)
goto out_rq;
/*
* Using max of src_sz and dst_sz, as we need to
* pass the lmem size corresponding to the ccs
* blocks we need to handle.
*/
ccs_sz = max_t(int, ccs_is_src ? ccs_sz : src_sz,
ccs_is_src ? dst_sz : ccs_sz);
err = emit_copy_ccs(rq, dst_offset, dst_access,
src_offset, src_access, ccs_sz);
if (err)
goto out_rq;
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
if (err)
goto out_rq;
/* Converting back to ccs bytes */
ccs_sz = GET_CCS_BYTES(rq->engine->i915, ccs_sz);
ccs_bytes_to_cpy -= ccs_sz;
}
/* Arbitration is re-enabled between requests. */
out_rq:
@ -604,9 +865,26 @@ out_rq:
i915_request_put(*out);
*out = i915_request_get(rq);
i915_request_add(rq);
if (err || !it_src.sg || !sg_dma_len(it_src.sg))
if (err)
break;
if (!bytes_to_cpy && !ccs_bytes_to_cpy) {
if (src_is_lmem)
WARN_ON(it_src.sg && sg_dma_len(it_src.sg));
else
WARN_ON(it_dst.sg && sg_dma_len(it_dst.sg));
break;
}
if (WARN_ON(!it_src.sg || !sg_dma_len(it_src.sg) ||
!it_dst.sg || !sg_dma_len(it_dst.sg) ||
(ccs_bytes_to_cpy && (!it_ccs.sg ||
!sg_dma_len(it_ccs.sg))))) {
err = -EINVAL;
break;
}
cond_resched();
} while (1);
@ -614,35 +892,65 @@ out_ce:
return err;
}
static int emit_clear(struct i915_request *rq, u64 offset, int size, u32 value)
static int emit_clear(struct i915_request *rq, u32 offset, int size,
u32 value, bool is_lmem)
{
const int ver = GRAPHICS_VER(rq->engine->i915);
struct drm_i915_private *i915 = rq->engine->i915;
int mocs = rq->engine->gt->mocs.uc_index << 1;
const int ver = GRAPHICS_VER(i915);
int ring_sz;
u32 *cs;
GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
offset += (u64)rq->engine->instance << 32;
if (HAS_FLAT_CCS(i915) && ver >= 12)
ring_sz = XY_FAST_COLOR_BLT_DW;
else if (ver >= 8)
ring_sz = 8;
else
ring_sz = 6;
cs = intel_ring_begin(rq, ver >= 8 ? 8 : 6);
cs = intel_ring_begin(rq, ring_sz);
if (IS_ERR(cs))
return PTR_ERR(cs);
if (ver >= 8) {
if (HAS_FLAT_CCS(i915) && ver >= 12) {
*cs++ = XY_FAST_COLOR_BLT_CMD | XY_FAST_COLOR_BLT_DEPTH_32 |
(XY_FAST_COLOR_BLT_DW - 2);
*cs++ = FIELD_PREP(XY_FAST_COLOR_BLT_MOCS_MASK, mocs) |
(PAGE_SIZE - 1);
*cs++ = 0;
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
*cs++ = offset;
*cs++ = rq->engine->instance;
*cs++ = !is_lmem << XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT;
/* BG7 */
*cs++ = value;
*cs++ = 0;
*cs++ = 0;
*cs++ = 0;
/* BG11 */
*cs++ = 0;
*cs++ = 0;
/* BG13 */
*cs++ = 0;
*cs++ = 0;
*cs++ = 0;
} else if (ver >= 8) {
*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
*cs++ = 0;
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
*cs++ = lower_32_bits(offset);
*cs++ = upper_32_bits(offset);
*cs++ = offset;
*cs++ = rq->engine->instance;
*cs++ = value;
*cs++ = MI_NOOP;
} else {
GEM_BUG_ON(upper_32_bits(offset));
*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
*cs++ = 0;
*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
*cs++ = lower_32_bits(offset);
*cs++ = offset;
*cs++ = value;
}
@ -659,8 +967,10 @@ intel_context_migrate_clear(struct intel_context *ce,
u32 value,
struct i915_request **out)
{
struct drm_i915_private *i915 = ce->engine->i915;
struct sgt_dma it = sg_sgt(sg);
struct i915_request *rq;
u32 offset;
int err;
GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
@ -668,8 +978,11 @@ intel_context_migrate_clear(struct intel_context *ce,
GEM_BUG_ON(ce->ring->size < SZ_64K);
offset = 0;
if (HAS_64K_PAGES(i915) && is_lmem)
offset = CHUNK_SZ;
do {
u32 offset;
int len;
rq = i915_request_create(ce);
@ -697,10 +1010,6 @@ intel_context_migrate_clear(struct intel_context *ce,
if (err)
goto out_rq;
offset = 0;
if (HAS_64K_PAGES(ce->engine->i915) && is_lmem)
offset = CHUNK_SZ;
len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ);
if (len <= 0) {
err = len;
@ -711,7 +1020,22 @@ intel_context_migrate_clear(struct intel_context *ce,
if (err)
goto out_rq;
err = emit_clear(rq, offset, len, value);
err = emit_clear(rq, offset, len, value, is_lmem);
if (err)
goto out_rq;
if (HAS_FLAT_CCS(i915) && is_lmem && !value) {
/*
* copy the content of memory into corresponding
* ccs surface
*/
err = emit_copy_ccs(rq, offset, INDIRECT_ACCESS, offset,
DIRECT_ACCESS, len);
if (err)
goto out_rq;
}
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
/* Arbitration is re-enabled between requests. */
out_rq:


@ -91,7 +91,7 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
u64 * const vaddr = __px_vaddr(pdma);
vaddr[idx] = encoded_entry;
clflush_cache_range(&vaddr[idx], sizeof(u64));
drm_clflush_virt_range(&vaddr[idx], sizeof(u64));
}
void


@ -6,6 +6,7 @@
#include <linux/pm_runtime.h>
#include <linux/string_helpers.h>
#include "gem/i915_gem_region.h"
#include "i915_drv.h"
#include "i915_reg.h"
#include "i915_vgpu.h"
@ -325,9 +326,10 @@ static int vlv_rc6_init(struct intel_rc6 *rc6)
resource_size_t pcbr_offset;
pcbr_offset = (pcbr & ~4095) - i915->dsm.start;
pctx = i915_gem_object_create_stolen_for_preallocated(i915,
pcbr_offset,
pctx_size);
pctx = i915_gem_object_create_region_at(i915->mm.stolen_region,
pcbr_offset,
pctx_size,
0);
if (IS_ERR(pctx))
return PTR_ERR(pctx);


@ -93,6 +93,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
struct intel_memory_region *mem;
resource_size_t min_page_size;
resource_size_t io_start;
resource_size_t io_size;
resource_size_t lmem_size;
int err;
@ -122,9 +123,14 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
lmem_size = intel_uncore_read64(&i915->uncore, GEN12_GSMBASE);
}
if (i915->params.lmem_size > 0) {
lmem_size = min_t(resource_size_t, lmem_size,
mul_u32_u32(i915->params.lmem_size, SZ_1M));
}
io_start = pci_resource_start(pdev, 2);
if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2)))
io_size = min(pci_resource_len(pdev, 2), lmem_size);
if (!io_size)
return ERR_PTR(-ENODEV);
min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K :
@ -134,7 +140,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
lmem_size,
min_page_size,
io_start,
lmem_size,
io_size,
INTEL_MEMORY_LOCAL,
0,
&intel_region_lmem_ops);
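For reference, the region sizing introduced above boils down to two clamps: the optional lmem_size modparam (in MiB) caps the detected LMEM size, and io_size is whatever part of that is reachable through BAR2. A minimal stand-alone sketch with made-up example values:

#include <stdint.h>
#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

int main(void)
{
	uint64_t lmem_size   = 16ull << 30;	/* detected LMEM: 16 GiB (example) */
	uint64_t bar_len     = 256ull << 20;	/* small BAR2: 256 MiB (example) */
	uint64_t modparam_mb = 0;		/* i915.lmem_size=0 means no cap */

	if (modparam_mb)
		lmem_size = MIN(lmem_size, modparam_mb << 20);

	/* with a small BAR only part of LMEM is CPU visible */
	uint64_t io_size = MIN(bar_len, lmem_size);

	printf("lmem=%llu MiB io=%llu MiB\n",
	       (unsigned long long)(lmem_size >> 20),
	       (unsigned long long)(io_size >> 20));
	return 0;
}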


@ -772,14 +772,15 @@ static intel_engine_mask_t reset_prepare(struct intel_gt *gt)
intel_engine_mask_t awake = 0;
enum intel_engine_id id;
/* For GuC mode, ensure submission is disabled before stopping ring */
intel_uc_reset_prepare(&gt->uc);
for_each_engine(engine, gt, id) {
if (intel_engine_pm_get_if_awake(engine))
awake |= engine->mask;
reset_prepare_engine(engine);
}
intel_uc_reset_prepare(&gt->uc);
return awake;
}
@ -1319,7 +1320,7 @@ void intel_gt_handle_error(struct intel_gt *gt,
engine_mask &= gt->info.engine_mask;
if (flags & I915_ERROR_CAPTURE) {
i915_capture_error_state(gt, engine_mask);
i915_capture_error_state(gt, engine_mask, CORE_DUMP_FLAG_NONE);
intel_gt_clear_error_registers(gt, engine_mask);
}


@ -767,7 +767,7 @@ static int mi_set_context(struct i915_request *rq,
if (GRAPHICS_VER(i915) == 7) {
if (num_engines) {
struct intel_engine_cs *signaller;
i915_reg_t last_reg = {}; /* keep gcc quiet */
i915_reg_t last_reg = INVALID_MMIO_REG; /* keep gcc quiet */
*cs++ = MI_LOAD_REGISTER_IMM(num_engines);
for_each_engine(signaller, engine->gt, id) {


@ -1070,24 +1070,67 @@ int intel_rps_set(struct intel_rps *rps, u8 val)
return 0;
}
static void gen6_rps_init(struct intel_rps *rps)
static u32 intel_rps_read_state_cap(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
u32 rp_state_cap = intel_rps_read_state_cap(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
/* All of these values are in units of 50MHz */
if (IS_XEHPSDV(i915))
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
else
return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
}
/**
* gen6_rps_get_freq_caps - Get freq caps exposed by HW
* @rps: the intel_rps structure
* @caps: returned freq caps
*
* Returned "caps" frequencies should be converted to MHz using
* intel_gpu_freq()
*/
void gen6_rps_get_freq_caps(struct intel_rps *rps, struct intel_rps_freq_caps *caps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
u32 rp_state_cap;
rp_state_cap = intel_rps_read_state_cap(rps);
/* static values from HW: RP0 > RP1 > RPn (min_freq) */
if (IS_GEN9_LP(i915)) {
rps->rp0_freq = (rp_state_cap >> 16) & 0xff;
rps->rp1_freq = (rp_state_cap >> 8) & 0xff;
rps->min_freq = (rp_state_cap >> 0) & 0xff;
caps->rp0_freq = (rp_state_cap >> 16) & 0xff;
caps->rp1_freq = (rp_state_cap >> 8) & 0xff;
caps->min_freq = (rp_state_cap >> 0) & 0xff;
} else {
rps->rp0_freq = (rp_state_cap >> 0) & 0xff;
rps->rp1_freq = (rp_state_cap >> 8) & 0xff;
rps->min_freq = (rp_state_cap >> 16) & 0xff;
caps->rp0_freq = (rp_state_cap >> 0) & 0xff;
caps->rp1_freq = (rp_state_cap >> 8) & 0xff;
caps->min_freq = (rp_state_cap >> 16) & 0xff;
}
if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
/*
* In this case rp_state_cap register reports frequencies in
* units of 50 MHz. Convert these to the actual "hw unit", i.e.
* units of 16.67 MHz
*/
caps->rp0_freq *= GEN9_FREQ_SCALER;
caps->rp1_freq *= GEN9_FREQ_SCALER;
caps->min_freq *= GEN9_FREQ_SCALER;
}
}
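The unit handling above is worth spelling out: RP_STATE_CAP reports RP0/RP1/RPn in 50 MHz units, and on SKL-class and newer parts the driver rescales them by GEN9_FREQ_SCALER so that one stored unit is roughly 16.67 MHz; intel_gpu_freq() is then expected to turn those units back into MHz. A rough stand-alone illustration of that conversion (the rounding detail is an assumption, not copied from the driver helper):

#include <stdio.h>

#define GEN9_FREQ_SCALER	3
#define GT_FREQUENCY_MULTIPLIER	50

static unsigned int hw_units_to_mhz(unsigned int val)
{
	/* one unit is 50/3 MHz on these platforms; round to nearest MHz */
	return (val * GT_FREQUENCY_MULTIPLIER + GEN9_FREQ_SCALER / 2) /
		GEN9_FREQ_SCALER;
}

int main(void)
{
	unsigned int rp0_raw  = 23;				/* 23 * 50 MHz = 1150 MHz */
	unsigned int rp0_caps = rp0_raw * GEN9_FREQ_SCALER;	/* as stored in caps */

	printf("rp0: %u hw units -> %u MHz\n", rp0_caps, hw_units_to_mhz(rp0_caps));
	return 0;
}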
static void gen6_rps_init(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_rps_freq_caps caps;
gen6_rps_get_freq_caps(rps, &caps);
rps->rp0_freq = caps.rp0_freq;
rps->rp1_freq = caps.rp1_freq;
rps->min_freq = caps.min_freq;
/* hw_max = RP0 until we check for overclocking */
rps->max_freq = rps->rp0_freq;
@ -1095,26 +1138,18 @@ static void gen6_rps_init(struct intel_rps *rps)
if (IS_HASWELL(i915) || IS_BROADWELL(i915) ||
IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
u32 ddcc_status = 0;
u32 mult = 1;
if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11)
mult = GEN9_FREQ_SCALER;
if (snb_pcode_read(i915, HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL,
&ddcc_status, NULL) == 0)
rps->efficient_freq =
clamp_t(u8,
(ddcc_status >> 8) & 0xff,
clamp_t(u32,
((ddcc_status >> 8) & 0xff) * mult,
rps->min_freq,
rps->max_freq);
}
if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
/* Store the frequency values in 16.66 MHZ units, which is
* the natural hardware unit for SKL
*/
rps->rp0_freq *= GEN9_FREQ_SCALER;
rps->rp1_freq *= GEN9_FREQ_SCALER;
rps->min_freq *= GEN9_FREQ_SCALER;
rps->max_freq *= GEN9_FREQ_SCALER;
rps->efficient_freq *= GEN9_FREQ_SCALER;
}
}
static bool rps_reset(struct intel_rps *rps)
@ -2219,19 +2254,6 @@ int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val)
return set_min_freq(rps, val);
}
u32 intel_rps_read_state_cap(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
if (IS_XEHPSDV(i915))
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
else
return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
}
static void intel_rps_set_manual(struct intel_rps *rps, bool enable)
{
struct intel_uncore *uncore = rps_to_uncore(rps);
@ -2244,18 +2266,18 @@ static void intel_rps_set_manual(struct intel_rps *rps, bool enable)
void intel_rps_raise_unslice(struct intel_rps *rps)
{
struct intel_uncore *uncore = rps_to_uncore(rps);
u32 rp0_unslice_req;
mutex_lock(&rps->lock);
if (rps_uses_slpc(rps)) {
/* RP limits have not been initialized yet for SLPC path */
rp0_unslice_req = ((intel_rps_read_state_cap(rps) >> 0)
& 0xff) * GEN9_FREQ_SCALER;
struct intel_rps_freq_caps caps;
gen6_rps_get_freq_caps(rps, &caps);
intel_rps_set_manual(rps, true);
intel_uncore_write(uncore, GEN6_RPNSWREQ,
((rp0_unslice_req <<
((caps.rp0_freq <<
GEN9_SW_REQ_UNSLICE_RATIO_SHIFT) |
GEN9_IGNORE_SLICE_RATIO));
intel_rps_set_manual(rps, false);
@ -2269,18 +2291,18 @@ void intel_rps_raise_unslice(struct intel_rps *rps)
void intel_rps_lower_unslice(struct intel_rps *rps)
{
struct intel_uncore *uncore = rps_to_uncore(rps);
u32 rpn_unslice_req;
mutex_lock(&rps->lock);
if (rps_uses_slpc(rps)) {
/* RP limits have not been initialized yet for SLPC path */
rpn_unslice_req = ((intel_rps_read_state_cap(rps) >> 16)
& 0xff) * GEN9_FREQ_SCALER;
struct intel_rps_freq_caps caps;
gen6_rps_get_freq_caps(rps, &caps);
intel_rps_set_manual(rps, true);
intel_uncore_write(uncore, GEN6_RPNSWREQ,
((rpn_unslice_req <<
((caps.min_freq <<
GEN9_SW_REQ_UNSLICE_RATIO_SHIFT) |
GEN9_IGNORE_SLICE_RATIO));
intel_rps_set_manual(rps, false);
@ -2291,6 +2313,24 @@ void intel_rps_lower_unslice(struct intel_rps *rps)
mutex_unlock(&rps->lock);
}
static u32 rps_read_mmio(struct intel_rps *rps, i915_reg_t reg32)
{
struct intel_gt *gt = rps_to_gt(rps);
intel_wakeref_t wakeref;
u32 val;
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
val = intel_uncore_read(gt->uncore, reg32);
return val;
}
bool rps_read_mask_mmio(struct intel_rps *rps,
i915_reg_t reg32, u32 mask)
{
return rps_read_mmio(rps, reg32) & mask;
}
/* External interface for intel_ips.ko */
static struct drm_i915_private __rcu *ips_mchdev;


@ -7,6 +7,7 @@
#define INTEL_RPS_H
#include "intel_rps_types.h"
#include "i915_reg_defs.h"
struct i915_request;
@ -44,10 +45,13 @@ u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
u32 intel_rps_read_punit_req(struct intel_rps *rps);
u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
u32 intel_rps_read_state_cap(struct intel_rps *rps);
void gen6_rps_get_freq_caps(struct intel_rps *rps, struct intel_rps_freq_caps *caps);
void intel_rps_raise_unslice(struct intel_rps *rps);
void intel_rps_lower_unslice(struct intel_rps *rps);
u32 intel_rps_read_throttle_reason(struct intel_rps *rps);
bool rps_read_mask_mmio(struct intel_rps *rps, i915_reg_t reg32, u32 mask);
void gen5_rps_irq_handler(struct intel_rps *rps);
void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
void gen11_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);


@ -37,6 +37,21 @@ enum {
INTEL_RPS_TIMER,
};
/**
* struct intel_rps_freq_caps - rps freq capabilities
* @rp0_freq: non-overclocked max frequency
* @rp1_freq: "less than" RP0 power/frequency
* @min_freq: aka RPn, minimum frequency
*
* Freq caps exposed by HW, values are in "hw units" and intel_gpu_freq()
* should be used to convert to MHz
*/
struct intel_rps_freq_caps {
u8 rp0_freq;
u8 rp1_freq;
u8 min_freq;
};
struct intel_rps {
struct mutex lock; /* protects enabling and the worker */


@ -10,6 +10,8 @@
#include "intel_gt_regs.h"
#include "intel_sseu.h"
#include "linux/string_helpers.h"
void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
u8 max_subslices, u8 max_eus_per_subslice)
{
@ -35,8 +37,8 @@ intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
}
static u32
_intel_sseu_get_subslices(const struct sseu_dev_info *sseu,
const u8 *subslice_mask, u8 slice)
sseu_get_subslices(const struct sseu_dev_info *sseu,
const u8 *subslice_mask, u8 slice)
{
int i, offset = slice * sseu->ss_stride;
u32 mask = 0;
@ -51,12 +53,17 @@ _intel_sseu_get_subslices(const struct sseu_dev_info *sseu,
u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
{
return _intel_sseu_get_subslices(sseu, sseu->subslice_mask, slice);
return sseu_get_subslices(sseu, sseu->subslice_mask, slice);
}
static u32 sseu_get_geometry_subslices(const struct sseu_dev_info *sseu)
{
return sseu_get_subslices(sseu, sseu->geometry_subslice_mask, 0);
}
u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu)
{
return _intel_sseu_get_subslices(sseu, sseu->compute_subslice_mask, 0);
return sseu_get_subslices(sseu, sseu->compute_subslice_mask, 0);
}
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
@ -720,16 +727,11 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
str_yes_no(sseu->has_eu_pg));
}
void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
struct drm_printer *p)
static void sseu_print_hsw_topology(const struct sseu_dev_info *sseu,
struct drm_printer *p)
{
int s, ss;
if (sseu->max_slices == 0) {
drm_printf(p, "Unavailable\n");
return;
}
for (s = 0; s < sseu->max_slices; s++) {
drm_printf(p, "slice%d: %u subslice(s) (0x%08x):\n",
s, intel_sseu_subslices_per_slice(sseu, s),
@ -744,6 +746,36 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
}
}
static void sseu_print_xehp_topology(const struct sseu_dev_info *sseu,
struct drm_printer *p)
{
u32 g_dss_mask = sseu_get_geometry_subslices(sseu);
u32 c_dss_mask = intel_sseu_get_compute_subslices(sseu);
int dss;
for (dss = 0; dss < sseu->max_subslices; dss++) {
u16 enabled_eus = sseu_get_eus(sseu, 0, dss);
drm_printf(p, "DSS_%02d: G:%3s C:%3s, %2u EUs (0x%04hx)\n", dss,
str_yes_no(g_dss_mask & BIT(dss)),
str_yes_no(c_dss_mask & BIT(dss)),
hweight16(enabled_eus), enabled_eus);
}
}
void intel_sseu_print_topology(struct drm_i915_private *i915,
const struct sseu_dev_info *sseu,
struct drm_printer *p)
{
if (sseu->max_slices == 0) {
drm_printf(p, "Unavailable\n");
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
sseu_print_xehp_topology(sseu, p);
} else {
sseu_print_hsw_topology(sseu, p);
}
}
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
{
u16 slice_mask = 0;


@ -15,26 +15,49 @@ struct drm_i915_private;
struct intel_gt;
struct drm_printer;
#define GEN_MAX_SLICES (3) /* SKL upper bound */
#define GEN_MAX_SUBSLICES (32) /* XEHPSDV upper bound */
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
#define GEN_MAX_EUS (16) /* TGL upper bound */
#define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
/*
* Maximum number of slices on older platforms. Slices no longer exist
* starting on Xe_HP ("gslices," "cslices," etc. are a different concept and
* are not expressed through fusing).
*/
#define GEN_MAX_HSW_SLICES 3
/*
* Maximum number of subslices that can exist within a HSW-style slice. This
* is only relevant to pre-Xe_HP platforms (Xe_HP and beyond use the
* GEN_MAX_DSS value below).
*/
#define GEN_MAX_SS_PER_HSW_SLICE 6
/* Maximum number of DSS on newer platforms (Xe_HP and beyond). */
#define GEN_MAX_DSS 32
/* Maximum number of EUs that can exist within a subslice or DSS. */
#define GEN_MAX_EUS_PER_SS 16
#define SSEU_MAX(a, b) ((a) > (b) ? (a) : (b))
/* The maximum number of bits needed to express each subslice/DSS independently */
#define GEN_SS_MASK_SIZE SSEU_MAX(GEN_MAX_DSS, \
GEN_MAX_HSW_SLICES * GEN_MAX_SS_PER_HSW_SLICE)
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_SS_MASK_SIZE)
#define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS_PER_SS)
#define GEN_DSS_PER_GSLICE 4
#define GEN_DSS_PER_CSLICE 8
#define GEN_DSS_PER_MSLICE 8
#define GEN_MAX_GSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_GSLICE)
#define GEN_MAX_CSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_CSLICE)
#define GEN_MAX_GSLICES (GEN_MAX_DSS / GEN_DSS_PER_GSLICE)
#define GEN_MAX_CSLICES (GEN_MAX_DSS / GEN_DSS_PER_CSLICE)
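The new bounds size the subslice masks for whichever topology needs more bits (32 DSS versus 3 x 6 legacy subslices). A quick stand-alone recomputation of the resulting strides, copying the macros from this hunk:

#include <stdio.h>

#define BITS_PER_BYTE			8
#define DIV_ROUND_UP(n, d)		(((n) + (d) - 1) / (d))

#define GEN_MAX_HSW_SLICES		3
#define GEN_MAX_SS_PER_HSW_SLICE	6
#define GEN_MAX_DSS			32
#define GEN_MAX_EUS_PER_SS		16

#define SSEU_MAX(a, b)			((a) > (b) ? (a) : (b))
#define GEN_SS_MASK_SIZE		SSEU_MAX(GEN_MAX_DSS, \
						 GEN_MAX_HSW_SLICES * GEN_MAX_SS_PER_HSW_SLICE)
#define GEN_SSEU_STRIDE(max_entries)	DIV_ROUND_UP(max_entries, BITS_PER_BYTE)

int main(void)
{
	/* 32 DSS wins over 18 legacy subslices -> masks sized for 32 bits */
	printf("ss mask bits: %d, subslice stride: %d bytes, eu stride: %d bytes\n",
	       GEN_SS_MASK_SIZE,
	       GEN_SSEU_STRIDE(GEN_SS_MASK_SIZE),
	       GEN_SSEU_STRIDE(GEN_MAX_EUS_PER_SS));
	return 0;
}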
struct sseu_dev_info {
u8 slice_mask;
u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE];
u8 subslice_mask[GEN_SS_MASK_SIZE];
u8 geometry_subslice_mask[GEN_SS_MASK_SIZE];
u8 compute_subslice_mask[GEN_SS_MASK_SIZE];
u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE];
u16 eu_total;
u8 eu_per_subslice;
u8 min_eu_in_pool;
@ -116,7 +139,8 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt,
const struct intel_sseu *req_sseu);
void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p);
void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
void intel_sseu_print_topology(struct drm_i915_private *i915,
const struct sseu_dev_info *sseu,
struct drm_printer *p);
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);


@ -248,7 +248,7 @@ int intel_sseu_status(struct seq_file *m, struct intel_gt *gt)
{
struct drm_i915_private *i915 = gt->i915;
const struct intel_gt_info *info = &gt->info;
struct sseu_dev_info sseu;
struct sseu_dev_info *sseu;
intel_wakeref_t wakeref;
if (GRAPHICS_VER(i915) < 8)
@ -258,23 +258,29 @@ int intel_sseu_status(struct seq_file *m, struct intel_gt *gt)
i915_print_sseu_info(m, true, HAS_POOLED_EU(i915), &info->sseu);
seq_puts(m, "SSEU Device Status\n");
memset(&sseu, 0, sizeof(sseu));
intel_sseu_set_info(&sseu, info->sseu.max_slices,
sseu = kzalloc(sizeof(*sseu), GFP_KERNEL);
if (!sseu)
return -ENOMEM;
intel_sseu_set_info(sseu, info->sseu.max_slices,
info->sseu.max_subslices,
info->sseu.max_eus_per_subslice);
with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
if (IS_CHERRYVIEW(i915))
cherryview_sseu_device_status(gt, &sseu);
cherryview_sseu_device_status(gt, sseu);
else if (IS_BROADWELL(i915))
bdw_sseu_device_status(gt, &sseu);
bdw_sseu_device_status(gt, sseu);
else if (GRAPHICS_VER(i915) == 9)
gen9_sseu_device_status(gt, &sseu);
gen9_sseu_device_status(gt, sseu);
else if (GRAPHICS_VER(i915) >= 11)
gen11_sseu_device_status(gt, &sseu);
gen11_sseu_device_status(gt, sseu);
}
i915_print_sseu_info(m, false, HAS_POOLED_EU(i915), &sseu);
i915_print_sseu_info(m, false, HAS_POOLED_EU(i915), sseu);
kfree(sseu);
return 0;
}
@ -287,22 +293,22 @@ static int sseu_status_show(struct seq_file *m, void *unused)
}
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(sseu_status);
static int rcs_topology_show(struct seq_file *m, void *unused)
static int sseu_topology_show(struct seq_file *m, void *unused)
{
struct intel_gt *gt = m->private;
struct drm_printer p = drm_seq_file_printer(m);
intel_sseu_print_topology(&gt->info.sseu, &p);
intel_sseu_print_topology(gt->i915, &gt->info.sseu, &p);
return 0;
}
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rcs_topology);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(sseu_topology);
void intel_sseu_debugfs_register(struct intel_gt *gt, struct dentry *root)
{
static const struct intel_gt_debugfs_file files[] = {
{ "sseu_status", &sseu_status_fops, NULL },
{ "rcs_topology", &rcs_topology_fops, NULL },
{ "sseu_topology", &sseu_topology_fops, NULL },
};
intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), gt);


@ -1072,9 +1072,15 @@ static void __set_mcr_steering(struct i915_wa_list *wal,
static void __add_mcr_wa(struct intel_gt *gt, struct i915_wa_list *wal,
unsigned int slice, unsigned int subslice)
{
drm_dbg(&gt->i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
struct drm_printer p = drm_debug_printer("MCR Steering:");
__set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
gt->default_steering.groupid = slice;
gt->default_steering.instanceid = subslice;
if (drm_debug_enabled(DRM_UT_DRIVER))
intel_gt_report_steering(&p, gt, false);
}
static void
@ -2188,11 +2194,15 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
*/
wa_write_or(wal, GEN7_FF_THREAD_MODE,
GEN12_FF_TESSELATION_DOP_GATE_DISABLE);
}
if (IS_ALDERLAKE_P(i915) || IS_DG2(i915) || IS_ALDERLAKE_S(i915) ||
IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/*
* Wa_1606700617:tgl,dg1,adl-p
* Wa_22010271021:tgl,rkl,dg1,adl-s,adl-p
* Wa_14010826681:tgl,dg1,rkl,adl-p
* Wa_18019627453:dg2
*/
wa_masked_en(wal,
GEN9_CS_DEBUG_MODE1,
@ -2310,7 +2320,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
FF_DOP_CLOCK_GATE_DISABLE);
}
if (IS_GRAPHICS_VER(i915, 9, 12)) {
if (HAS_PERCTX_PREEMPT_CTRL(i915)) {
/* FtrPerCtxtPreemptionGranularityControl:skl,bxt,kbl,cfl,cnl,icl,tgl */
wa_masked_en(wal,
GEN7_FF_SLICE_CS_CHICKEN1,
@ -2618,6 +2628,11 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
GLOBAL_INVALIDATION_MODE);
}
if (IS_DG2(i915)) {
/* Wa_22014226127:dg2 */
wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
}
}
static void
@ -2633,7 +2648,7 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
* to a single RCS/CCS engine's workaround list since
* they're reset as part of the general render domain reset.
*/
if (engine->class == RENDER_CLASS)
if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
general_render_compute_wa_init(engine, wal);
if (engine->class == RENDER_CLASS)


@ -1736,15 +1736,9 @@ static int live_preempt(void *arg)
enum intel_engine_id id;
int err = -ENOMEM;
if (igt_spinner_init(&spin_hi, gt))
return -ENOMEM;
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
return -ENOMEM;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
ctx_lo = kernel_context(gt->i915, NULL);
@ -1752,6 +1746,12 @@ static int live_preempt(void *arg)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
if (igt_spinner_init(&spin_hi, gt))
goto err_ctx_lo;
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
for_each_engine(engine, gt, id) {
struct igt_live_test t;
struct i915_request *rq;
@ -1761,14 +1761,14 @@ static int live_preempt(void *arg)
if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
rq = spinner_create_request(&spin_lo, ctx_lo, engine,
MI_ARB_CHECK);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto err_ctx_lo;
goto err_spin_lo;
}
i915_request_add(rq);
@ -1777,7 +1777,7 @@ static int live_preempt(void *arg)
GEM_TRACE_DUMP();
intel_gt_set_wedged(gt);
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
rq = spinner_create_request(&spin_hi, ctx_hi, engine,
@ -1785,7 +1785,7 @@ static int live_preempt(void *arg)
if (IS_ERR(rq)) {
igt_spinner_end(&spin_lo);
err = PTR_ERR(rq);
goto err_ctx_lo;
goto err_spin_lo;
}
i915_request_add(rq);
@ -1794,7 +1794,7 @@ static int live_preempt(void *arg)
GEM_TRACE_DUMP();
intel_gt_set_wedged(gt);
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
igt_spinner_end(&spin_hi);
@ -1802,19 +1802,19 @@ static int live_preempt(void *arg)
if (igt_live_test_end(&t)) {
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
}
err = 0;
err_ctx_lo:
kernel_context_close(ctx_lo);
err_ctx_hi:
kernel_context_close(ctx_hi);
err_spin_lo:
igt_spinner_fini(&spin_lo);
err_spin_hi:
igt_spinner_fini(&spin_hi);
err_ctx_lo:
kernel_context_close(ctx_lo);
err_ctx_hi:
kernel_context_close(ctx_hi);
return err;
}
@ -1828,20 +1828,20 @@ static int live_late_preempt(void *arg)
enum intel_engine_id id;
int err = -ENOMEM;
if (igt_spinner_init(&spin_hi, gt))
return -ENOMEM;
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
return -ENOMEM;
ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
if (igt_spinner_init(&spin_hi, gt))
goto err_ctx_lo;
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
/* Make sure ctx_lo stays before ctx_hi until we trigger preemption. */
ctx_lo->sched.priority = 1;
@ -1854,14 +1854,14 @@ static int live_late_preempt(void *arg)
if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) {
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
rq = spinner_create_request(&spin_lo, ctx_lo, engine,
MI_ARB_CHECK);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto err_ctx_lo;
goto err_spin_lo;
}
i915_request_add(rq);
@ -1875,7 +1875,7 @@ static int live_late_preempt(void *arg)
if (IS_ERR(rq)) {
igt_spinner_end(&spin_lo);
err = PTR_ERR(rq);
goto err_ctx_lo;
goto err_spin_lo;
}
i915_request_add(rq);
@ -1898,19 +1898,19 @@ static int live_late_preempt(void *arg)
if (igt_live_test_end(&t)) {
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
}
err = 0;
err_ctx_lo:
kernel_context_close(ctx_lo);
err_ctx_hi:
kernel_context_close(ctx_hi);
err_spin_lo:
igt_spinner_fini(&spin_lo);
err_spin_hi:
igt_spinner_fini(&spin_hi);
err_ctx_lo:
kernel_context_close(ctx_lo);
err_ctx_hi:
kernel_context_close(ctx_hi);
return err;
err_wedged:
@ -1918,7 +1918,7 @@ err_wedged:
igt_spinner_end(&spin_lo);
intel_gt_set_wedged(gt);
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
struct preempt_client {
@ -3382,12 +3382,9 @@ static int live_preempt_timeout(void *arg)
if (!intel_has_reset_engine(gt))
return 0;
if (igt_spinner_init(&spin_lo, gt))
return -ENOMEM;
ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
return -ENOMEM;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
ctx_lo = kernel_context(gt->i915, NULL);
@ -3395,6 +3392,9 @@ static int live_preempt_timeout(void *arg)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
if (igt_spinner_init(&spin_lo, gt))
goto err_ctx_lo;
for_each_engine(engine, gt, id) {
unsigned long saved_timeout;
struct i915_request *rq;
@ -3406,21 +3406,21 @@ static int live_preempt_timeout(void *arg)
MI_NOOP); /* preemption disabled */
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto err_ctx_lo;
goto err_spin_lo;
}
i915_request_add(rq);
if (!igt_wait_for_spinner(&spin_lo, rq)) {
intel_gt_set_wedged(gt);
err = -EIO;
goto err_ctx_lo;
goto err_spin_lo;
}
rq = igt_request_alloc(ctx_hi, engine);
if (IS_ERR(rq)) {
igt_spinner_end(&spin_lo);
err = PTR_ERR(rq);
goto err_ctx_lo;
goto err_spin_lo;
}
/* Flush the previous CS ack before changing timeouts */
@ -3440,7 +3440,7 @@ static int live_preempt_timeout(void *arg)
intel_gt_set_wedged(gt);
i915_request_put(rq);
err = -ETIME;
goto err_ctx_lo;
goto err_spin_lo;
}
igt_spinner_end(&spin_lo);
@ -3448,12 +3448,12 @@ static int live_preempt_timeout(void *arg)
}
err = 0;
err_spin_lo:
igt_spinner_fini(&spin_lo);
err_ctx_lo:
kernel_context_close(ctx_lo);
err_ctx_hi:
kernel_context_close(ctx_hi);
err_spin_lo:
igt_spinner_fini(&spin_lo);
return err;
}


@ -1753,8 +1753,8 @@ static int __live_pphwsp_runtime(struct intel_engine_cs *engine)
if (IS_ERR(ce))
return PTR_ERR(ce);
ce->runtime.num_underflow = 0;
ce->runtime.max_underflow = 0;
ce->stats.runtime.num_underflow = 0;
ce->stats.runtime.max_underflow = 0;
do {
unsigned int loop = 1024;
@ -1792,11 +1792,11 @@ static int __live_pphwsp_runtime(struct intel_engine_cs *engine)
intel_context_get_avg_runtime_ns(ce));
err = 0;
if (ce->runtime.num_underflow) {
if (ce->stats.runtime.num_underflow) {
pr_err("%s: pphwsp underflow %u time(s), max %u cycles!\n",
engine->name,
ce->runtime.num_underflow,
ce->runtime.max_underflow);
ce->stats.runtime.num_underflow,
ce->stats.runtime.max_underflow);
GEM_TRACE_DUMP();
err = -EOVERFLOW;
}


@ -132,6 +132,124 @@ err_free_src:
return err;
}
static int intel_context_copy_ccs(struct intel_context *ce,
const struct i915_deps *deps,
struct scatterlist *sg,
enum i915_cache_level cache_level,
bool write_to_ccs,
struct i915_request **out)
{
u8 src_access = write_to_ccs ? DIRECT_ACCESS : INDIRECT_ACCESS;
u8 dst_access = write_to_ccs ? INDIRECT_ACCESS : DIRECT_ACCESS;
struct sgt_dma it = sg_sgt(sg);
struct i915_request *rq;
u32 offset;
int err;
GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
*out = NULL;
GEM_BUG_ON(ce->ring->size < SZ_64K);
offset = 0;
if (HAS_64K_PAGES(ce->engine->i915))
offset = CHUNK_SZ;
do {
int len;
rq = i915_request_create(ce);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto out_ce;
}
if (deps) {
err = i915_request_await_deps(rq, deps);
if (err)
goto out_rq;
if (rq->engine->emit_init_breadcrumb) {
err = rq->engine->emit_init_breadcrumb(rq);
if (err)
goto out_rq;
}
deps = NULL;
}
/* The PTE updates + clear must not be interrupted. */
err = emit_no_arbitration(rq);
if (err)
goto out_rq;
len = emit_pte(rq, &it, cache_level, true, offset, CHUNK_SZ);
if (len <= 0) {
err = len;
goto out_rq;
}
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
if (err)
goto out_rq;
err = emit_copy_ccs(rq, offset, dst_access,
offset, src_access, len);
if (err)
goto out_rq;
err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
/* Arbitration is re-enabled between requests. */
out_rq:
if (*out)
i915_request_put(*out);
*out = i915_request_get(rq);
i915_request_add(rq);
if (err || !it.sg || !sg_dma_len(it.sg))
break;
cond_resched();
} while (1);
out_ce:
return err;
}
static int
intel_migrate_ccs_copy(struct intel_migrate *m,
struct i915_gem_ww_ctx *ww,
const struct i915_deps *deps,
struct scatterlist *sg,
enum i915_cache_level cache_level,
bool write_to_ccs,
struct i915_request **out)
{
struct intel_context *ce;
int err;
*out = NULL;
if (!m->context)
return -ENODEV;
ce = intel_migrate_create_context(m);
if (IS_ERR(ce))
ce = intel_context_get(m->context);
GEM_BUG_ON(IS_ERR(ce));
err = intel_context_pin_ww(ce, ww);
if (err)
goto out;
err = intel_context_copy_ccs(ce, deps, sg, cache_level,
write_to_ccs, out);
intel_context_unpin(ce);
out:
intel_context_put(ce);
return err;
}
static int clear(struct intel_migrate *migrate,
int (*fn)(struct intel_migrate *migrate,
struct i915_gem_ww_ctx *ww,
@ -144,7 +262,8 @@ static int clear(struct intel_migrate *migrate,
struct drm_i915_gem_object *obj;
struct i915_request *rq;
struct i915_gem_ww_ctx ww;
u32 *vaddr;
u32 *vaddr, val = 0;
bool ccs_cap = false;
int err = 0;
int i;
@ -152,7 +271,15 @@ static int clear(struct intel_migrate *migrate,
if (IS_ERR(obj))
return 0;
/* Consider the rounded up memory too */
sz = obj->base.size;
if (HAS_FLAT_CCS(i915) && i915_gem_object_is_lmem(obj))
ccs_cap = true;
for_i915_gem_ww(&ww, err, true) {
int ccs_bytes, ccs_bytes_per_chunk;
err = i915_gem_object_lock(obj, &ww);
if (err)
continue;
@ -167,44 +294,114 @@ static int clear(struct intel_migrate *migrate,
vaddr[i] = ~i;
i915_gem_object_flush_map(obj);
err = fn(migrate, &ww, obj, sz, &rq);
if (!err)
if (ccs_cap && !val) {
/* Write the obj data into ccs surface */
err = intel_migrate_ccs_copy(migrate, &ww, NULL,
obj->mm.pages->sgl,
obj->cache_level,
true, &rq);
if (rq && !err) {
if (i915_request_wait(rq, 0, HZ) < 0) {
pr_err("%ps timed out, size: %u\n",
fn, sz);
err = -ETIME;
}
i915_request_put(rq);
rq = NULL;
}
if (err)
continue;
}
err = fn(migrate, &ww, obj, val, &rq);
if (rq && !err) {
if (i915_request_wait(rq, 0, HZ) < 0) {
pr_err("%ps timed out, size: %u\n", fn, sz);
err = -ETIME;
}
i915_request_put(rq);
rq = NULL;
}
if (err)
continue;
if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS)
pr_err("%ps failed, size: %u\n", fn, sz);
if (rq) {
i915_request_wait(rq, 0, HZ);
i915_request_put(rq);
i915_gem_object_flush_map(obj);
/* Verify the set/clear of the obj mem */
for (i = 0; !err && i < sz / PAGE_SIZE; i++) {
int x = i * 1024 +
i915_prandom_u32_max_state(1024, prng);
if (vaddr[x] != val) {
pr_err("%ps failed, (%u != %u), offset: %zu\n",
fn, vaddr[x], val, x * sizeof(u32));
igt_hexdump(vaddr + i * 1024, 4096);
err = -EINVAL;
}
}
if (err)
continue;
if (ccs_cap && !val) {
for (i = 0; i < sz / sizeof(u32); i++)
vaddr[i] = ~i;
i915_gem_object_flush_map(obj);
err = intel_migrate_ccs_copy(migrate, &ww, NULL,
obj->mm.pages->sgl,
obj->cache_level,
false, &rq);
if (rq && !err) {
if (i915_request_wait(rq, 0, HZ) < 0) {
pr_err("%ps timed out, size: %u\n",
fn, sz);
err = -ETIME;
}
i915_request_put(rq);
rq = NULL;
}
if (err)
continue;
ccs_bytes = GET_CCS_BYTES(i915, sz);
ccs_bytes_per_chunk = GET_CCS_BYTES(i915, CHUNK_SZ);
i915_gem_object_flush_map(obj);
for (i = 0; !err && i < DIV_ROUND_UP(ccs_bytes, PAGE_SIZE); i++) {
int offset = ((i * PAGE_SIZE) /
ccs_bytes_per_chunk) * CHUNK_SZ / sizeof(u32);
int ccs_bytes_left = (ccs_bytes - i * PAGE_SIZE) / sizeof(u32);
int x = i915_prandom_u32_max_state(min_t(int, 1024,
ccs_bytes_left), prng);
if (vaddr[offset + x]) {
pr_err("%ps ccs clearing failed, offset: %ld/%d\n",
fn, i * PAGE_SIZE + x * sizeof(u32), ccs_bytes);
igt_hexdump(vaddr + offset,
min_t(int, 4096,
ccs_bytes_left * sizeof(u32)));
err = -EINVAL;
}
}
if (err)
continue;
}
i915_gem_object_unpin_map(obj);
}
if (err)
goto err_out;
if (rq) {
if (i915_request_wait(rq, 0, HZ) < 0) {
pr_err("%ps timed out, size: %u\n", fn, sz);
err = -ETIME;
if (err) {
if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS)
pr_err("%ps failed, size: %u\n", fn, sz);
if (rq && err != -EINVAL) {
i915_request_wait(rq, 0, HZ);
i915_request_put(rq);
}
i915_request_put(rq);
i915_gem_object_unpin_map(obj);
}
for (i = 0; !err && i < sz / PAGE_SIZE; i++) {
int x = i * 1024 + i915_prandom_u32_max_state(1024, prng);
if (vaddr[x] != sz) {
pr_err("%ps failed, size: %u, offset: %zu\n",
fn, sz, x * sizeof(u32));
igt_hexdump(vaddr + i * 1024, 4096);
err = -EINVAL;
}
}
i915_gem_object_unpin_map(obj);
err_out:
i915_gem_object_put(obj);
return err;
}
@ -621,13 +818,15 @@ static int perf_copy_blt(void *arg)
for (i = 0; i < ARRAY_SIZE(sizes); i++) {
struct drm_i915_gem_object *src, *dst;
size_t sz;
int err;
src = create_init_lmem_internal(gt, sizes[i], true);
if (IS_ERR(src))
return PTR_ERR(src);
dst = create_init_lmem_internal(gt, sizes[i], false);
sz = src->base.size;
dst = create_init_lmem_internal(gt, sz, false);
if (IS_ERR(dst)) {
err = PTR_ERR(dst);
goto err_src;
@ -640,7 +839,7 @@ static int perf_copy_blt(void *arg)
dst->mm.pages->sgl,
I915_CACHE_NONE,
i915_gem_object_is_lmem(dst),
sizes[i]);
sz);
i915_gem_object_unlock(dst);
i915_gem_object_put(dst);


@ -122,17 +122,14 @@ enum intel_guc_action {
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE = 0x1002,
INTEL_GUC_ACTION_SCHED_ENGINE_MODE_SET = 0x1003,
INTEL_GUC_ACTION_SCHED_ENGINE_MODE_DONE = 0x1004,
INTEL_GUC_ACTION_SET_CONTEXT_PRIORITY = 0x1005,
INTEL_GUC_ACTION_SET_CONTEXT_EXECUTION_QUANTUM = 0x1006,
INTEL_GUC_ACTION_SET_CONTEXT_PREEMPTION_TIMEOUT = 0x1007,
INTEL_GUC_ACTION_CONTEXT_RESET_NOTIFICATION = 0x1008,
INTEL_GUC_ACTION_ENGINE_FAILURE_NOTIFICATION = 0x1009,
INTEL_GUC_ACTION_HOST2GUC_UPDATE_CONTEXT_POLICIES = 0x100B,
INTEL_GUC_ACTION_SETUP_PC_GUCRC = 0x3004,
INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
INTEL_GUC_ACTION_GET_HWCONFIG = 0x4100,
INTEL_GUC_ACTION_REGISTER_CONTEXT = 0x4502,
INTEL_GUC_ACTION_DEREGISTER_CONTEXT = 0x4503,
INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
INTEL_GUC_ACTION_DEREGISTER_CONTEXT_DONE = 0x4600,
INTEL_GUC_ACTION_REGISTER_CONTEXT_MULTI_LRC = 0x4601,
INTEL_GUC_ACTION_CLIENT_SOFT_RESET = 0x5507,
@ -173,4 +170,11 @@ enum intel_guc_sleep_state_status {
#define GUC_LOG_CONTROL_VERBOSITY_MASK (0xF << GUC_LOG_CONTROL_VERBOSITY_SHIFT)
#define GUC_LOG_CONTROL_DEFAULT_LOGGING (1 << 8)
enum intel_guc_state_capture_event_status {
INTEL_GUC_STATE_CAPTURE_EVENT_STATUS_SUCCESS = 0x0,
INTEL_GUC_STATE_CAPTURE_EVENT_STATUS_NOSPACE = 0x1,
};
#define INTEL_GUC_STATE_CAPTURE_EVENT_STATUS_MASK 0x000000FF
#endif /* _ABI_GUC_ACTIONS_ABI_H */


@ -8,6 +8,10 @@
enum intel_guc_response_status {
INTEL_GUC_RESPONSE_STATUS_SUCCESS = 0x0,
INTEL_GUC_RESPONSE_NOT_SUPPORTED = 0x20,
INTEL_GUC_RESPONSE_NO_ATTRIBUTE_TABLE = 0x201,
INTEL_GUC_RESPONSE_NO_DECRYPTION_KEY = 0x202,
INTEL_GUC_RESPONSE_DECRYPTION_FAILED = 0x204,
INTEL_GUC_RESPONSE_STATUS_GENERIC_FAIL = 0xF000,
};


@ -6,6 +6,8 @@
#ifndef _ABI_GUC_KLVS_ABI_H
#define _ABI_GUC_KLVS_ABI_H
#include <linux/types.h>
/**
* DOC: GuC KLV
*
@ -79,4 +81,17 @@
#define GUC_KLV_SELF_CFG_G2H_CTB_SIZE_KEY 0x0907
#define GUC_KLV_SELF_CFG_G2H_CTB_SIZE_LEN 1u
/*
* Per context scheduling policy update keys.
*/
enum {
GUC_CONTEXT_POLICIES_KLV_ID_EXECUTION_QUANTUM = 0x2001,
GUC_CONTEXT_POLICIES_KLV_ID_PREEMPTION_TIMEOUT = 0x2002,
GUC_CONTEXT_POLICIES_KLV_ID_SCHEDULING_PRIORITY = 0x2003,
GUC_CONTEXT_POLICIES_KLV_ID_PREEMPT_TO_IDLE_ON_QUANTUM_EXPIRY = 0x2004,
GUC_CONTEXT_POLICIES_KLV_ID_SLPM_GT_FREQUENCY = 0x2005,
GUC_CONTEXT_POLICIES_KLV_NUM_IDS = 5,
};
#endif /* _ABI_GUC_KLVS_ABI_H */


@ -0,0 +1,218 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021-2022 Intel Corporation
*/
#ifndef _INTEL_GUC_CAPTURE_FWIF_H
#define _INTEL_GUC_CAPTURE_FWIF_H
#include <linux/types.h>
#include "intel_guc_fwif.h"
struct intel_guc;
struct file;
/**
* struct __guc_capture_bufstate
*
* Book-keeping structure used to track read and write pointers
* as we extract error capture data from the GuC-log-buffer's
* error-capture region as a stream of dwords.
*/
struct __guc_capture_bufstate {
u32 size;
void *data;
u32 rd;
u32 wr;
};
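The rd/wr offsets above describe a simple producer/consumer window over the log buffer's error-capture region. Purely as an illustration (names and wrap handling are assumptions, not the intel_guc_capture code), draining such a window one dword at a time could look like this:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct bufstate {
	uint32_t size;		/* size of the capture region in bytes */
	const uint8_t *data;	/* base of the capture region */
	uint32_t rd;		/* read offset in bytes */
	uint32_t wr;		/* write offset in bytes */
};

/* Pop one dword if available; for brevity assumes it never straddles the wrap. */
static int get_dw(struct bufstate *b, uint32_t *dw)
{
	uint32_t avail = (b->wr - b->rd + b->size) % b->size;

	if (avail < sizeof(*dw))
		return 0;

	memcpy(dw, b->data + b->rd, sizeof(*dw));
	b->rd = (b->rd + sizeof(*dw)) % b->size;
	return 1;
}

int main(void)
{
	uint32_t backing[8] = { 0x11111111, 0x22222222, 0x33333333 };
	struct bufstate b = {
		.size = sizeof(backing),
		.data = (const uint8_t *)backing,
		.wr = 3 * sizeof(uint32_t),
	};
	uint32_t dw;

	while (get_dw(&b, &dw))
		printf("0x%08x\n", dw);
	return 0;
}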
/**
* struct __guc_capture_parsed_output - extracted error capture node
*
* A single unit of extracted error-capture output data grouped together
* at an engine-instance level. We keep these nodes in a linked list.
* See cachelist and outlist below.
*/
struct __guc_capture_parsed_output {
/*
* A single set of 3 capture lists: a global-list
* an engine-class-list and an engine-instance list.
* outlist in __guc_capture_parsed_output will keep
* a linked list of these nodes that will eventually
* be detached from outlist and attached to
* i915_gpu_coredump in response to a context reset
*/
struct list_head link;
bool is_partial;
u32 eng_class;
u32 eng_inst;
u32 guc_id;
u32 lrca;
struct gcap_reg_list_info {
u32 vfid;
u32 num_regs;
struct guc_mmio_reg *regs;
} reginfo[GUC_CAPTURE_LIST_TYPE_MAX];
#define GCAP_PARSED_REGLIST_INDEX_GLOBAL BIT(GUC_CAPTURE_LIST_TYPE_GLOBAL)
#define GCAP_PARSED_REGLIST_INDEX_ENGCLASS BIT(GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS)
#define GCAP_PARSED_REGLIST_INDEX_ENGINST BIT(GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE)
};
/**
* struct guc_debug_capture_list_header / struct guc_debug_capture_list
*
* As part of ADS registration, these header structures (followed by
* an array of 'struct guc_mmio_reg' entries) are used to register with
* GuC microkernel the list of registers we want it to dump out prior
* to an engine reset.
*/
struct guc_debug_capture_list_header {
u32 info;
#define GUC_CAPTURELISTHDR_NUMDESCR GENMASK(15, 0)
} __packed;
struct guc_debug_capture_list {
struct guc_debug_capture_list_header header;
struct guc_mmio_reg regs[0];
} __packed;
/**
* struct __guc_mmio_reg_descr / struct __guc_mmio_reg_descr_group
*
* intel_guc_capture module uses these structures to maintain static
* tables (per unique platform) that consists of lists of registers
* (offsets, names, flags,...) that are used at the ADS registration
* time as well as during runtime processing and reporting of error-
* capture states generated by GuC just prior to engine reset events.
*/
struct __guc_mmio_reg_descr {
i915_reg_t reg;
u32 flags;
u32 mask;
const char *regname;
};
struct __guc_mmio_reg_descr_group {
const struct __guc_mmio_reg_descr *list;
u32 num_regs;
u32 owner; /* see enum guc_capture_owner */
u32 type; /* see enum guc_capture_type */
u32 engine; /* as per MAX_ENGINE_CLASS */
struct __guc_mmio_reg_descr *extlist; /* only used for steered registers */
};
/**
* struct guc_state_capture_header_t / struct guc_state_capture_t /
* guc_state_capture_group_header_t / guc_state_capture_group_t
*
* Prior to resetting engines that have hung or faulted, GuC microkernel
* reports the engine error-state (register values that were read) by
* logging them into the shared GuC log buffer using these hierarchy
* of structures.
*/
struct guc_state_capture_header_t {
u32 owner;
#define CAP_HDR_CAPTURE_VFID GENMASK(7, 0)
u32 info;
#define CAP_HDR_CAPTURE_TYPE GENMASK(3, 0) /* see enum guc_capture_type */
#define CAP_HDR_ENGINE_CLASS GENMASK(7, 4) /* see GUC_MAX_ENGINE_CLASSES */
#define CAP_HDR_ENGINE_INSTANCE GENMASK(11, 8)
u32 lrca; /* if type-instance, LRCA (address) that hung, else set to ~0 */
u32 guc_id; /* if type-instance, context index of hung context, else set to ~0 */
u32 num_mmios;
#define CAP_HDR_NUM_MMIOS GENMASK(9, 0)
} __packed;
struct guc_state_capture_t {
struct guc_state_capture_header_t header;
struct guc_mmio_reg mmio_entries[0];
} __packed;
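Decoding the info dword above is a matter of applying the GENMASK fields; a stand-alone version with plain shifts and masks (FIELD_GET would be the in-kernel way):

#include <stdint.h>
#include <stdio.h>

#define CAP_HDR_CAPTURE_TYPE_SHIFT	0
#define CAP_HDR_CAPTURE_TYPE_MASK	0xfu	/* bits 3:0 */
#define CAP_HDR_ENGINE_CLASS_SHIFT	4
#define CAP_HDR_ENGINE_CLASS_MASK	0xfu	/* bits 7:4 */
#define CAP_HDR_ENGINE_INSTANCE_SHIFT	8
#define CAP_HDR_ENGINE_INSTANCE_MASK	0xfu	/* bits 11:8 */

int main(void)
{
	uint32_t info = 0x00000132;	/* example: instance 1, class 3, type 2 */

	printf("type=%u class=%u instance=%u\n",
	       (info >> CAP_HDR_CAPTURE_TYPE_SHIFT) & CAP_HDR_CAPTURE_TYPE_MASK,
	       (info >> CAP_HDR_ENGINE_CLASS_SHIFT) & CAP_HDR_ENGINE_CLASS_MASK,
	       (info >> CAP_HDR_ENGINE_INSTANCE_SHIFT) & CAP_HDR_ENGINE_INSTANCE_MASK);
	return 0;
}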
enum guc_capture_group_types {
GUC_STATE_CAPTURE_GROUP_TYPE_FULL,
GUC_STATE_CAPTURE_GROUP_TYPE_PARTIAL,
GUC_STATE_CAPTURE_GROUP_TYPE_MAX,
};
struct guc_state_capture_group_header_t {
u32 owner;
#define CAP_GRP_HDR_CAPTURE_VFID GENMASK(7, 0)
u32 info;
#define CAP_GRP_HDR_NUM_CAPTURES GENMASK(7, 0)
#define CAP_GRP_HDR_CAPTURE_TYPE GENMASK(15, 8) /* guc_capture_group_types */
} __packed;
/* this is the top level structure where an error-capture dump starts */
struct guc_state_capture_group_t {
struct guc_state_capture_group_header_t grp_header;
struct guc_state_capture_t capture_entries[0];
} __packed;
/**
* struct __guc_capture_ads_cache
*
* A structure to cache register lists that were populated and registered
* with GuC at startup during ADS registration. This allows much quicker
* GuC resets without re-parsing all the tables for the given gt.
*/
struct __guc_capture_ads_cache {
bool is_valid;
void *ptr;
size_t size;
int status;
};
/**
* struct intel_guc_state_capture
*
* Internal context of the intel_guc_capture module.
*/
struct intel_guc_state_capture {
/**
* @reglists: static table of register lists used for error-capture state.
*/
const struct __guc_mmio_reg_descr_group *reglists;
/**
* @extlists: allocated table of steered register lists used for error-capture state.
*
* NOTE: steered registers have multiple instances depending on the HW configuration
* (slices or dual-sub-slices) and thus depends on HW fuses discovered at startup
*/
struct __guc_mmio_reg_descr_group *extlists;
/**
* @ads_cache: cached register lists that is ADS format ready
*/
struct __guc_capture_ads_cache ads_cache[GUC_CAPTURE_LIST_INDEX_MAX]
[GUC_CAPTURE_LIST_TYPE_MAX]
[GUC_MAX_ENGINE_CLASSES];
void *ads_null_cache;
/**
* @cachelist: Pool of pre-allocated nodes for error capture output
*
* We need this pool of pre-allocated nodes because we cannot
* dynamically allocate new nodes when receiving the G2H notification
* because the event handlers for all G2H event-processing is called
* by the ct processing worker queue and when that queue is being
* processed, there is no absolute guarantee that we are not in the
* midst of a GT reset operation (which doesn't allow allocations).
*/
struct list_head cachelist;
#define PREALLOC_NODES_MAX_COUNT (3 * GUC_MAX_ENGINE_CLASSES * GUC_MAX_INSTANCES_PER_CLASS)
#define PREALLOC_NODES_DEFAULT_NUMREGS 64
int max_mmio_per_node;
/**
* @outlist: Parsed error-capture output nodes pending reporting
*
* A linked list of parsed GuC error-capture output data before
* reporting with formatting via i915_gpu_coredump. Each node in this linked list shall
* contain a single engine-capture including global, engine-class and
* engine-instance register dumps as per guc_capture_parsed_output_node
*/
struct list_head outlist;
};
#endif /* _INTEL_GUC_CAPTURE_FWIF_H */


@ -9,8 +9,9 @@
#include "gt/intel_gt_pm_irq.h"
#include "gt/intel_gt_regs.h"
#include "intel_guc.h"
#include "intel_guc_slpc.h"
#include "intel_guc_ads.h"
#include "intel_guc_capture.h"
#include "intel_guc_slpc.h"
#include "intel_guc_submission.h"
#include "i915_drv.h"
#include "i915_irq.h"
@ -291,6 +292,41 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50))
flags |= GUC_WA_POLLCS;
/* Wa_16011759253:dg2_g10:a0 */
if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0))
flags |= GUC_WA_GAM_CREDITS;
/* Wa_14014475959:dg2 */
if (IS_DG2(gt->i915))
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
/*
* Wa_14012197797:dg2_g10:a0,dg2_g11:a0
* Wa_22011391025:dg2_g10,dg2_g11,dg2_g12
*
* The same WA bit is used for both and 22011391025 is applicable to
* all DG2.
*/
if (IS_DG2(gt->i915))
flags |= GUC_WA_DUAL_QUEUE;
/* Wa_22011802037: graphics version 12 */
if (GRAPHICS_VER(gt->i915) == 12)
flags |= GUC_WA_PRE_PARSER;
/* Wa_16011777198:dg2 */
if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))
flags |= GUC_WA_RCS_RESET_BEFORE_RC6;
/*
* Wa_22012727170:dg2_g10[a0-c0), dg2_g11[a0..)
* Wa_22012727685:dg2_g11[a0..)
*/
if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_FOREVER))
flags |= GUC_WA_CONTEXT_ISOLATION;
return flags;
}
@ -362,9 +398,14 @@ int intel_guc_init(struct intel_guc *guc)
if (ret)
goto err_fw;
ret = intel_guc_ads_create(guc);
ret = intel_guc_capture_init(guc);
if (ret)
goto err_log;
ret = intel_guc_ads_create(guc);
if (ret)
goto err_capture;
GEM_BUG_ON(!guc->ads_vma);
ret = intel_guc_ct_init(&guc->ct);
@ -403,6 +444,8 @@ err_ct:
intel_guc_ct_fini(&guc->ct);
err_ads:
intel_guc_ads_destroy(guc);
err_capture:
intel_guc_capture_destroy(guc);
err_log:
intel_guc_log_destroy(&guc->log);
err_fw:
@ -430,6 +473,7 @@ void intel_guc_fini(struct intel_guc *guc)
intel_guc_ct_fini(&guc->ct);
intel_guc_ads_destroy(guc);
intel_guc_capture_destroy(guc);
intel_guc_log_destroy(&guc->log);
intel_uc_fw_fini(&guc->fw);
}


@ -10,18 +10,19 @@
#include <linux/iosys-map.h>
#include <linux/xarray.h>
#include "intel_uncore.h"
#include "intel_guc_ct.h"
#include "intel_guc_fw.h"
#include "intel_guc_fwif.h"
#include "intel_guc_ct.h"
#include "intel_guc_log.h"
#include "intel_guc_reg.h"
#include "intel_guc_slpc_types.h"
#include "intel_uc_fw.h"
#include "intel_uncore.h"
#include "i915_utils.h"
#include "i915_vma.h"
struct __guc_ads_blob;
struct intel_guc_state_capture;
/**
* struct intel_guc - Top level structure of GuC.
@ -38,6 +39,8 @@ struct intel_guc {
struct intel_guc_ct ct;
/** @slpc: sub-structure containing SLPC related data and objects */
struct intel_guc_slpc slpc;
/** @capture: the error-state-capture module's data and objects */
struct intel_guc_state_capture *capture;
/** @sched_engine: Global engine used to submit requests to GuC */
struct i915_sched_engine *sched_engine;
@ -138,6 +141,8 @@ struct intel_guc {
bool submission_supported;
/** @submission_selected: tracks whether the user enabled GuC submission */
bool submission_selected;
/** @submission_initialized: tracks whether GuC submission has been initialised */
bool submission_initialized;
/**
* @rc_supported: tracks whether we support GuC rc on the current platform
*/
@ -160,14 +165,11 @@ struct intel_guc {
struct guc_mmio_reg *ads_regset;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
/** @ads_capture_size: size of register lists in the ADS used for error capture */
u32 ads_capture_size;
/** @ads_engine_usage_size: size of engine usage in the ADS */
u32 ads_engine_usage_size;
/** @lrc_desc_pool: object allocated to hold the GuC LRC descriptor pool */
struct i915_vma *lrc_desc_pool;
/** @lrc_desc_pool_vaddr: contents of the GuC LRC descriptor pool */
void *lrc_desc_pool_vaddr;
/**
* @context_lookup: used to resolve intel_context from guc_id, if a
* context is present in this structure it is registered with the GuC
@ -431,6 +433,9 @@ int intel_guc_engine_failure_process_msg(struct intel_guc *guc,
int intel_guc_error_capture_process_msg(struct intel_guc *guc,
const u32 *msg, u32 len);
struct intel_engine_cs *
intel_guc_lookup_engine(struct intel_guc *guc, u8 guc_class, u8 instance);
void intel_guc_find_hung_context(struct intel_engine_cs *engine);
int intel_guc_global_policies_update(struct intel_guc *guc);


@ -11,6 +11,7 @@
#include "gt/intel_lrc.h"
#include "gt/shmem_utils.h"
#include "intel_guc_ads.h"
#include "intel_guc_capture.h"
#include "intel_guc_fwif.h"
#include "intel_uc.h"
#include "i915_drv.h"
@ -86,8 +87,7 @@ static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc)
static u32 guc_ads_capture_size(struct intel_guc *guc)
{
/* FIXME: Allocate a proper capture list */
return PAGE_ALIGN(PAGE_SIZE);
return PAGE_ALIGN(guc->ads_capture_size);
}
static u32 guc_ads_private_data_size(struct intel_guc *guc)
@ -276,15 +276,24 @@ __mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
return slot;
}
static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
u32 offset, u32 flags)
#define GUC_REGSET_STEERING(group, instance) ( \
FIELD_PREP(GUC_REGSET_STEERING_GROUP, (group)) | \
FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, (instance)) | \
GUC_REGSET_NEEDS_STEERING \
)
static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
struct temp_regset *regset,
i915_reg_t reg, u32 flags)
{
u32 count = regset->storage_used - (regset->registers - regset->storage);
struct guc_mmio_reg reg = {
u32 offset = i915_mmio_reg_offset(reg);
struct guc_mmio_reg entry = {
.offset = offset,
.flags = flags,
};
struct guc_mmio_reg *slot;
u8 group, inst;
/*
* The mmio list is built using separate lists within the driver.
@ -292,11 +301,22 @@ static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
* register more than once. Do not consider this an error; silently
* move on if the register is already in the list.
*/
if (bsearch(&reg, regset->registers, count,
sizeof(reg), guc_mmio_reg_cmp))
if (bsearch(&entry, regset->registers, count,
sizeof(entry), guc_mmio_reg_cmp))
return 0;
slot = __mmio_reg_add(regset, &reg);
/*
* The GuC doesn't have a default steering, so we need to explicitly
* steer all registers that need steering. However, we do not keep track
* of all the steering ranges, only of those that have a chance of using
* a non-default steering from the i915 pov. Instead of adding such
* tracking, it is easier to just program the default steering for all
* regs that don't need a non-default one.
*/
intel_gt_get_valid_steering_for_reg(gt, reg, &group, &inst);
entry.flags |= GUC_REGSET_STEERING(group, inst);
slot = __mmio_reg_add(regset, &entry);
if (IS_ERR(slot))
return PTR_ERR(slot);
@ -311,14 +331,16 @@ static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
return 0;
}
#define GUC_MMIO_REG_ADD(regset, reg, masked) \
guc_mmio_reg_add(regset, \
i915_mmio_reg_offset((reg)), \
#define GUC_MMIO_REG_ADD(gt, regset, reg, masked) \
guc_mmio_reg_add(gt, \
regset, \
(reg), \
(masked) ? GUC_REGSET_MASKED : 0)
static int guc_mmio_regset_init(struct temp_regset *regset,
struct intel_engine_cs *engine)
{
struct intel_gt *gt = engine->gt;
const u32 base = engine->mmio_base;
struct i915_wa_list *wal = &engine->wa_list;
struct i915_wa *wa;
@ -331,26 +353,26 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
*/
regset->registers = regset->storage + regset->storage_used;
ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
ret |= GUC_MMIO_REG_ADD(gt, regset, RING_MODE_GEN7(base), true);
ret |= GUC_MMIO_REG_ADD(gt, regset, RING_HWS_PGA(base), false);
ret |= GUC_MMIO_REG_ADD(gt, regset, RING_IMR(base), false);
if (engine->class == RENDER_CLASS &&
if ((engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE) &&
CCS_MASK(engine->gt))
ret |= GUC_MMIO_REG_ADD(regset, GEN12_RCU_MODE, true);
ret |= GUC_MMIO_REG_ADD(gt, regset, GEN12_RCU_MODE, true);
for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
ret |= GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
ret |= GUC_MMIO_REG_ADD(gt, regset, wa->reg, wa->masked_reg);
/* Be extra paranoid and include all whitelist registers. */
for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++)
ret |= GUC_MMIO_REG_ADD(regset,
ret |= GUC_MMIO_REG_ADD(gt, regset,
RING_FORCE_TO_NONPRIV(base, i),
false);
/* add in local MOCS registers */
for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
ret |= GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false);
ret |= GUC_MMIO_REG_ADD(gt, regset, GEN9_LNCFCMOCS(i), false);
return ret ? -1 : 0;
}
@ -433,7 +455,7 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc)
static void fill_engine_enable_masks(struct intel_gt *gt,
struct iosys_map *info_map)
{
info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1);
info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], RCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], CCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt));
@ -589,24 +611,119 @@ static void guc_init_golden_context(struct intel_guc *guc)
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
}
static void guc_capture_list_init(struct intel_guc *guc)
static int
guc_capture_prep_lists(struct intel_guc *guc)
{
struct intel_gt *gt = guc_to_gt(guc);
struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
u32 ads_ggtt, capture_offset, null_ggtt, total_size = 0;
struct guc_gt_system_info local_info;
struct iosys_map info_map;
bool ads_is_mapped;
size_t size = 0;
void *ptr;
int i, j;
u32 addr_ggtt, offset;
offset = guc_ads_capture_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
ads_is_mapped = !iosys_map_is_null(&guc->ads_map);
if (ads_is_mapped) {
capture_offset = guc_ads_capture_offset(guc);
ads_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma);
info_map = IOSYS_MAP_INIT_OFFSET(&guc->ads_map,
offsetof(struct __guc_ads_blob, system_info));
} else {
memset(&local_info, 0, sizeof(local_info));
iosys_map_set_vaddr(&info_map, &local_info);
fill_engine_enable_masks(gt, &info_map);
}
/* FIXME: Populate a proper capture list */
/* first, set aside the first page for a capture_list with zero descriptors */
total_size = PAGE_SIZE;
if (ads_is_mapped) {
if (!intel_guc_capture_getnullheader(guc, &ptr, &size))
iosys_map_memcpy_to(&guc->ads_map, capture_offset, ptr, size);
null_ggtt = ads_ggtt + capture_offset;
capture_offset += PAGE_SIZE;
}
for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) {
for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) {
ads_blob_write(guc, ads.capture_instance[i][j], addr_ggtt);
ads_blob_write(guc, ads.capture_class[i][j], addr_ggtt);
}
ads_blob_write(guc, ads.capture_global[i], addr_ggtt);
/* null list if we dont have said engine or list */
if (!info_map_read(&info_map, engine_enabled_masks[j])) {
if (ads_is_mapped) {
ads_blob_write(guc, ads.capture_class[i][j], null_ggtt);
ads_blob_write(guc, ads.capture_instance[i][j], null_ggtt);
}
continue;
}
if (intel_guc_capture_getlistsize(guc, i,
GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS,
j, &size)) {
if (ads_is_mapped)
ads_blob_write(guc, ads.capture_class[i][j], null_ggtt);
goto engine_instance_list;
}
total_size += size;
if (ads_is_mapped) {
if (total_size > guc->ads_capture_size ||
intel_guc_capture_getlist(guc, i,
GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS,
j, &ptr)) {
ads_blob_write(guc, ads.capture_class[i][j], null_ggtt);
continue;
}
ads_blob_write(guc, ads.capture_class[i][j], ads_ggtt +
capture_offset);
iosys_map_memcpy_to(&guc->ads_map, capture_offset, ptr, size);
capture_offset += size;
}
engine_instance_list:
if (intel_guc_capture_getlistsize(guc, i,
GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE,
j, &size)) {
if (ads_is_mapped)
ads_blob_write(guc, ads.capture_instance[i][j], null_ggtt);
continue;
}
total_size += size;
if (ads_is_mapped) {
if (total_size > guc->ads_capture_size ||
intel_guc_capture_getlist(guc, i,
GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE,
j, &ptr)) {
ads_blob_write(guc, ads.capture_instance[i][j], null_ggtt);
continue;
}
ads_blob_write(guc, ads.capture_instance[i][j], ads_ggtt +
capture_offset);
iosys_map_memcpy_to(&guc->ads_map, capture_offset, ptr, size);
capture_offset += size;
}
}
if (intel_guc_capture_getlistsize(guc, i, GUC_CAPTURE_LIST_TYPE_GLOBAL, 0, &size)) {
if (ads_is_mapped)
ads_blob_write(guc, ads.capture_global[i], null_ggtt);
continue;
}
total_size += size;
if (ads_is_mapped) {
if (total_size > guc->ads_capture_size ||
intel_guc_capture_getlist(guc, i, GUC_CAPTURE_LIST_TYPE_GLOBAL, 0,
&ptr)) {
ads_blob_write(guc, ads.capture_global[i], null_ggtt);
continue;
}
ads_blob_write(guc, ads.capture_global[i], ads_ggtt + capture_offset);
iosys_map_memcpy_to(&guc->ads_map, capture_offset, ptr, size);
capture_offset += size;
}
}
if (guc->ads_capture_size && guc->ads_capture_size != PAGE_ALIGN(total_size))
drm_warn(&i915->drm, "GuC->ADS->Capture alloc size changed from %d to %d\n",
guc->ads_capture_size, PAGE_ALIGN(total_size));
return PAGE_ALIGN(total_size);
}
static void __guc_ads_init(struct intel_guc *guc)
@ -644,8 +761,8 @@ static void __guc_ads_init(struct intel_guc *guc)
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
/* Capture list for hang debug */
guc_capture_list_init(guc);
/* Lists for error capture debug */
guc_capture_prep_lists(guc);
/* ADS */
ads_blob_write(guc, ads.scheduler_policies, base +
@@ -693,6 +810,12 @@ int intel_guc_ads_create(struct intel_guc *guc)
return ret;
guc->ads_golden_ctxt_size = ret;
/* Likewise the capture lists: */
ret = guc_capture_prep_lists(guc);
if (ret < 0)
return ret;
guc->ads_capture_size = ret;
/* Now the total size can be determined: */
size = guc_ads_blob_size(guc);
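The capture-list preparation above is deliberately callable in two modes: intel_guc_ads_create() invokes it before the ADS buffer exists, so it only tallies a worst-case size (stored in guc->ads_capture_size), while __guc_ads_init() invokes it again once guc->ads_map is live and actually copies the lists, falling back to the null-descriptor page for anything that no longer fits. A minimal, self-contained sketch of that size-then-populate pattern, using made-up item sizes and helper names rather than the driver's:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REGION_ALIGN 4096u

/* Hypothetical item source: returns the size of item i, or 0 when absent. */
static size_t item_size(unsigned int i)
{
	return (i % 3) ? 64u * i : 0u;
}

/*
 * One routine serves both passes: with dst == NULL it only accumulates the
 * space needed; with dst != NULL it also copies payloads, skipping any item
 * that would overflow the size reserved by the first pass.
 */
static size_t prep_items(uint8_t *dst, size_t reserved)
{
	size_t total = REGION_ALIGN; /* first page kept for a "null" header */
	unsigned int i;

	for (i = 0; i < 8; i++) {
		size_t sz = item_size(i);

		if (!sz)
			continue;
		if (dst && total + sz <= reserved)
			memset(dst + total, 0, sz); /* stand-in for the real copy */
		total += sz;
	}
	return (total + REGION_ALIGN - 1) & ~(size_t)(REGION_ALIGN - 1);
}

int main(void)
{
	uint8_t buf[16 * 4096];
	size_t need = prep_items(NULL, 0);   /* sizing pass, as in ads_create  */
	size_t used = prep_items(buf, need); /* populate pass, as in ads_init  */

	return used == need ? 0 : 1;
}

The drm_warn() above fires exactly when the two passes disagree, i.e. when PAGE_ALIGN(total_size) no longer matches the earlier guc->ads_capture_size.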

The diff for this file is not shown because of its size.


@@ -0,0 +1,33 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021-2021 Intel Corporation
*/
#ifndef _INTEL_GUC_CAPTURE_H
#define _INTEL_GUC_CAPTURE_H
#include <linux/types.h>
struct drm_i915_error_state_buf;
struct guc_gt_system_info;
struct intel_engine_coredump;
struct intel_context;
struct intel_gt;
struct intel_guc;
void intel_guc_capture_free_node(struct intel_engine_coredump *ee);
int intel_guc_capture_print_engine_node(struct drm_i915_error_state_buf *m,
const struct intel_engine_coredump *ee);
void intel_guc_capture_get_matching_node(struct intel_gt *gt, struct intel_engine_coredump *ee,
struct intel_context *ce);
void intel_guc_capture_process(struct intel_guc *guc);
int intel_guc_capture_output_min_size_est(struct intel_guc *guc);
int intel_guc_capture_getlist(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
void **outptr);
int intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
size_t *size);
int intel_guc_capture_getnullheader(struct intel_guc *guc, void **outptr, size_t *size);
void intel_guc_capture_destroy(struct intel_guc *guc);
int intel_guc_capture_init(struct intel_guc *guc);
#endif /* _INTEL_GUC_CAPTURE_H */


@@ -32,8 +32,8 @@
#define GUC_CLIENT_PRIORITY_NORMAL 3
#define GUC_CLIENT_PRIORITY_NUM 4
#define GUC_MAX_LRC_DESCRIPTORS 65535
#define GUC_INVALID_LRC_ID GUC_MAX_LRC_DESCRIPTORS
#define GUC_MAX_CONTEXT_ID 65535
#define GUC_INVALID_CONTEXT_ID GUC_MAX_CONTEXT_ID
#define GUC_RENDER_ENGINE 0
#define GUC_VIDEO_ENGINE 1
@@ -98,7 +98,13 @@
#define GUC_LOG_BUF_ADDR_SHIFT 12
#define GUC_CTL_WA 1
#define GUC_WA_POLLCS BIT(18)
#define GUC_WA_GAM_CREDITS BIT(10)
#define GUC_WA_DUAL_QUEUE BIT(11)
#define GUC_WA_RCS_RESET_BEFORE_RC6 BIT(13)
#define GUC_WA_CONTEXT_ISOLATION BIT(15)
#define GUC_WA_PRE_PARSER BIT(14)
#define GUC_WA_HOLD_CCS_SWITCHOUT BIT(17)
#define GUC_WA_POLLCS BIT(18)
#define GUC_CTL_FEATURE 2
#define GUC_CTL_ENABLE_SLPC BIT(2)
@@ -197,54 +203,45 @@ struct guc_wq_item {
u32 fence_id;
} __packed;
struct guc_process_desc {
u32 stage_id;
u64 db_base_addr;
struct guc_sched_wq_desc {
u32 head;
u32 tail;
u32 error_offset;
u64 wq_base_addr;
u32 wq_size_bytes;
u32 wq_status;
u32 engine_presence;
u32 priority;
u32 reserved[36];
u32 reserved[28];
} __packed;
/* Helper for context registration H2G */
struct guc_ctxt_registration_info {
u32 flags;
u32 context_idx;
u32 engine_class;
u32 engine_submit_mask;
u32 wq_desc_lo;
u32 wq_desc_hi;
u32 wq_base_lo;
u32 wq_base_hi;
u32 wq_size;
u32 hwlrca_lo;
u32 hwlrca_hi;
};
#define CONTEXT_REGISTRATION_FLAG_KMD BIT(0)
#define CONTEXT_POLICY_DEFAULT_EXECUTION_QUANTUM_US 1000000
#define CONTEXT_POLICY_DEFAULT_PREEMPTION_TIME_US 500000
/* 32-bit KLV structure as used by policy updates and others */
struct guc_klv_generic_dw_t {
u32 kl;
u32 value;
} __packed;
/* Preempt to idle on quantum expiry */
#define CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE BIT(0)
/* Format of the UPDATE_CONTEXT_POLICIES H2G data packet */
struct guc_update_context_policy_header {
u32 action;
u32 ctx_id;
} __packed;
/*
* GuC Context registration descriptor.
* FIXME: This is only required to exist during context registration.
* The current 1:1 between guc_lrc_desc and LRCs for the lifetime of the LRC
* is not required.
*/
struct guc_lrc_desc {
u32 hw_context_desc;
u32 slpm_perf_mode_hint; /* SPLC v1 only */
u32 slpm_freq_hint;
u32 engine_submit_mask; /* In logical space */
u8 engine_class;
u8 reserved0[3];
u32 priority;
u32 process_desc;
u32 wq_addr;
u32 wq_size;
u32 context_flags; /* CONTEXT_REGISTRATION_* */
/* Time for one workload to execute. (in micro seconds) */
u32 execution_quantum;
/* Time to wait for a preemption request to complete before issuing a
* reset. (in micro seconds).
*/
u32 preemption_timeout;
u32 policy_flags; /* CONTEXT_POLICY_* */
u32 reserved1[19];
struct guc_update_context_policy {
struct guc_update_context_policy_header header;
struct guc_klv_generic_dw_t klv[GUC_CONTEXT_POLICIES_KLV_NUM_IDS];
} __packed;
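guc_klv_generic_dw_t packs a key and a length (in DWords) into the single kl field, and guc_update_context_policy simply strings GUC_CONTEXT_POLICIES_KLV_NUM_IDS such entries after a small header. The sketch below shows one way a packet like that could be assembled; the 16-bit key / 16-bit length split, the KLV ids, and the action value are assumptions for illustration, not taken from the GuC ABI headers (only the default quantum and preemption-timeout values mirror the defines above):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed split: key in the upper 16 bits, length (in DWords) in the lower 16. */
#define KLV_KEY_SHIFT	16
#define KLV_LEN_MASK	0xffffu

struct klv_dw {			/* mirrors guc_klv_generic_dw_t */
	uint32_t kl;
	uint32_t value;
};

struct ctx_policy_packet {	/* mirrors guc_update_context_policy */
	uint32_t action;	/* H2G action id (placeholder value below) */
	uint32_t ctx_id;
	struct klv_dw klv[2];
};

static struct klv_dw make_klv(uint16_t key, uint32_t value)
{
	struct klv_dw k = {
		.kl = ((uint32_t)key << KLV_KEY_SHIFT) | (1u & KLV_LEN_MASK),
		.value = value,
	};
	return k;
}

int main(void)
{
	/* Hypothetical ids; the real ones come from the GuC KLV ABI header. */
	enum { KLV_EXEC_QUANTUM = 0x1, KLV_PREEMPT_TIMEOUT = 0x2 };
	struct ctx_policy_packet pkt = { .action = 0x0, .ctx_id = 42 };

	pkt.klv[0] = make_klv(KLV_EXEC_QUANTUM, 1000000);	/* quantum, us */
	pkt.klv[1] = make_klv(KLV_PREEMPT_TIMEOUT, 500000);	/* timeout, us */

	printf("klv[0].kl = 0x%08" PRIx32 "\n", pkt.klv[0].kl);
	return 0;
}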
#define GUC_POWER_UNSPECIFIED 0
@@ -285,10 +282,13 @@ struct guc_mmio_reg {
u32 offset;
u32 value;
u32 flags;
u32 mask;
#define GUC_REGSET_MASKED BIT(0)
#define GUC_REGSET_NEEDS_STEERING BIT(1)
#define GUC_REGSET_MASKED_WITH_VALUE BIT(2)
#define GUC_REGSET_RESTORE_ONLY BIT(3)
#define GUC_REGSET_STEERING_GROUP GENMASK(15, 12)
#define GUC_REGSET_STEERING_INSTANCE GENMASK(23, 20)
u32 mask;
} __packed;
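The steering fields above are bitfields carved out of flags (GENMASK(15, 12) for the group, GENMASK(23, 20) for the instance). A small illustrative sketch of encoding a steered register entry with plain shift/mask arithmetic; the struct layout mirrors guc_mmio_reg, but the register offset and the steering values are arbitrary examples:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define REGSET_MASKED			(1u << 0)
#define REGSET_NEEDS_STEERING		(1u << 1)
/* GENMASK(15, 12) and GENMASK(23, 20), as in the defines above */
#define STEERING_GROUP_SHIFT		12
#define STEERING_GROUP_MASK		(0xfu << STEERING_GROUP_SHIFT)
#define STEERING_INSTANCE_SHIFT		20
#define STEERING_INSTANCE_MASK		(0xfu << STEERING_INSTANCE_SHIFT)

struct mmio_reg {	/* same field order as guc_mmio_reg */
	uint32_t offset;
	uint32_t value;
	uint32_t flags;
	uint32_t mask;
};

static void set_steering(struct mmio_reg *reg, uint32_t group, uint32_t instance)
{
	reg->flags &= ~(STEERING_GROUP_MASK | STEERING_INSTANCE_MASK);
	reg->flags |= REGSET_NEEDS_STEERING |
		      ((group << STEERING_GROUP_SHIFT) & STEERING_GROUP_MASK) |
		      ((instance << STEERING_INSTANCE_SHIFT) & STEERING_INSTANCE_MASK);
}

int main(void)
{
	struct mmio_reg reg = { .offset = 0xe100, .value = 0, .flags = REGSET_MASKED, .mask = 0 };

	set_steering(&reg, 1, 2);	/* arbitrary group/instance for the example */
	printf("flags = 0x%08" PRIx32 "\n", reg.flags);
	return 0;
}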
/* GuC register sets */
@@ -311,6 +311,14 @@ enum {
GUC_CAPTURE_LIST_INDEX_MAX = 2,
};
/* Register types of GuC capture register lists */
enum guc_capture_type {
GUC_CAPTURE_LIST_TYPE_GLOBAL = 0,
GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS,
GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE,
GUC_CAPTURE_LIST_TYPE_MAX,
};
/* GuC Additional Data Struct */
struct guc_ads {
struct guc_mmio_reg_set reg_state_list[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];


@@ -0,0 +1,164 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "gt/intel_gt.h"
#include "gt/intel_hwconfig.h"
#include "i915_drv.h"
#include "i915_memcpy.h"
/*
* GuC has a blob containing hardware configuration information (HWConfig).
* This is formatted as a simple and flexible KLV (Key/Length/Value) table.
*
* For example, a minimal version could be:
* enum device_attr {
* ATTR_SOME_VALUE = 0,
* ATTR_SOME_MASK = 1,
* };
*
* static const u32 hwconfig[] = {
* ATTR_SOME_VALUE,
* 1, // Value Length in DWords
* 8, // Value
*
* ATTR_SOME_MASK,
* 3,
* 0x00FFFFFFFF, 0xFFFFFFFF, 0xFF000000,
* };
*
* The attribute ids are defined in a hardware spec.
*/
static int __guc_action_get_hwconfig(struct intel_guc *guc,
u32 ggtt_offset, u32 ggtt_size)
{
u32 action[] = {
INTEL_GUC_ACTION_GET_HWCONFIG,
lower_32_bits(ggtt_offset),
upper_32_bits(ggtt_offset),
ggtt_size,
};
int ret;
ret = intel_guc_send_mmio(guc, action, ARRAY_SIZE(action), NULL, 0);
if (ret == -ENXIO)
return -ENOENT;
return ret;
}
static int guc_hwconfig_discover_size(struct intel_guc *guc, struct intel_hwconfig *hwconfig)
{
int ret;
/*
* Sending a query with zero offset and size will return the
* size of the blob.
*/
ret = __guc_action_get_hwconfig(guc, 0, 0);
if (ret < 0)
return ret;
if (ret == 0)
return -EINVAL;
hwconfig->size = ret;
return 0;
}
static int guc_hwconfig_fill_buffer(struct intel_guc *guc, struct intel_hwconfig *hwconfig)
{
struct i915_vma *vma;
u32 ggtt_offset;
void *vaddr;
int ret;
GEM_BUG_ON(!hwconfig->size);
ret = intel_guc_allocate_and_map_vma(guc, hwconfig->size, &vma, &vaddr);
if (ret)
return ret;
ggtt_offset = intel_guc_ggtt_offset(guc, vma);
ret = __guc_action_get_hwconfig(guc, ggtt_offset, hwconfig->size);
if (ret >= 0)
memcpy(hwconfig->ptr, vaddr, hwconfig->size);
i915_vma_unpin_and_release(&vma, I915_VMA_RELEASE_MAP);
return ret;
}
static bool has_table(struct drm_i915_private *i915)
{
if (IS_ALDERLAKE_P(i915))
return true;
if (IS_DG2(i915))
return true;
return false;
}
/**
* guc_hwconfig_init - Initialize the HWConfig
*
* Retrieve the HWConfig table from the GuC and save it locally.
* It can then be queried on demand by other users later on.
*/
static int guc_hwconfig_init(struct intel_gt *gt)
{
struct intel_hwconfig *hwconfig = &gt->info.hwconfig;
struct intel_guc *guc = &gt->uc.guc;
int ret;
if (!has_table(gt->i915))
return 0;
ret = guc_hwconfig_discover_size(guc, hwconfig);
if (ret)
return ret;
hwconfig->ptr = kmalloc(hwconfig->size, GFP_KERNEL);
if (!hwconfig->ptr) {
hwconfig->size = 0;
return -ENOMEM;
}
ret = guc_hwconfig_fill_buffer(guc, hwconfig);
if (ret < 0) {
intel_gt_fini_hwconfig(gt);
return ret;
}
return 0;
}
/**
* intel_gt_init_hwconfig - Initialize the HWConfig if available
*
* Retrieve the HWConfig table if available on the current platform.
*/
int intel_gt_init_hwconfig(struct intel_gt *gt)
{
if (!intel_uc_uses_guc(&gt->uc))
return 0;
return guc_hwconfig_init(gt);
}
/**
* intel_gt_fini_hwconfig - Finalize the HWConfig
*
* Free up the memory allocation holding the table.
*/
void intel_gt_fini_hwconfig(struct intel_gt *gt)
{
struct intel_hwconfig *hwconfig = &gt->info.hwconfig;
kfree(hwconfig->ptr);
hwconfig->size = 0;
hwconfig->ptr = NULL;
}
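
Consumers of the table retrieved by guc_hwconfig_fill_buffer() see the flat KLV stream described in the comment at the top of this file: an attribute id, a length in DWords, then that many value DWords, repeated until the end of the blob. A minimal userspace-style sketch of walking such a stream, with a made-up table rather than real GuC data:

#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Walk a KLV stream: { key, length-in-DWords, value[length] } repeated. */
static void walk_klv(const uint32_t *tbl, size_t n_dwords)
{
	size_t i = 0;

	while (i + 2 <= n_dwords) {
		uint32_t key = tbl[i];
		uint32_t len = tbl[i + 1];

		if (i + 2 + len > n_dwords)	/* truncated entry, stop */
			break;
		printf("attr %" PRIu32 ": %" PRIu32 " dword(s), first value 0x%08" PRIx32 "\n",
		       key, len, len ? tbl[i + 2] : 0);
		i += 2 + len;
	}
}

int main(void)
{
	/* Shaped like the example in the comment above; values are illustrative. */
	static const uint32_t hwconfig[] = {
		0, 1, 8,					/* ATTR_SOME_VALUE */
		1, 3, 0xffffffff, 0xffffffff, 0xff000000,	/* ATTR_SOME_MASK  */
	};

	walk_klv(hwconfig, sizeof(hwconfig) / sizeof(hwconfig[0]));
	return 0;
}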


@@ -10,9 +10,10 @@
#include "i915_drv.h"
#include "i915_irq.h"
#include "i915_memcpy.h"
#include "intel_guc_capture.h"
#include "intel_guc_log.h"
static void guc_log_capture_logs(struct intel_guc_log *log);
static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log);
/**
* DOC: GuC firmware log
@@ -26,7 +27,8 @@ static void guc_log_capture_logs(struct intel_guc_log *log);
static int guc_action_flush_log_complete(struct intel_guc *guc)
{
u32 action[] = {
INTEL_GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE
INTEL_GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE,
GUC_DEBUG_LOG_BUFFER
};
return intel_guc_send(guc, action, ARRAY_SIZE(action));
@@ -137,7 +139,7 @@ static void guc_move_to_next_buf(struct intel_guc_log *log)
smp_wmb();
/* All data has been written, so now move the offset of sub buffer. */
relay_reserve(log->relay.channel, log->vma->obj->base.size);
relay_reserve(log->relay.channel, log->vma->obj->base.size - CAPTURE_BUFFER_SIZE);
/* Switch to the next sub buffer */
relay_flush(log->relay.channel);
@@ -157,9 +159,9 @@ static void *guc_get_write_buffer(struct intel_guc_log *log)
return relay_reserve(log->relay.channel, 0);
}
static bool guc_check_log_buf_overflow(struct intel_guc_log *log,
enum guc_log_buffer_type type,
unsigned int full_cnt)
bool intel_guc_check_log_buf_overflow(struct intel_guc_log *log,
enum guc_log_buffer_type type,
unsigned int full_cnt)
{
unsigned int prev_full_cnt = log->stats[type].sampled_overflow;
bool overflow = false;
@@ -182,7 +184,7 @@ static bool guc_check_log_buf_overflow(struct intel_guc_log *log,
return overflow;
}
static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
unsigned int intel_guc_get_log_buffer_size(enum guc_log_buffer_type type)
{
switch (type) {
case GUC_DEBUG_LOG_BUFFER:
@@ -198,7 +200,21 @@ static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
return 0;
}
static void guc_read_update_log_buffer(struct intel_guc_log *log)
size_t intel_guc_get_log_buffer_offset(enum guc_log_buffer_type type)
{
enum guc_log_buffer_type i;
size_t offset = PAGE_SIZE; /* for the log_buffer_states */
for (i = GUC_DEBUG_LOG_BUFFER; i < GUC_MAX_LOG_BUFFER; ++i) {
if (i == type)
break;
offset += intel_guc_get_log_buffer_size(i);
}
return offset;
}
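intel_guc_get_log_buffer_offset() simply walks the buffer types in layout order and sums the sizes of the regions that precede the requested one, starting after the single page of guc_log_buffer_state headers. A small worked sketch with illustrative region sizes (the real CRASH/DEBUG/CAPTURE sizes are Kconfig-dependent):

#include <stddef.h>
#include <stdio.h>

enum buf_type { BUF_DEBUG, BUF_CRASH, BUF_CAPTURE, BUF_MAX };

#define HDR_PAGE 4096u	/* one page of guc_log_buffer_state headers */

/* Illustrative sizes only; the driver derives these from Kconfig macros. */
static const size_t buf_size[BUF_MAX] = {
	[BUF_DEBUG]   = 8 * 4096,
	[BUF_CRASH]   = 2 * 4096,
	[BUF_CAPTURE] = 4 * 4096,
};

/* Same idea as intel_guc_get_log_buffer_offset(): sum everything before 'type'. */
static size_t buf_offset(enum buf_type type)
{
	size_t offset = HDR_PAGE;
	int i;

	for (i = BUF_DEBUG; i < type; i++)
		offset += buf_size[i];
	return offset;
}

int main(void)
{
	printf("debug   @ %zu\n", buf_offset(BUF_DEBUG));	/* 4096          */
	printf("crash   @ %zu\n", buf_offset(BUF_CRASH));	/* 4096 + 32768  */
	printf("capture @ %zu\n", buf_offset(BUF_CAPTURE));	/* 36864 + 8192  */
	return 0;
}

With the reordered layout, the relay path below copies only the debug and crash regions; the capture region at the final offset is left to the error-capture code.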
static void _guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log)
{
unsigned int buffer_size, read_offset, write_offset, bytes_to_copy, full_cnt;
struct guc_log_buffer_state *log_buf_state, *log_buf_snapshot_state;
@@ -213,7 +229,8 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log)
goto out_unlock;
/* Get the pointer to shared GuC log buffer */
log_buf_state = src_data = log->relay.buf_addr;
src_data = log->buf_addr;
log_buf_state = src_data;
/* Get the pointer to local buffer to store the logs */
log_buf_snapshot_state = dst_data = guc_get_write_buffer(log);
@@ -223,7 +240,7 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log)
* Used rate limited to avoid deluge of messages, logs might be
* getting consumed by User at a slow rate.
*/
DRM_ERROR_RATELIMITED("no sub-buffer to capture logs\n");
DRM_ERROR_RATELIMITED("no sub-buffer to copy general logs\n");
log->relay.full_count++;
goto out_unlock;
@@ -233,7 +250,8 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log)
src_data += PAGE_SIZE;
dst_data += PAGE_SIZE;
for (type = GUC_DEBUG_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
/* For relay logging, we exclude error state capture */
for (type = GUC_DEBUG_LOG_BUFFER; type <= GUC_CRASH_DUMP_LOG_BUFFER; type++) {
/*
* Make a copy of the state structure, inside GuC log buffer
* (which is uncached mapped), on the stack to avoid reading
@@ -241,14 +259,14 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log)
*/
memcpy(&log_buf_state_local, log_buf_state,
sizeof(struct guc_log_buffer_state));
buffer_size = guc_get_log_buffer_size(type);
buffer_size = intel_guc_get_log_buffer_size(type);
read_offset = log_buf_state_local.read_ptr;
write_offset = log_buf_state_local.sampled_write_ptr;
full_cnt = log_buf_state_local.buffer_full_cnt;
/* Bookkeeping stuff */
log->stats[type].flush += log_buf_state_local.flush_to_file;
new_overflow = guc_check_log_buf_overflow(log, type, full_cnt);
new_overflow = intel_guc_check_log_buf_overflow(log, type, full_cnt);
/* Update the state of shared log buffer */
log_buf_state->read_ptr = write_offset;
@@ -301,49 +319,43 @@ out_unlock:
mutex_unlock(&log->relay.lock);
}
static void capture_logs_work(struct work_struct *work)
static void copy_debug_logs_work(struct work_struct *work)
{
struct intel_guc_log *log =
container_of(work, struct intel_guc_log, relay.flush_work);
guc_log_capture_logs(log);
guc_log_copy_debuglogs_for_relay(log);
}
static int guc_log_map(struct intel_guc_log *log)
static int guc_log_relay_map(struct intel_guc_log *log)
{
void *vaddr;
lockdep_assert_held(&log->relay.lock);
if (!log->vma)
if (!log->vma || !log->buf_addr)
return -ENODEV;
/*
* Create a WC (Uncached for read) vmalloc mapping of log
* buffer pages, so that we can directly get the data
* (up-to-date) from memory.
* WC vmalloc mapping of log buffer pages was done at
* GuC Log Init time, but let's keep a ref for book-keeping
*/
vaddr = i915_gem_object_pin_map_unlocked(log->vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);
log->relay.buf_addr = vaddr;
i915_gem_object_get(log->vma->obj);
log->relay.buf_in_use = true;
return 0;
}
static void guc_log_unmap(struct intel_guc_log *log)
static void guc_log_relay_unmap(struct intel_guc_log *log)
{
lockdep_assert_held(&log->relay.lock);
i915_gem_object_unpin_map(log->vma->obj);
log->relay.buf_addr = NULL;
i915_gem_object_put(log->vma->obj);
log->relay.buf_in_use = false;
}
void intel_guc_log_init_early(struct intel_guc_log *log)
{
mutex_init(&log->relay.lock);
INIT_WORK(&log->relay.flush_work, capture_logs_work);
INIT_WORK(&log->relay.flush_work, copy_debug_logs_work);
log->relay.started = false;
}
@@ -358,8 +370,11 @@ static int guc_log_relay_create(struct intel_guc_log *log)
lockdep_assert_held(&log->relay.lock);
GEM_BUG_ON(!log->vma);
/* Keep the size of sub buffers same as shared log buffer */
subbuf_size = log->vma->size;
/*
* Keep the size of sub buffers same as shared log buffer
* but GuC log-events excludes the error-state-capture logs
*/
subbuf_size = log->vma->size - CAPTURE_BUFFER_SIZE;
/*
* Store up to 8 snapshots, which is large enough to buffer sufficient
@@ -394,13 +409,13 @@ static void guc_log_relay_destroy(struct intel_guc_log *log)
log->relay.channel = NULL;
}
static void guc_log_capture_logs(struct intel_guc_log *log)
static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log)
{
struct intel_guc *guc = log_to_guc(log);
struct drm_i915_private *dev_priv = guc_to_gt(guc)->i915;
intel_wakeref_t wakeref;
guc_read_update_log_buffer(log);
_guc_log_copy_debuglogs_for_relay(log);
/*
* Generally device is expected to be active only at this
@@ -440,6 +455,7 @@ int intel_guc_log_create(struct intel_guc_log *log)
{
struct intel_guc *guc = log_to_guc(log);
struct i915_vma *vma;
void *vaddr;
u32 guc_log_size;
int ret;
@@ -447,23 +463,28 @@ int intel_guc_log_create(struct intel_guc_log *log)
/*
* GuC Log buffer Layout
* (this ordering must follow "enum guc_log_buffer_type" definition)
*
* +===============================+ 00B
* | Crash dump state header |
* +-------------------------------+ 32B
* | Debug state header |
* +-------------------------------+ 32B
* | Crash dump state header |
* +-------------------------------+ 64B
* | Capture state header |
* +-------------------------------+ 96B
* | |
* +===============================+ PAGE_SIZE (4KB)
* | Crash Dump logs |
* +===============================+ + CRASH_SIZE
* | Debug logs |
* +===============================+ + DEBUG_SIZE
* | Crash Dump logs |
* +===============================+ + CRASH_SIZE
* | Capture logs |
* +===============================+ + CAPTURE_SIZE
*/
if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE)
DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n",
CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc));
guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE +
CAPTURE_BUFFER_SIZE;
@@ -474,6 +495,17 @@ int intel_guc_log_create(struct intel_guc_log *log)
}
log->vma = vma;
/*
* Create a WC (Uncached for read) vmalloc mapping up front for immediate access to
* data from memory during critical events such as error capture
*/
vaddr = i915_gem_object_pin_map_unlocked(log->vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr);
i915_vma_unpin_and_release(&log->vma, 0);
goto err;
}
log->buf_addr = vaddr;
log->level = __get_default_log_level(log);
DRM_DEBUG_DRIVER("guc_log_level=%d (%s, verbose:%s, verbosity:%d)\n",
@@ -484,13 +516,14 @@ int intel_guc_log_create(struct intel_guc_log *log)
return 0;
err:
DRM_ERROR("Failed to allocate GuC log buffer. %d\n", ret);
DRM_ERROR("Failed to allocate or map GuC log buffer. %d\n", ret);
return ret;
}
void intel_guc_log_destroy(struct intel_guc_log *log)
{
i915_vma_unpin_and_release(&log->vma, 0);
log->buf_addr = NULL;
i915_vma_unpin_and_release(&log->vma, I915_VMA_RELEASE_MAP);
}
int intel_guc_log_set_level(struct intel_guc_log *log, u32 level)
@@ -535,7 +568,7 @@ out_unlock:
bool intel_guc_log_relay_created(const struct intel_guc_log *log)
{
return log->relay.buf_addr;
return log->buf_addr;
}
int intel_guc_log_relay_open(struct intel_guc_log *log)
@@ -566,7 +599,7 @@ int intel_guc_log_relay_open(struct intel_guc_log *log)
if (ret)
goto out_unlock;
ret = guc_log_map(log);
ret = guc_log_relay_map(log);
if (ret)
goto out_relay;
@@ -616,8 +649,8 @@ void intel_guc_log_relay_flush(struct intel_guc_log *log)
with_intel_runtime_pm(guc_to_gt(guc)->uncore->rpm, wakeref)
guc_action_flush_log(guc);
/* GuC would have updated log buffer by now, so capture it */
guc_log_capture_logs(log);
/* GuC would have updated log buffer by now, so copy it */
guc_log_copy_debuglogs_for_relay(log);
}
/*
@@ -646,7 +679,7 @@ void intel_guc_log_relay_close(struct intel_guc_log *log)
mutex_lock(&log->relay.lock);
GEM_BUG_ON(!intel_guc_log_relay_created(log));
guc_log_unmap(log);
guc_log_relay_unmap(log);
guc_log_relay_destroy(log);
mutex_unlock(&log->relay.lock);
}


@@ -49,8 +49,9 @@ struct intel_guc;
struct intel_guc_log {
u32 level;
struct i915_vma *vma;
void *buf_addr;
struct {
void *buf_addr;
bool buf_in_use;
bool started;
struct work_struct flush_work;
struct rchan *channel;
@@ -66,6 +67,10 @@ struct intel_guc_log {
};
void intel_guc_log_init_early(struct intel_guc_log *log);
bool intel_guc_check_log_buf_overflow(struct intel_guc_log *log, enum guc_log_buffer_type type,
unsigned int full_cnt);
unsigned int intel_guc_get_log_buffer_size(enum guc_log_buffer_type type);
size_t intel_guc_get_log_buffer_offset(enum guc_log_buffer_type type);
int intel_guc_log_create(struct intel_guc_log *log);
void intel_guc_log_destroy(struct intel_guc_log *log);


@@ -153,8 +153,8 @@ static int slpc_query_task_state(struct intel_guc_slpc *slpc)
ret = guc_action_slpc_query(guc, offset);
if (unlikely(ret))
drm_err(&i915->drm, "Failed to query task state (%pe)\n",
ERR_PTR(ret));
i915_probe_error(i915, "Failed to query task state (%pe)\n",
ERR_PTR(ret));
drm_clflush_virt_range(slpc->vaddr, SLPC_PAGE_SIZE_BYTES);
@@ -171,8 +171,8 @@ static int slpc_set_param(struct intel_guc_slpc *slpc, u8 id, u32 value)
ret = guc_action_slpc_set_param(guc, id, value);
if (ret)
drm_err(&i915->drm, "Failed to set param %d to %u (%pe)\n",
id, value, ERR_PTR(ret));
i915_probe_error(i915, "Failed to set param %d to %u (%pe)\n",
id, value, ERR_PTR(ret));
return ret;
}
@@ -212,8 +212,8 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
freq);
if (ret)
drm_err(&i915->drm, "Unable to force min freq to %u: %d",
freq, ret);
i915_probe_error(i915, "Unable to force min freq to %u: %d",
freq, ret);
}
return ret;
@@ -248,9 +248,9 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
err = intel_guc_allocate_and_map_vma(guc, size, &slpc->vma, (void **)&slpc->vaddr);
if (unlikely(err)) {
drm_err(&i915->drm,
"Failed to allocate SLPC struct (err=%pe)\n",
ERR_PTR(err));
i915_probe_error(i915,
"Failed to allocate SLPC struct (err=%pe)\n",
ERR_PTR(err));
return err;
}
@@ -317,15 +317,15 @@ static int slpc_reset(struct intel_guc_slpc *slpc)
ret = guc_action_slpc_reset(guc, offset);
if (unlikely(ret < 0)) {
drm_err(&i915->drm, "SLPC reset action failed (%pe)\n",
ERR_PTR(ret));
i915_probe_error(i915, "SLPC reset action failed (%pe)\n",
ERR_PTR(ret));
return ret;
}
if (!ret) {
if (wait_for(slpc_is_running(slpc), SLPC_RESET_TIMEOUT_MS)) {
drm_err(&i915->drm, "SLPC not enabled! State = %s\n",
slpc_get_state_string(slpc));
i915_probe_error(i915, "SLPC not enabled! State = %s\n",
slpc_get_state_string(slpc));
return -EIO;
}
}
@@ -582,16 +582,12 @@ static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
static void slpc_get_rp_values(struct intel_guc_slpc *slpc)
{
struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
u32 rp_state_cap;
struct intel_rps_freq_caps caps;
rp_state_cap = intel_rps_read_state_cap(rps);
slpc->rp0_freq = REG_FIELD_GET(RP0_CAP_MASK, rp_state_cap) *
GT_FREQUENCY_MULTIPLIER;
slpc->rp1_freq = REG_FIELD_GET(RP1_CAP_MASK, rp_state_cap) *
GT_FREQUENCY_MULTIPLIER;
slpc->min_freq = REG_FIELD_GET(RPN_CAP_MASK, rp_state_cap) *
GT_FREQUENCY_MULTIPLIER;
gen6_rps_get_freq_caps(rps, &caps);
slpc->rp0_freq = intel_gpu_freq(rps, caps.rp0_freq);
slpc->rp1_freq = intel_gpu_freq(rps, caps.rp1_freq);
slpc->min_freq = intel_gpu_freq(rps, caps.min_freq);
if (!slpc->boost_freq)
slpc->boost_freq = slpc->rp0_freq;
@@ -621,8 +617,8 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
ret = slpc_reset(slpc);
if (unlikely(ret < 0)) {
drm_err(&i915->drm, "SLPC Reset event returned (%pe)\n",
ERR_PTR(ret));
i915_probe_error(i915, "SLPC Reset event returned (%pe)\n",
ERR_PTR(ret));
return ret;
}
@@ -637,24 +633,24 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
/* Ignore efficient freq and set min to platform min */
ret = slpc_ignore_eff_freq(slpc, true);
if (unlikely(ret)) {
drm_err(&i915->drm, "Failed to set SLPC min to RPn (%pe)\n",
ERR_PTR(ret));
i915_probe_error(i915, "Failed to set SLPC min to RPn (%pe)\n",
ERR_PTR(ret));
return ret;
}
/* Set SLPC max limit to RP0 */
ret = slpc_use_fused_rp0(slpc);
if (unlikely(ret)) {
drm_err(&i915->drm, "Failed to set SLPC max to RP0 (%pe)\n",
ERR_PTR(ret));
i915_probe_error(i915, "Failed to set SLPC max to RP0 (%pe)\n",
ERR_PTR(ret));
return ret;
}
/* Revert SLPC min/max to softlimits if necessary */
ret = slpc_set_softlimits(slpc);
if (unlikely(ret)) {
drm_err(&i915->drm, "Failed to set SLPC softlimits (%pe)\n",
ERR_PTR(ret));
i915_probe_error(i915, "Failed to set SLPC softlimits (%pe)\n",
ERR_PTR(ret));
return ret;
}

The diff for this file is not shown because of its size.

Some files were not shown because too many files changed in this diff.