Merge tag 'drm-intel-gt-next-2022-06-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Document memory residency and Flat-CCS capability of obj (Ramalingam C)
- Disable GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK on Xe_HP+ (Matt Roper)

Cross-subsystem Changes:

- Rename intel-gtt symbols (Lucas De Marchi)

Core Changes:

Driver Changes:

- Support programming the EU priority in the GuC descriptor (DG2) (Matthew Brost)
- DG2 HuC loading support (Daniele Ceraolo Spurio)
- Fix build error without CONFIG_PM (YueHaibing)
- Enable THP on Icelake and beyond (Tvrtko Ursulin)
- Only setup private tmpfs mount when needed and fix logging (Tvrtko Ursulin)
- Make __guc_reset_context aware of guilty engines (Umesh Nerlige Ramappa)
- DG2 small bar memory probing fixes (Nirmoy Das)
- Remove unnecessary GuC err capture noise (Alan Previn)
- Fix i915_gem_object_ggtt_pin_ww regression on old platforms (Maarten Lankhorst)
- Fix undefined behavior in GuC backend due to shift overflowing the constant (Borislav Petkov)
- New DG2 workarounds (Swathi Dhanavanthri, Anshuman Gupta)
- Report no hwconfig support on ADL-N (Balasubramani Vivekanandan)
- Fix error_state_read ptr + offset use (Alan Previn)
- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Fix memory leaks in per-gt sysfs (Ashutosh Dixit)
- Fix dma_resv fence handling in multi-batch execbuf (Nirmoy Das)
- Add extra registers to GPU error dump on Gen11+ (Stuart Summers)
- More PVC+DG2 workarounds (Matt Roper)
- Improve user experience and driver robustness under SIGINT or similar (Tvrtko Ursulin)
- Don't show engine classes not present (Tvrtko Ursulin)
- Improve on suspend / resume time with VT-d enabled (Thomas Hellström)
- Add missing else (katrinzhou)
- Don't leak lmem mapping in vma_evict (Juha-Pekka Heikkila)
- Add smem fallback allocation for dpt (Juha-Pekka Heikkila)
- Tweak the ordering in cpu_write_needs_clflush (Matthew Auld)
- Do not access rq->engine without a reference (Niranjana Vishwanathapura)
- Revert "drm/i915: Hold reference to intel_context over life of i915_request" (Niranjana Vishwanathapura)
- Don't update engine busyness stats too frequently (Alan Previn)
- Add additional steps for Wa_22011802037 for execlist backend (Umesh Nerlige Ramappa)
- Fix a lockdep warning at error capture (Nirmoy Das)
- Ponte Vecchio prep work and new blitter engines (Matt Roper, John Harrison, Lucas De Marchi)
- Read correct RP_STATE_CAP register (PVC) (Matt Roper)
- Define MOCS table for PVC (Ayaz A Siddiqui)
- Driver refactor and support Ponte Vecchio forcewake handling (Matt Roper)
- Remove additional 3D flags from PIPE_CONTROL (Ponte Vecchio) (Stuart Summers)
- XEHPSDV and PVC do not use HuC (Daniele Ceraolo Spurio)
- Extract stepping information from PCI revid (Ponte Vecchio) (Matt Roper)
- Add initial PVC workarounds (Stuart Summers)
- SSEU handling driver refactor and Ponte Vecchio support (Matt Roper)
- GuC depriv applies to PVC (Matt Roper)
- Add register steering (Ponte Vecchio) (Matt Roper)
- Add recommended MMIO setting (Ponte Vecchio) (Matt Roper)
- Move multicast register handling to a dedicated file (Matt Roper)
- Cleanup interface for MCR operations (Matt Roper)
- Extend i915_vma_pin_iomap() (CQ Tang)
- Re-do the intel-gtt split (Lucas De Marchi)
- Correct duplicated/misplaced GT register definitions (Matt Roper)
- Prefer "XEHP_" prefix for registers (Matt Roper)
- Don't use DRM_DEBUG_WARN_ON for unexpected l3bank/mslice config (Tvrtko Ursulin)
- Don't use DRM_DEBUG_WARN_ON for ring unexpectedly not idle (Tvrtko Ursulin)
- Make drop_pages() return bool (Lucas De Marchi)
- Fix CFI violation with show_dynamic_id() (Nathan Chancellor)
- Use i915_probe_error instead of drm_error in GuC code (Vinay Belgaumkar)
- Fix use of static in macro mismatch (Andi Shyti)
- Update tiled blits selftest (Bommu Krishnaiah)
- Future-proof platform checks (Matt Roper)
- Only include what's needed (Jani Nikula)
- remove accidental static from a local variable (Jani Nikula)
- Add global forcewake request to drpc (Vinay Belgaumkar)
- Fix spelling typo in comment (pengfuyuan)
- Increase timeout for live_parallel_switch selftest (Akeem G Abodunrin)
- Use non-blocking H2G for waitboost (Vinay Belgaumkar)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YrwtLM081SQUG1Dc@tursulin-desk
@@ -246,6 +246,18 @@ Display State Buffer
 .. kernel-doc:: drivers/gpu/drm/i915/display/intel_dsb.c
    :internal:
 
+GT Programming
+==============
+
+Multicast/Replicated (MCR) Registers
+------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+   :doc: GT Multicast/Replicated (MCR) Register Support
+
+.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+   :internal:
+
 Memory Management and Command Submission
 ========================================
 
@@ -744,7 +744,7 @@ static void i830_write_entry(dma_addr_t addr, unsigned int entry,
 	writel_relaxed(addr | pte_flags, intel_private.gtt + entry);
 }
 
-bool intel_enable_gtt(void)
+bool intel_gmch_enable_gtt(void)
 {
 	u8 __iomem *reg;
 
@@ -787,7 +787,7 @@ bool intel_enable_gtt(void)
 
 	return true;
 }
-EXPORT_SYMBOL(intel_enable_gtt);
+EXPORT_SYMBOL(intel_gmch_enable_gtt);
 
 static int i830_setup(void)
 {
@@ -821,8 +821,8 @@ static int intel_fake_agp_free_gatt_table(struct agp_bridge_data *bridge)
 
 static int intel_fake_agp_configure(void)
 {
-	if (!intel_enable_gtt())
-		return -EIO;
+	if (!intel_gmch_enable_gtt())
+		return -EIO;
 
 	intel_private.clear_fake_agp = true;
 	agp_bridge->gart_bus_addr = intel_private.gma_bus_addr;
@ -844,20 +844,20 @@ static bool i830_check_flags(unsigned int flags)
|
|||
return false;
|
||||
}
|
||||
|
||||
void intel_gtt_insert_page(dma_addr_t addr,
|
||||
unsigned int pg,
|
||||
unsigned int flags)
|
||||
void intel_gmch_gtt_insert_page(dma_addr_t addr,
|
||||
unsigned int pg,
|
||||
unsigned int flags)
|
||||
{
|
||||
intel_private.driver->write_entry(addr, pg, flags);
|
||||
readl(intel_private.gtt + pg);
|
||||
if (intel_private.driver->chipset_flush)
|
||||
intel_private.driver->chipset_flush();
|
||||
}
|
||||
EXPORT_SYMBOL(intel_gtt_insert_page);
|
||||
EXPORT_SYMBOL(intel_gmch_gtt_insert_page);
|
||||
|
||||
void intel_gtt_insert_sg_entries(struct sg_table *st,
|
||||
unsigned int pg_start,
|
||||
unsigned int flags)
|
||||
void intel_gmch_gtt_insert_sg_entries(struct sg_table *st,
|
||||
unsigned int pg_start,
|
||||
unsigned int flags)
|
||||
{
|
||||
struct scatterlist *sg;
|
||||
unsigned int len, m;
|
||||
|
@ -879,13 +879,13 @@ void intel_gtt_insert_sg_entries(struct sg_table *st,
|
|||
if (intel_private.driver->chipset_flush)
|
||||
intel_private.driver->chipset_flush();
|
||||
}
|
||||
EXPORT_SYMBOL(intel_gtt_insert_sg_entries);
|
||||
EXPORT_SYMBOL(intel_gmch_gtt_insert_sg_entries);
|
||||
|
||||
#if IS_ENABLED(CONFIG_AGP_INTEL)
|
||||
static void intel_gtt_insert_pages(unsigned int first_entry,
|
||||
unsigned int num_entries,
|
||||
struct page **pages,
|
||||
unsigned int flags)
|
||||
static void intel_gmch_gtt_insert_pages(unsigned int first_entry,
|
||||
unsigned int num_entries,
|
||||
struct page **pages,
|
||||
unsigned int flags)
|
||||
{
|
||||
int i, j;
|
||||
|
||||
|
@ -905,7 +905,7 @@ static int intel_fake_agp_insert_entries(struct agp_memory *mem,
|
|||
if (intel_private.clear_fake_agp) {
|
||||
int start = intel_private.stolen_size / PAGE_SIZE;
|
||||
int end = intel_private.gtt_mappable_entries;
|
||||
intel_gtt_clear_range(start, end - start);
|
||||
intel_gmch_gtt_clear_range(start, end - start);
|
||||
intel_private.clear_fake_agp = false;
|
||||
}
|
||||
|
||||
|
@ -934,12 +934,12 @@ static int intel_fake_agp_insert_entries(struct agp_memory *mem,
|
|||
if (ret != 0)
|
||||
return ret;
|
||||
|
||||
intel_gtt_insert_sg_entries(&st, pg_start, type);
|
||||
intel_gmch_gtt_insert_sg_entries(&st, pg_start, type);
|
||||
mem->sg_list = st.sgl;
|
||||
mem->num_sg = st.nents;
|
||||
} else
|
||||
intel_gtt_insert_pages(pg_start, mem->page_count, mem->pages,
|
||||
type);
|
||||
intel_gmch_gtt_insert_pages(pg_start, mem->page_count, mem->pages,
|
||||
type);
|
||||
|
||||
out:
|
||||
ret = 0;
|
||||
|
@ -949,7 +949,7 @@ out_err:
|
|||
}
|
||||
#endif
|
||||
|
||||
void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
|
||||
void intel_gmch_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
|
@ -959,7 +959,7 @@ void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
|
|||
}
|
||||
wmb();
|
||||
}
|
||||
EXPORT_SYMBOL(intel_gtt_clear_range);
|
||||
EXPORT_SYMBOL(intel_gmch_gtt_clear_range);
|
||||
|
||||
#if IS_ENABLED(CONFIG_AGP_INTEL)
|
||||
static int intel_fake_agp_remove_entries(struct agp_memory *mem,
|
||||
|
@ -968,7 +968,7 @@ static int intel_fake_agp_remove_entries(struct agp_memory *mem,
|
|||
if (mem->page_count == 0)
|
||||
return 0;
|
||||
|
||||
intel_gtt_clear_range(pg_start, mem->page_count);
|
||||
intel_gmch_gtt_clear_range(pg_start, mem->page_count);
|
||||
|
||||
if (intel_private.needs_dmar) {
|
||||
intel_gtt_unmap_memory(mem->sg_list, mem->num_sg);
|
||||
|
@ -1431,22 +1431,22 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
|
|||
}
|
||||
EXPORT_SYMBOL(intel_gmch_probe);
|
||||
|
||||
void intel_gtt_get(u64 *gtt_total,
|
||||
phys_addr_t *mappable_base,
|
||||
resource_size_t *mappable_end)
|
||||
void intel_gmch_gtt_get(u64 *gtt_total,
|
||||
phys_addr_t *mappable_base,
|
||||
resource_size_t *mappable_end)
|
||||
{
|
||||
*gtt_total = intel_private.gtt_total_entries << PAGE_SHIFT;
|
||||
*mappable_base = intel_private.gma_bus_addr;
|
||||
*mappable_end = intel_private.gtt_mappable_entries << PAGE_SHIFT;
|
||||
}
|
||||
EXPORT_SYMBOL(intel_gtt_get);
|
||||
EXPORT_SYMBOL(intel_gmch_gtt_get);
|
||||
|
||||
void intel_gtt_chipset_flush(void)
|
||||
void intel_gmch_gtt_flush(void)
|
||||
{
|
||||
if (intel_private.driver->chipset_flush)
|
||||
intel_private.driver->chipset_flush();
|
||||
}
|
||||
EXPORT_SYMBOL(intel_gtt_chipset_flush);
|
||||
EXPORT_SYMBOL(intel_gmch_gtt_flush);
|
||||
|
||||
void intel_gmch_remove(void)
|
||||
{
|
||||
|
|
|
@ -103,6 +103,7 @@ gt-y += \
|
|||
gt/intel_gt_debugfs.o \
|
||||
gt/intel_gt_engines_debugfs.o \
|
||||
gt/intel_gt_irq.o \
|
||||
gt/intel_gt_mcr.o \
|
||||
gt/intel_gt_pm.o \
|
||||
gt/intel_gt_pm_debugfs.o \
|
||||
gt/intel_gt_pm_irq.o \
|
||||
|
@ -129,7 +130,7 @@ gt-y += \
|
|||
gt/shmem_utils.o \
|
||||
gt/sysfs_engines.o
|
||||
# x86 intel-gtt module support
|
||||
gt-$(CONFIG_X86) += gt/intel_gt_gmch.o
|
||||
gt-$(CONFIG_X86) += gt/intel_ggtt_gmch.o
|
||||
# autogenerated null render state
|
||||
gt-y += \
|
||||
gt/gen6_renderstate.o \
|
||||
|
|
|
@ -4,6 +4,7 @@
|
|||
*/
|
||||
|
||||
#include "gem/i915_gem_domain.h"
|
||||
#include "gem/i915_gem_internal.h"
|
||||
#include "gt/gen8_ppgtt.h"
|
||||
|
||||
#include "i915_drv.h"
|
||||
|
@ -127,8 +128,12 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space *vm)
|
|||
struct i915_vma *vma;
|
||||
void __iomem *iomem;
|
||||
struct i915_gem_ww_ctx ww;
|
||||
u64 pin_flags = 0;
|
||||
int err;
|
||||
|
||||
if (i915_gem_object_is_stolen(dpt->obj))
|
||||
pin_flags |= PIN_MAPPABLE;
|
||||
|
||||
wakeref = intel_runtime_pm_get(&i915->runtime_pm);
|
||||
atomic_inc(&i915->gpu_error.pending_fb_pin);
|
||||
|
||||
|
@ -138,7 +143,7 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space *vm)
|
|||
continue;
|
||||
|
||||
vma = i915_gem_object_ggtt_pin_ww(dpt->obj, &ww, NULL, 0, 4096,
|
||||
HAS_LMEM(i915) ? 0 : PIN_MAPPABLE);
|
||||
pin_flags);
|
||||
if (IS_ERR(vma)) {
|
||||
err = PTR_ERR(vma);
|
||||
continue;
|
||||
|
@ -248,10 +253,13 @@ intel_dpt_create(struct intel_framebuffer *fb)
|
|||
|
||||
size = round_up(size * sizeof(gen8_pte_t), I915_GTT_PAGE_SIZE);
|
||||
|
||||
if (HAS_LMEM(i915))
|
||||
dpt_obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_CONTIGUOUS);
|
||||
else
|
||||
dpt_obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_CONTIGUOUS);
|
||||
if (IS_ERR(dpt_obj) && i915_ggtt_has_aperture(to_gt(i915)->ggtt))
|
||||
dpt_obj = i915_gem_object_create_stolen(i915, size);
|
||||
if (IS_ERR(dpt_obj) && !HAS_LMEM(i915)) {
|
||||
drm_dbg_kms(&i915->drm, "Allocating dpt from smem\n");
|
||||
dpt_obj = i915_gem_object_create_internal(i915, size);
|
||||
}
|
||||
if (IS_ERR(dpt_obj))
|
||||
return ERR_CAST(dpt_obj);
|
||||
|
||||
|
|
|
@ -933,8 +933,9 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
|
|||
case I915_CONTEXT_PARAM_PERSISTENCE:
|
||||
if (args->size)
|
||||
ret = -EINVAL;
|
||||
ret = proto_context_set_persistence(fpriv->dev_priv, pc,
|
||||
args->value);
|
||||
else
|
||||
ret = proto_context_set_persistence(fpriv->dev_priv, pc,
|
||||
args->value);
|
||||
break;
|
||||
|
||||
case I915_CONTEXT_PARAM_PROTECTED_CONTENT:
|
||||
|
@ -1367,7 +1368,8 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce)
|
|||
return engine;
|
||||
}
|
||||
|
||||
static void kill_engines(struct i915_gem_engines *engines, bool ban)
|
||||
static void
|
||||
kill_engines(struct i915_gem_engines *engines, bool exit, bool persistent)
|
||||
{
|
||||
struct i915_gem_engines_iter it;
|
||||
struct intel_context *ce;
|
||||
|
@ -1381,9 +1383,15 @@ static void kill_engines(struct i915_gem_engines *engines, bool ban)
|
|||
*/
|
||||
for_each_gem_engine(ce, engines, it) {
|
||||
struct intel_engine_cs *engine;
|
||||
bool skip = false;
|
||||
|
||||
if (ban && intel_context_ban(ce, NULL))
|
||||
continue;
|
||||
if (exit)
|
||||
skip = intel_context_set_exiting(ce);
|
||||
else if (!persistent)
|
||||
skip = intel_context_exit_nonpersistent(ce, NULL);
|
||||
|
||||
if (skip)
|
||||
continue; /* Already marked. */
|
||||
|
||||
/*
|
||||
* Check the current active state of this context; if we
|
||||
|
@ -1395,7 +1403,7 @@ static void kill_engines(struct i915_gem_engines *engines, bool ban)
|
|||
engine = active_engine(ce);
|
||||
|
||||
/* First attempt to gracefully cancel the context */
|
||||
if (engine && !__cancel_engine(engine) && ban)
|
||||
if (engine && !__cancel_engine(engine) && (exit || !persistent))
|
||||
/*
|
||||
* If we are unable to send a preemptive pulse to bump
|
||||
* the context from the GPU, we have to resort to a full
|
||||
|
@ -1407,8 +1415,6 @@ static void kill_engines(struct i915_gem_engines *engines, bool ban)
|
|||
|
||||
static void kill_context(struct i915_gem_context *ctx)
|
||||
{
|
||||
bool ban = (!i915_gem_context_is_persistent(ctx) ||
|
||||
!ctx->i915->params.enable_hangcheck);
|
||||
struct i915_gem_engines *pos, *next;
|
||||
|
||||
spin_lock_irq(&ctx->stale.lock);
|
||||
|
@ -1421,7 +1427,8 @@ static void kill_context(struct i915_gem_context *ctx)
|
|||
|
||||
spin_unlock_irq(&ctx->stale.lock);
|
||||
|
||||
kill_engines(pos, ban);
|
||||
kill_engines(pos, !ctx->i915->params.enable_hangcheck,
|
||||
i915_gem_context_is_persistent(ctx));
|
||||
|
||||
spin_lock_irq(&ctx->stale.lock);
|
||||
GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
|
||||
|
@ -1467,7 +1474,8 @@ static void engines_idle_release(struct i915_gem_context *ctx,
|
|||
|
||||
kill:
|
||||
if (list_empty(&engines->link)) /* raced, already closed */
|
||||
kill_engines(engines, true);
|
||||
kill_engines(engines, true,
|
||||
i915_gem_context_is_persistent(ctx));
|
||||
|
||||
i915_sw_fence_commit(&engines->fence);
|
||||
}
|
||||
|
@ -1875,6 +1883,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt,
|
|||
{
|
||||
const struct sseu_dev_info *device = >->info.sseu;
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
unsigned int dev_subslice_mask = intel_sseu_get_hsw_subslices(device, 0);
|
||||
|
||||
/* No zeros in any field. */
|
||||
if (!user->slice_mask || !user->subslice_mask ||
|
||||
|
@ -1901,7 +1910,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt,
|
|||
if (user->slice_mask & ~device->slice_mask)
|
||||
return -EINVAL;
|
||||
|
||||
if (user->subslice_mask & ~device->subslice_mask[0])
|
||||
if (user->subslice_mask & ~dev_subslice_mask)
|
||||
return -EINVAL;
|
||||
|
||||
if (user->max_eus_per_subslice > device->max_eus_per_subslice)
|
||||
|
@ -1915,7 +1924,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt,
|
|||
/* Part specific restrictions. */
|
||||
if (GRAPHICS_VER(i915) == 11) {
|
||||
unsigned int hw_s = hweight8(device->slice_mask);
|
||||
unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]);
|
||||
unsigned int hw_ss_per_s = hweight8(dev_subslice_mask);
|
||||
unsigned int req_s = hweight8(context->slice_mask);
|
||||
unsigned int req_ss = hweight8(context->subslice_mask);
|
||||
|
||||
|
|
|
@ -35,12 +35,12 @@ bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
|
|||
if (obj->cache_dirty)
|
||||
return false;
|
||||
|
||||
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
|
||||
return true;
|
||||
|
||||
if (IS_DGFX(i915))
|
||||
return false;
|
||||
|
||||
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
|
||||
return true;
|
||||
|
||||
/* Currently in use by HW (display engine)? Keep flushed. */
|
||||
return i915_gem_object_is_framebuffer(obj);
|
||||
}
|
||||
|
|
|
@ -999,7 +999,8 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
|
|||
}
|
||||
}
|
||||
|
||||
err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
|
||||
/* Reserve enough slots to accommodate composite fences */
|
||||
err = dma_resv_reserve_fences(vma->obj->base.resv, eb->num_batches);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
|
|
|
@ -670,17 +670,10 @@ fail:
|
|||
|
||||
static int init_shmem(struct intel_memory_region *mem)
|
||||
{
|
||||
int err;
|
||||
|
||||
err = i915_gemfs_init(mem->i915);
|
||||
if (err) {
|
||||
DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n",
|
||||
err);
|
||||
}
|
||||
|
||||
i915_gemfs_init(mem->i915);
|
||||
intel_memory_region_set_name(mem, "system");
|
||||
|
||||
return 0; /* Don't error, we can simply fallback to the kernel mnt */
|
||||
return 0; /* We have fallback to the kernel mnt if gemfs init failed. */
|
||||
}
|
||||
|
||||
static int release_shmem(struct intel_memory_region *mem)
|
||||
|
|
|
@ -36,7 +36,7 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
|
|||
return swap_available() || obj->mm.madv == I915_MADV_DONTNEED;
|
||||
}
|
||||
|
||||
static int drop_pages(struct drm_i915_gem_object *obj,
|
||||
static bool drop_pages(struct drm_i915_gem_object *obj,
|
||||
unsigned long shrink, bool trylock_vm)
|
||||
{
|
||||
unsigned long flags;
|
||||
|
|
|
@ -13,6 +13,8 @@
|
|||
#include "gem/i915_gem_lmem.h"
|
||||
#include "gem/i915_gem_region.h"
|
||||
#include "gt/intel_gt.h"
|
||||
#include "gt/intel_gt_mcr.h"
|
||||
#include "gt/intel_gt_regs.h"
|
||||
#include "gt/intel_region_lmem.h"
|
||||
#include "i915_drv.h"
|
||||
#include "i915_gem_stolen.h"
|
||||
|
@ -834,8 +836,8 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
|
|||
} else {
|
||||
resource_size_t lmem_range;
|
||||
|
||||
lmem_range = intel_gt_read_register(&i915->gt0, XEHPSDV_TILE0_ADDR_RANGE) & 0xFFFF;
|
||||
lmem_size = lmem_range >> XEHPSDV_TILE_LMEM_RANGE_SHIFT;
|
||||
lmem_range = intel_gt_mcr_read_any(&i915->gt0, XEHP_TILE0_ADDR_RANGE) & 0xFFFF;
|
||||
lmem_size = lmem_range >> XEHP_TILE_LMEM_RANGE_SHIFT;
|
||||
lmem_size *= SZ_1G;
|
||||
}
|
||||
|
||||
|
|
|
@ -114,7 +114,7 @@ u32 i915_gem_fence_alignment(struct drm_i915_private *i915, u32 size,
|
|||
return i915_gem_fence_size(i915, size, tiling, stride);
|
||||
}
|
||||
|
||||
/* Check pitch constriants for all chips & tiling formats */
|
||||
/* Check pitch constraints for all chips & tiling formats */
|
||||
static bool
|
||||
i915_tiling_ok(struct drm_i915_gem_object *obj,
|
||||
unsigned int tiling, unsigned int stride)
|
||||
|
|
|
@ -11,16 +11,11 @@
|
|||
#include "i915_gemfs.h"
|
||||
#include "i915_utils.h"
|
||||
|
||||
int i915_gemfs_init(struct drm_i915_private *i915)
|
||||
void i915_gemfs_init(struct drm_i915_private *i915)
|
||||
{
|
||||
char huge_opt[] = "huge=within_size"; /* r/w */
|
||||
struct file_system_type *type;
|
||||
struct vfsmount *gemfs;
|
||||
char *opts;
|
||||
|
||||
type = get_fs_type("tmpfs");
|
||||
if (!type)
|
||||
return -ENODEV;
|
||||
|
||||
/*
|
||||
* By creating our own shmemfs mountpoint, we can pass in
|
||||
|
@ -28,30 +23,35 @@ int i915_gemfs_init(struct drm_i915_private *i915)
|
|||
*
|
||||
* One example, although it is probably better with a per-file
|
||||
* control, is selecting huge page allocations ("huge=within_size").
|
||||
* However, we only do so to offset the overhead of iommu lookups
|
||||
* due to bandwidth issues (slow reads) on Broadwell+.
|
||||
* However, we only do so on platforms which benefit from it, or to
|
||||
* offset the overhead of iommu lookups, where with latter it is a net
|
||||
* win even on platforms which would otherwise see some performance
|
||||
* regressions such a slow reads issue on Broadwell and Skylake.
|
||||
*/
|
||||
|
||||
opts = NULL;
|
||||
if (i915_vtd_active(i915)) {
|
||||
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
|
||||
opts = huge_opt;
|
||||
drm_info(&i915->drm,
|
||||
"Transparent Hugepage mode '%s'\n",
|
||||
opts);
|
||||
} else {
|
||||
drm_notice(&i915->drm,
|
||||
"Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!\n");
|
||||
}
|
||||
}
|
||||
if (GRAPHICS_VER(i915) < 11 && !i915_vtd_active(i915))
|
||||
return;
|
||||
|
||||
gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts);
|
||||
if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
|
||||
goto err;
|
||||
|
||||
type = get_fs_type("tmpfs");
|
||||
if (!type)
|
||||
goto err;
|
||||
|
||||
gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, huge_opt);
|
||||
if (IS_ERR(gemfs))
|
||||
return PTR_ERR(gemfs);
|
||||
goto err;
|
||||
|
||||
i915->mm.gemfs = gemfs;
|
||||
drm_info(&i915->drm, "Using Transparent Hugepages\n");
|
||||
return;
|
||||
|
||||
return 0;
|
||||
err:
|
||||
drm_notice(&i915->drm,
|
||||
"Transparent Hugepage support is recommended for optimal performance%s\n",
|
||||
GRAPHICS_VER(i915) >= 11 ? " on this platform!" :
|
||||
" when IOMMU is enabled!");
|
||||
}
|
||||
|
||||
void i915_gemfs_fini(struct drm_i915_private *i915)
|
||||
|
|
|
@ -9,8 +9,7 @@
|
|||
|
||||
struct drm_i915_private;
|
||||
|
||||
int i915_gemfs_init(struct drm_i915_private *i915);
|
||||
|
||||
void i915_gemfs_init(struct drm_i915_private *i915);
|
||||
void i915_gemfs_fini(struct drm_i915_private *i915);
|
||||
|
||||
#endif
|
||||
|
|
|
@ -6,6 +6,7 @@
|
|||
#include "i915_selftest.h"
|
||||
|
||||
#include "gt/intel_context.h"
|
||||
#include "gt/intel_engine_regs.h"
|
||||
#include "gt/intel_engine_user.h"
|
||||
#include "gt/intel_gpu_commands.h"
|
||||
#include "gt/intel_gt.h"
|
||||
|
@ -18,10 +19,71 @@
|
|||
#include "huge_gem_object.h"
|
||||
#include "mock_context.h"
|
||||
|
||||
#define OW_SIZE 16 /* in bytes */
|
||||
#define F_SUBTILE_SIZE 64 /* in bytes */
|
||||
#define F_TILE_WIDTH 128 /* in bytes */
|
||||
#define F_TILE_HEIGHT 32 /* in pixels */
|
||||
#define F_SUBTILE_WIDTH OW_SIZE /* in bytes */
|
||||
#define F_SUBTILE_HEIGHT 4 /* in pixels */
|
||||
|
||||
static int linear_x_y_to_ftiled_pos(int x, int y, u32 stride, int bpp)
|
||||
{
|
||||
int tile_base;
|
||||
int tile_x, tile_y;
|
||||
int swizzle, subtile;
|
||||
int pixel_size = bpp / 8;
|
||||
int pos;
|
||||
|
||||
/*
|
||||
* Subtile remapping for F tile. Note that map[a]==b implies map[b]==a
|
||||
* so we can use the same table to tile and until.
|
||||
*/
|
||||
static const u8 f_subtile_map[] = {
|
||||
0, 1, 2, 3, 8, 9, 10, 11,
|
||||
4, 5, 6, 7, 12, 13, 14, 15,
|
||||
16, 17, 18, 19, 24, 25, 26, 27,
|
||||
20, 21, 22, 23, 28, 29, 30, 31,
|
||||
32, 33, 34, 35, 40, 41, 42, 43,
|
||||
36, 37, 38, 39, 44, 45, 46, 47,
|
||||
48, 49, 50, 51, 56, 57, 58, 59,
|
||||
52, 53, 54, 55, 60, 61, 62, 63
|
||||
};
|
||||
|
||||
x *= pixel_size;
|
||||
/*
|
||||
* Where does the 4k tile start (in bytes)? This is the same for Y and
|
||||
* F so we can use the Y-tile algorithm to get to that point.
|
||||
*/
|
||||
tile_base =
|
||||
y / F_TILE_HEIGHT * stride * F_TILE_HEIGHT +
|
||||
x / F_TILE_WIDTH * 4096;
|
||||
|
||||
/* Find pixel within tile */
|
||||
tile_x = x % F_TILE_WIDTH;
|
||||
tile_y = y % F_TILE_HEIGHT;
|
||||
|
||||
/* And figure out the subtile within the 4k tile */
|
||||
subtile = tile_y / F_SUBTILE_HEIGHT * 8 + tile_x / F_SUBTILE_WIDTH;
|
||||
|
||||
/* Swizzle the subtile number according to the bspec diagram */
|
||||
swizzle = f_subtile_map[subtile];
|
||||
|
||||
/* Calculate new position */
|
||||
pos = tile_base +
|
||||
swizzle * F_SUBTILE_SIZE +
|
||||
tile_y % F_SUBTILE_HEIGHT * OW_SIZE +
|
||||
tile_x % F_SUBTILE_WIDTH;
|
||||
|
||||
GEM_BUG_ON(!IS_ALIGNED(pos, pixel_size));
|
||||
|
||||
return pos / pixel_size * 4;
|
||||
}
|
||||
|
||||
enum client_tiling {
|
||||
CLIENT_TILING_LINEAR,
|
||||
CLIENT_TILING_X,
|
||||
CLIENT_TILING_Y,
|
||||
CLIENT_TILING_4,
|
||||
CLIENT_NUM_TILING_TYPES
|
||||
};
|
||||
|
||||
|
@ -45,6 +107,36 @@ struct tiled_blits {
|
|||
u32 height;
|
||||
};
|
||||
|
||||
static bool supports_x_tiling(const struct drm_i915_private *i915)
|
||||
{
|
||||
int gen = GRAPHICS_VER(i915);
|
||||
|
||||
if (gen < 12)
|
||||
return true;
|
||||
|
||||
if (!HAS_LMEM(i915) || IS_DG1(i915))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static bool fast_blit_ok(const struct blit_buffer *buf)
|
||||
{
|
||||
int gen = GRAPHICS_VER(buf->vma->vm->i915);
|
||||
|
||||
if (gen < 9)
|
||||
return false;
|
||||
|
||||
if (gen < 12)
|
||||
return true;
|
||||
|
||||
/* filter out platforms with unsupported X-tile support in fastblit */
|
||||
if (buf->tiling == CLIENT_TILING_X && !supports_x_tiling(buf->vma->vm->i915))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static int prepare_blit(const struct tiled_blits *t,
|
||||
struct blit_buffer *dst,
|
||||
struct blit_buffer *src,
|
||||
|
@ -59,51 +151,103 @@ static int prepare_blit(const struct tiled_blits *t,
|
|||
if (IS_ERR(cs))
|
||||
return PTR_ERR(cs);
|
||||
|
||||
*cs++ = MI_LOAD_REGISTER_IMM(1);
|
||||
*cs++ = i915_mmio_reg_offset(BCS_SWCTRL);
|
||||
cmd = (BCS_SRC_Y | BCS_DST_Y) << 16;
|
||||
if (src->tiling == CLIENT_TILING_Y)
|
||||
cmd |= BCS_SRC_Y;
|
||||
if (dst->tiling == CLIENT_TILING_Y)
|
||||
cmd |= BCS_DST_Y;
|
||||
*cs++ = cmd;
|
||||
if (fast_blit_ok(dst) && fast_blit_ok(src)) {
|
||||
struct intel_gt *gt = t->ce->engine->gt;
|
||||
u32 src_tiles = 0, dst_tiles = 0;
|
||||
u32 src_4t = 0, dst_4t = 0;
|
||||
|
||||
cmd = MI_FLUSH_DW;
|
||||
if (ver >= 8)
|
||||
cmd++;
|
||||
*cs++ = cmd;
|
||||
*cs++ = 0;
|
||||
*cs++ = 0;
|
||||
*cs++ = 0;
|
||||
/* Need to program BLIT_CCTL if it is not done previously
|
||||
* before using XY_FAST_COPY_BLT
|
||||
*/
|
||||
*cs++ = MI_LOAD_REGISTER_IMM(1);
|
||||
*cs++ = i915_mmio_reg_offset(BLIT_CCTL(t->ce->engine->mmio_base));
|
||||
*cs++ = (BLIT_CCTL_SRC_MOCS(gt->mocs.uc_index) |
|
||||
BLIT_CCTL_DST_MOCS(gt->mocs.uc_index));
|
||||
|
||||
cmd = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8 - 2);
|
||||
if (ver >= 8)
|
||||
cmd += 2;
|
||||
src_pitch = t->width; /* in dwords */
|
||||
if (src->tiling == CLIENT_TILING_4) {
|
||||
src_tiles = XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(YMAJOR);
|
||||
src_4t = XY_FAST_COPY_BLT_D1_SRC_TILE4;
|
||||
} else if (src->tiling == CLIENT_TILING_Y) {
|
||||
src_tiles = XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(YMAJOR);
|
||||
} else if (src->tiling == CLIENT_TILING_X) {
|
||||
src_tiles = XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(TILE_X);
|
||||
} else {
|
||||
src_pitch *= 4; /* in bytes */
|
||||
}
|
||||
|
||||
src_pitch = t->width * 4;
|
||||
if (src->tiling) {
|
||||
cmd |= XY_SRC_COPY_BLT_SRC_TILED;
|
||||
src_pitch /= 4;
|
||||
}
|
||||
dst_pitch = t->width; /* in dwords */
|
||||
if (dst->tiling == CLIENT_TILING_4) {
|
||||
dst_tiles = XY_FAST_COPY_BLT_D0_DST_TILE_MODE(YMAJOR);
|
||||
dst_4t = XY_FAST_COPY_BLT_D1_DST_TILE4;
|
||||
} else if (dst->tiling == CLIENT_TILING_Y) {
|
||||
dst_tiles = XY_FAST_COPY_BLT_D0_DST_TILE_MODE(YMAJOR);
|
||||
} else if (dst->tiling == CLIENT_TILING_X) {
|
||||
dst_tiles = XY_FAST_COPY_BLT_D0_DST_TILE_MODE(TILE_X);
|
||||
} else {
|
||||
dst_pitch *= 4; /* in bytes */
|
||||
}
|
||||
|
||||
dst_pitch = t->width * 4;
|
||||
if (dst->tiling) {
|
||||
cmd |= XY_SRC_COPY_BLT_DST_TILED;
|
||||
dst_pitch /= 4;
|
||||
}
|
||||
|
||||
*cs++ = cmd;
|
||||
*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | dst_pitch;
|
||||
*cs++ = 0;
|
||||
*cs++ = t->height << 16 | t->width;
|
||||
*cs++ = lower_32_bits(dst->vma->node.start);
|
||||
if (use_64b_reloc)
|
||||
*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2) |
|
||||
src_tiles | dst_tiles;
|
||||
*cs++ = src_4t | dst_4t | BLT_DEPTH_32 | dst_pitch;
|
||||
*cs++ = 0;
|
||||
*cs++ = t->height << 16 | t->width;
|
||||
*cs++ = lower_32_bits(dst->vma->node.start);
|
||||
*cs++ = upper_32_bits(dst->vma->node.start);
|
||||
*cs++ = 0;
|
||||
*cs++ = src_pitch;
|
||||
*cs++ = lower_32_bits(src->vma->node.start);
|
||||
if (use_64b_reloc)
|
||||
*cs++ = 0;
|
||||
*cs++ = src_pitch;
|
||||
*cs++ = lower_32_bits(src->vma->node.start);
|
||||
*cs++ = upper_32_bits(src->vma->node.start);
|
||||
} else {
|
||||
if (ver >= 6) {
|
||||
*cs++ = MI_LOAD_REGISTER_IMM(1);
|
||||
*cs++ = i915_mmio_reg_offset(BCS_SWCTRL);
|
||||
cmd = (BCS_SRC_Y | BCS_DST_Y) << 16;
|
||||
if (src->tiling == CLIENT_TILING_Y)
|
||||
cmd |= BCS_SRC_Y;
|
||||
if (dst->tiling == CLIENT_TILING_Y)
|
||||
cmd |= BCS_DST_Y;
|
||||
*cs++ = cmd;
|
||||
|
||||
cmd = MI_FLUSH_DW;
|
||||
if (ver >= 8)
|
||||
cmd++;
|
||||
*cs++ = cmd;
|
||||
*cs++ = 0;
|
||||
*cs++ = 0;
|
||||
*cs++ = 0;
|
||||
}
|
||||
|
||||
cmd = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8 - 2);
|
||||
if (ver >= 8)
|
||||
cmd += 2;
|
||||
|
||||
src_pitch = t->width * 4;
|
||||
if (src->tiling) {
|
||||
cmd |= XY_SRC_COPY_BLT_SRC_TILED;
|
||||
src_pitch /= 4;
|
||||
}
|
||||
|
||||
dst_pitch = t->width * 4;
|
||||
if (dst->tiling) {
|
||||
cmd |= XY_SRC_COPY_BLT_DST_TILED;
|
||||
dst_pitch /= 4;
|
||||
}
|
||||
|
||||
*cs++ = cmd;
|
||||
*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | dst_pitch;
|
||||
*cs++ = 0;
|
||||
*cs++ = t->height << 16 | t->width;
|
||||
*cs++ = lower_32_bits(dst->vma->node.start);
|
||||
if (use_64b_reloc)
|
||||
*cs++ = upper_32_bits(dst->vma->node.start);
|
||||
*cs++ = 0;
|
||||
*cs++ = src_pitch;
|
||||
*cs++ = lower_32_bits(src->vma->node.start);
|
||||
if (use_64b_reloc)
|
||||
*cs++ = upper_32_bits(src->vma->node.start);
|
||||
}
|
||||
|
||||
*cs++ = MI_BATCH_BUFFER_END;
|
||||
|
||||
|
@ -181,7 +325,13 @@ static int tiled_blits_create_buffers(struct tiled_blits *t,
|
|||
|
||||
t->buffers[i].vma = vma;
|
||||
t->buffers[i].tiling =
|
||||
i915_prandom_u32_max_state(CLIENT_TILING_Y + 1, prng);
|
||||
i915_prandom_u32_max_state(CLIENT_NUM_TILING_TYPES, prng);
|
||||
|
||||
/* Platforms support either TileY or Tile4, not both */
|
||||
if (HAS_4TILE(i915) && t->buffers[i].tiling == CLIENT_TILING_Y)
|
||||
t->buffers[i].tiling = CLIENT_TILING_4;
|
||||
else if (!HAS_4TILE(i915) && t->buffers[i].tiling == CLIENT_TILING_4)
|
||||
t->buffers[i].tiling = CLIENT_TILING_Y;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
@ -206,7 +356,8 @@ static u64 swizzle_bit(unsigned int bit, u64 offset)
|
|||
static u64 tiled_offset(const struct intel_gt *gt,
|
||||
u64 v,
|
||||
unsigned int stride,
|
||||
enum client_tiling tiling)
|
||||
enum client_tiling tiling,
|
||||
int x_pos, int y_pos)
|
||||
{
|
||||
unsigned int swizzle;
|
||||
u64 x, y;
|
||||
|
@ -216,7 +367,12 @@ static u64 tiled_offset(const struct intel_gt *gt,
|
|||
|
||||
y = div64_u64_rem(v, stride, &x);
|
||||
|
||||
if (tiling == CLIENT_TILING_X) {
|
||||
if (tiling == CLIENT_TILING_4) {
|
||||
v = linear_x_y_to_ftiled_pos(x_pos, y_pos, stride, 32);
|
||||
|
||||
/* no swizzling for f-tiling */
|
||||
swizzle = I915_BIT_6_SWIZZLE_NONE;
|
||||
} else if (tiling == CLIENT_TILING_X) {
|
||||
v = div64_u64_rem(y, 8, &y) * stride * 8;
|
||||
v += y * 512;
|
||||
v += div64_u64_rem(x, 512, &x) << 12;
|
||||
|
@ -259,6 +415,7 @@ static const char *repr_tiling(enum client_tiling tiling)
|
|||
case CLIENT_TILING_LINEAR: return "linear";
|
||||
case CLIENT_TILING_X: return "X";
|
||||
case CLIENT_TILING_Y: return "Y";
|
||||
case CLIENT_TILING_4: return "F";
|
||||
default: return "unknown";
|
||||
}
|
||||
}
|
||||
|
@ -284,7 +441,7 @@ static int verify_buffer(const struct tiled_blits *t,
|
|||
} else {
|
||||
u64 v = tiled_offset(buf->vma->vm->gt,
|
||||
p * 4, t->width * 4,
|
||||
buf->tiling);
|
||||
buf->tiling, x, y);
|
||||
|
||||
if (vaddr[v / sizeof(*vaddr)] != buf->start_val + p)
|
||||
ret = -EINVAL;
|
||||
|
@ -504,6 +661,9 @@ static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
|
|||
if (err)
|
||||
return err;
|
||||
|
||||
/* Simulating GTT eviction of the same buffer / layout */
|
||||
t->buffers[2].tiling = t->buffers[0].tiling;
|
||||
|
||||
/* Reposition so that we overlap the old addresses, and slightly off */
|
||||
err = tiled_blit(t,
|
||||
&t->buffers[2], t->hole + t->align,
|
||||
|
|
|
@ -212,7 +212,7 @@ static int __live_parallel_switch1(void *data)
|
|||
|
||||
i915_request_add(rq);
|
||||
}
|
||||
if (i915_request_wait(rq, 0, HZ / 5) < 0)
|
||||
if (i915_request_wait(rq, 0, HZ) < 0)
|
||||
err = -ETIME;
|
||||
i915_request_put(rq);
|
||||
if (err)
|
||||
|
|
|
@ -197,8 +197,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
|
|||
|
||||
flags |= PIPE_CONTROL_CS_STALL;
|
||||
|
||||
if (engine->class == COMPUTE_CLASS)
|
||||
flags &= ~PIPE_CONTROL_3D_FLAGS;
|
||||
if (!HAS_3D_PIPELINE(engine->i915))
|
||||
flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
|
||||
else if (engine->class == COMPUTE_CLASS)
|
||||
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
|
||||
|
||||
cs = intel_ring_begin(rq, 6);
|
||||
if (IS_ERR(cs))
|
||||
|
@ -227,8 +229,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
|
|||
|
||||
flags |= PIPE_CONTROL_CS_STALL;
|
||||
|
||||
if (engine->class == COMPUTE_CLASS)
|
||||
flags &= ~PIPE_CONTROL_3D_FLAGS;
|
||||
if (!HAS_3D_PIPELINE(engine->i915))
|
||||
flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
|
||||
else if (engine->class == COMPUTE_CLASS)
|
||||
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
|
||||
|
||||
if (!HAS_FLAT_CCS(rq->engine->i915))
|
||||
count = 8 + 4;
|
||||
|
@ -272,7 +276,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
|
|||
if (!HAS_FLAT_CCS(rq->engine->i915) &&
|
||||
(rq->engine->class == VIDEO_DECODE_CLASS ||
|
||||
rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
|
||||
aux_inv = rq->engine->mask & ~BIT(BCS0);
|
||||
aux_inv = rq->engine->mask &
|
||||
~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
|
||||
if (aux_inv)
|
||||
cmd += 4;
|
||||
}
|
||||
|
@ -716,8 +721,10 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
|
|||
/* Wa_1409600907 */
|
||||
flags |= PIPE_CONTROL_DEPTH_STALL;
|
||||
|
||||
if (rq->engine->class == COMPUTE_CLASS)
|
||||
flags &= ~PIPE_CONTROL_3D_FLAGS;
|
||||
if (!HAS_3D_PIPELINE(rq->engine->i915))
|
||||
flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
|
||||
else if (rq->engine->class == COMPUTE_CLASS)
|
||||
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
|
||||
|
||||
cs = gen12_emit_ggtt_write_rcs(cs,
|
||||
rq->fence.seqno,
|
||||
|
|
|
@ -601,6 +601,30 @@ u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
|
|||
return avg;
|
||||
}
|
||||
|
||||
bool intel_context_ban(struct intel_context *ce, struct i915_request *rq)
|
||||
{
|
||||
bool ret = intel_context_set_banned(ce);
|
||||
|
||||
trace_intel_context_ban(ce);
|
||||
|
||||
if (ce->ops->revoke)
|
||||
ce->ops->revoke(ce, rq,
|
||||
INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
bool intel_context_exit_nonpersistent(struct intel_context *ce,
|
||||
struct i915_request *rq)
|
||||
{
|
||||
bool ret = intel_context_set_exiting(ce);
|
||||
|
||||
if (ce->ops->revoke)
|
||||
ce->ops->revoke(ce, rq, ce->engine->props.preempt_timeout_ms);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
|
||||
#include "selftest_context.c"
|
||||
#endif
|
||||
|
|
|
@ -25,6 +25,8 @@
|
|||
##__VA_ARGS__); \
|
||||
} while (0)
|
||||
|
||||
#define INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS (1)
|
||||
|
||||
struct i915_gem_ww_ctx;
|
||||
|
||||
void intel_context_init(struct intel_context *ce,
|
||||
|
@ -309,18 +311,27 @@ static inline bool intel_context_set_banned(struct intel_context *ce)
|
|||
return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
|
||||
}
|
||||
|
||||
static inline bool intel_context_ban(struct intel_context *ce,
|
||||
struct i915_request *rq)
|
||||
bool intel_context_ban(struct intel_context *ce, struct i915_request *rq);
|
||||
|
||||
static inline bool intel_context_is_schedulable(const struct intel_context *ce)
|
||||
{
|
||||
bool ret = intel_context_set_banned(ce);
|
||||
|
||||
trace_intel_context_ban(ce);
|
||||
if (ce->ops->ban)
|
||||
ce->ops->ban(ce, rq);
|
||||
|
||||
return ret;
|
||||
return !test_bit(CONTEXT_EXITING, &ce->flags) &&
|
||||
!test_bit(CONTEXT_BANNED, &ce->flags);
|
||||
}
|
||||
|
||||
static inline bool intel_context_is_exiting(const struct intel_context *ce)
|
||||
{
|
||||
return test_bit(CONTEXT_EXITING, &ce->flags);
|
||||
}
|
||||
|
||||
static inline bool intel_context_set_exiting(struct intel_context *ce)
|
||||
{
|
||||
return test_and_set_bit(CONTEXT_EXITING, &ce->flags);
|
||||
}
|
||||
|
||||
bool intel_context_exit_nonpersistent(struct intel_context *ce,
|
||||
struct i915_request *rq);
|
||||
|
||||
static inline bool
|
||||
intel_context_force_single_submission(const struct intel_context *ce)
|
||||
{
|
||||
|
|
|
@ -40,7 +40,8 @@ struct intel_context_ops {
|
|||
|
||||
int (*alloc)(struct intel_context *ce);
|
||||
|
||||
void (*ban)(struct intel_context *ce, struct i915_request *rq);
|
||||
void (*revoke)(struct intel_context *ce, struct i915_request *rq,
|
||||
unsigned int preempt_timeout_ms);
|
||||
|
||||
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **vaddr);
|
||||
int (*pin)(struct intel_context *ce, void *vaddr);
|
||||
|
@ -122,6 +123,7 @@ struct intel_context {
|
|||
#define CONTEXT_GUC_INIT 10
|
||||
#define CONTEXT_PERMA_PIN 11
|
||||
#define CONTEXT_IS_PARKING 12
|
||||
#define CONTEXT_EXITING 13
|
||||
|
||||
struct {
|
||||
u64 timeout_us;
|
||||
|
|
|
@ -201,6 +201,8 @@ int intel_ring_submission_setup(struct intel_engine_cs *engine);
|
|||
int intel_engine_stop_cs(struct intel_engine_cs *engine);
|
||||
void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
|
||||
|
||||
void intel_engine_wait_for_pending_mi_fw(struct intel_engine_cs *engine);
|
||||
|
||||
void intel_engine_set_hwsp_writemask(struct intel_engine_cs *engine, u32 mask);
|
||||
|
||||
u64 intel_engine_get_active_head(const struct intel_engine_cs *engine);
|
||||
|
|
|
@ -21,8 +21,9 @@
|
|||
#include "intel_engine_user.h"
|
||||
#include "intel_execlists_submission.h"
|
||||
#include "intel_gt.h"
|
||||
#include "intel_gt_requests.h"
|
||||
#include "intel_gt_mcr.h"
|
||||
#include "intel_gt_pm.h"
|
||||
#include "intel_gt_requests.h"
|
||||
#include "intel_lrc.h"
|
||||
#include "intel_lrc_reg.h"
|
||||
#include "intel_reset.h"
|
||||
|
@ -71,6 +72,62 @@ static const struct engine_info intel_engines[] = {
|
|||
{ .graphics_ver = 6, .base = BLT_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS1] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 1,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS1_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS2] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 2,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS2_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS3] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 3,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS3_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS4] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 4,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS4_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS5] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 5,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS5_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS6] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 6,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS6_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS7] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 7,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS7_RING_BASE }
|
||||
},
|
||||
},
|
||||
[BCS8] = {
|
||||
.class = COPY_ENGINE_CLASS,
|
||||
.instance = 8,
|
||||
.mmio_bases = {
|
||||
{ .graphics_ver = 12, .base = XEHPC_BCS8_RING_BASE }
|
||||
},
|
||||
},
|
||||
[VCS0] = {
|
||||
.class = VIDEO_DECODE_CLASS,
|
||||
.instance = 0,
|
||||
|
@ -334,6 +391,14 @@ static u32 get_reset_domain(u8 ver, enum intel_engine_id id)
|
|||
static const u32 engine_reset_domains[] = {
|
||||
[RCS0] = GEN11_GRDOM_RENDER,
|
||||
[BCS0] = GEN11_GRDOM_BLT,
|
||||
[BCS1] = XEHPC_GRDOM_BLT1,
|
||||
[BCS2] = XEHPC_GRDOM_BLT2,
|
||||
[BCS3] = XEHPC_GRDOM_BLT3,
|
||||
[BCS4] = XEHPC_GRDOM_BLT4,
|
||||
[BCS5] = XEHPC_GRDOM_BLT5,
|
||||
[BCS6] = XEHPC_GRDOM_BLT6,
|
||||
[BCS7] = XEHPC_GRDOM_BLT7,
|
||||
[BCS8] = XEHPC_GRDOM_BLT8,
|
||||
[VCS0] = GEN11_GRDOM_MEDIA,
|
||||
[VCS1] = GEN11_GRDOM_MEDIA2,
|
||||
[VCS2] = GEN11_GRDOM_MEDIA3,
|
||||
|
@ -610,8 +675,8 @@ static void engine_mask_apply_compute_fuses(struct intel_gt *gt)
|
|||
if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
|
||||
return;
|
||||
|
||||
ccs_mask = intel_slicemask_from_dssmask(intel_sseu_get_compute_subslices(&info->sseu),
|
||||
ss_per_ccs);
|
||||
ccs_mask = intel_slicemask_from_xehp_dssmask(info->sseu.compute_subslice_mask,
|
||||
ss_per_ccs);
|
||||
/*
|
||||
* If all DSS in a quadrant are fused off, the corresponding CCS
|
||||
* engine is not available for use.
|
||||
|
@ -622,6 +687,34 @@ static void engine_mask_apply_compute_fuses(struct intel_gt *gt)
|
|||
}
|
||||
}
|
||||
|
||||
static void engine_mask_apply_copy_fuses(struct intel_gt *gt)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
struct intel_gt_info *info = >->info;
|
||||
unsigned long meml3_mask;
|
||||
unsigned long quad;
|
||||
|
||||
meml3_mask = intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3);
|
||||
meml3_mask = REG_FIELD_GET(GEN12_MEML3_EN_MASK, meml3_mask);
|
||||
|
||||
/*
|
||||
* Link Copy engines may be fused off according to meml3_mask. Each
|
||||
* bit is a quad that houses 2 Link Copy and two Sub Copy engines.
|
||||
*/
|
||||
for_each_clear_bit(quad, &meml3_mask, GEN12_MAX_MSLICES) {
|
||||
unsigned int instance = quad * 2 + 1;
|
||||
intel_engine_mask_t mask = GENMASK(_BCS(instance + 1),
|
||||
_BCS(instance));
|
||||
|
||||
if (mask & info->engine_mask) {
|
||||
drm_dbg(&i915->drm, "bcs%u fused off\n", instance);
|
||||
drm_dbg(&i915->drm, "bcs%u fused off\n", instance + 1);
|
||||
|
||||
info->engine_mask &= ~mask;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Determine which engines are fused off in our particular hardware.
|
||||
* Note that we have a catch-22 situation where we need to be able to access
|
||||
|
@ -704,6 +797,7 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
|
|||
GEM_BUG_ON(vebox_mask != VEBOX_MASK(gt));
|
||||
|
||||
engine_mask_apply_compute_fuses(gt);
|
||||
engine_mask_apply_copy_fuses(gt);
|
||||
|
||||
return info->engine_mask;
|
||||
}
|
||||
|
@ -1282,10 +1376,10 @@ static int __intel_engine_stop_cs(struct intel_engine_cs *engine,
|
|||
intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
|
||||
|
||||
/*
|
||||
* Wa_22011802037 : gen12, Prior to doing a reset, ensure CS is
|
||||
* Wa_22011802037 : gen11, gen12, Prior to doing a reset, ensure CS is
|
||||
* stopped, set ring stop bit and prefetch disable bit to halt CS
|
||||
*/
|
||||
if (GRAPHICS_VER(engine->i915) == 12)
|
||||
if (IS_GRAPHICS_VER(engine->i915, 11, 12))
|
||||
intel_uncore_write_fw(uncore, RING_MODE_GEN7(engine->mmio_base),
|
||||
_MASKED_BIT_ENABLE(GEN12_GFX_PREFETCH_DISABLE));
|
||||
|
||||
|
@ -1308,6 +1402,18 @@ int intel_engine_stop_cs(struct intel_engine_cs *engine)
|
|||
return -ENODEV;
|
||||
|
||||
ENGINE_TRACE(engine, "\n");
|
||||
/*
|
||||
* TODO: Find out why occasionally stopping the CS times out. Seen
|
||||
* especially with gem_eio tests.
|
||||
*
|
||||
* Occasionally trying to stop the cs times out, but does not adversely
|
||||
* affect functionality. The timeout is set as a config parameter that
|
||||
* defaults to 100ms. In most cases the follow up operation is to wait
|
||||
* for pending MI_FORCE_WAKES. The assumption is that this timeout is
|
||||
* sufficient for any pending MI_FORCEWAKEs to complete. Once root
|
||||
* caused, the caller must check and handle the return from this
|
||||
* function.
|
||||
*/
|
||||
if (__intel_engine_stop_cs(engine, 1000, stop_timeout(engine))) {
|
||||
ENGINE_TRACE(engine,
|
||||
"timed out on STOP_RING -> IDLE; HEAD:%04x, TAIL:%04x\n",
|
||||
|
@ -1334,12 +1440,76 @@ void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine)
|
|||
ENGINE_WRITE_FW(engine, RING_MI_MODE, _MASKED_BIT_DISABLE(STOP_RING));
|
||||
}
|
||||
|
||||
static u32
|
||||
read_subslice_reg(const struct intel_engine_cs *engine,
|
||||
int slice, int subslice, i915_reg_t reg)
|
||||
static u32 __cs_pending_mi_force_wakes(struct intel_engine_cs *engine)
|
||||
{
|
||||
return intel_uncore_read_with_mcr_steering(engine->uncore, reg,
|
||||
slice, subslice);
|
||||
static const i915_reg_t _reg[I915_NUM_ENGINES] = {
|
||||
[RCS0] = MSG_IDLE_CS,
|
||||
[BCS0] = MSG_IDLE_BCS,
|
||||
[VCS0] = MSG_IDLE_VCS0,
|
||||
[VCS1] = MSG_IDLE_VCS1,
|
||||
[VCS2] = MSG_IDLE_VCS2,
|
||||
[VCS3] = MSG_IDLE_VCS3,
|
||||
[VCS4] = MSG_IDLE_VCS4,
|
||||
[VCS5] = MSG_IDLE_VCS5,
|
||||
[VCS6] = MSG_IDLE_VCS6,
|
||||
[VCS7] = MSG_IDLE_VCS7,
|
||||
[VECS0] = MSG_IDLE_VECS0,
|
||||
[VECS1] = MSG_IDLE_VECS1,
|
||||
[VECS2] = MSG_IDLE_VECS2,
|
||||
[VECS3] = MSG_IDLE_VECS3,
|
||||
[CCS0] = MSG_IDLE_CS,
|
||||
[CCS1] = MSG_IDLE_CS,
|
||||
[CCS2] = MSG_IDLE_CS,
|
||||
[CCS3] = MSG_IDLE_CS,
|
||||
};
|
||||
u32 val;
|
||||
|
||||
if (!_reg[engine->id].reg) {
|
||||
drm_err(&engine->i915->drm,
|
||||
"MSG IDLE undefined for engine id %u\n", engine->id);
|
||||
return 0;
|
||||
}
|
||||
|
||||
val = intel_uncore_read(engine->uncore, _reg[engine->id]);
|
||||
|
||||
/* bits[29:25] & bits[13:9] >> shift */
|
||||
return (val & (val >> 16) & MSG_IDLE_FW_MASK) >> MSG_IDLE_FW_SHIFT;
|
||||
}
|
||||
|
||||
static void __gpm_wait_for_fw_complete(struct intel_gt *gt, u32 fw_mask)
|
||||
{
|
||||
int ret;
|
||||
|
||||
/* Ensure GPM receives fw up/down after CS is stopped */
|
||||
udelay(1);
|
||||
|
||||
/* Wait for forcewake request to complete in GPM */
|
||||
ret = __intel_wait_for_register_fw(gt->uncore,
|
||||
GEN9_PWRGT_DOMAIN_STATUS,
|
||||
fw_mask, fw_mask, 5000, 0, NULL);
|
||||
|
||||
/* Ensure CS receives fw ack from GPM */
|
||||
udelay(1);
|
||||
|
||||
if (ret)
|
||||
GT_TRACE(gt, "Failed to complete pending forcewake %d\n", ret);
|
||||
}
|
||||
|
||||
/*
|
||||
* Wa_22011802037:gen12: In addition to stopping the cs, we need to wait for any
|
||||
* pending MI_FORCE_WAKEUP requests that the CS has initiated to complete. The
|
||||
* pending status is indicated by bits[13:9] (masked by bits[29:25]) in the
|
||||
* MSG_IDLE register. There's one MSG_IDLE register per reset domain. Since we
|
||||
* are concerned only with the gt reset here, we use a logical OR of pending
|
||||
* forcewakeups from all reset domains and then wait for them to complete by
|
||||
* querying PWRGT_DOMAIN_STATUS.
|
||||
*/
|
||||
void intel_engine_wait_for_pending_mi_fw(struct intel_engine_cs *engine)
|
||||
{
|
||||
u32 fw_pending = __cs_pending_mi_force_wakes(engine);
|
||||
|
||||
if (fw_pending)
|
||||
__gpm_wait_for_fw_complete(engine->gt, fw_pending);
|
||||
}
|
||||
|
||||
/* NB: please notice the memset */
|
||||
|
@ -1375,28 +1545,33 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
|
|||
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
|
||||
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
|
||||
instdone->sampler[slice][subslice] =
|
||||
read_subslice_reg(engine, slice, subslice,
|
||||
GEN7_SAMPLER_INSTDONE);
|
||||
intel_gt_mcr_read(engine->gt,
|
||||
GEN7_SAMPLER_INSTDONE,
|
||||
slice, subslice);
|
||||
instdone->row[slice][subslice] =
|
||||
read_subslice_reg(engine, slice, subslice,
|
||||
GEN7_ROW_INSTDONE);
|
||||
intel_gt_mcr_read(engine->gt,
|
||||
GEN7_ROW_INSTDONE,
|
||||
slice, subslice);
|
||||
}
|
||||
} else {
|
||||
for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
|
||||
instdone->sampler[slice][subslice] =
|
||||
read_subslice_reg(engine, slice, subslice,
|
||||
GEN7_SAMPLER_INSTDONE);
|
||||
intel_gt_mcr_read(engine->gt,
|
||||
GEN7_SAMPLER_INSTDONE,
|
||||
slice, subslice);
|
||||
instdone->row[slice][subslice] =
|
||||
read_subslice_reg(engine, slice, subslice,
|
||||
GEN7_ROW_INSTDONE);
|
||||
intel_gt_mcr_read(engine->gt,
|
||||
GEN7_ROW_INSTDONE,
|
||||
slice, subslice);
|
||||
}
|
||||
}
|
||||
|
||||
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
|
||||
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
|
||||
instdone->geom_svg[slice][subslice] =
|
||||
read_subslice_reg(engine, slice, subslice,
|
||||
XEHPG_INSTDONE_GEOM_SVG);
|
||||
intel_gt_mcr_read(engine->gt,
|
||||
XEHPG_INSTDONE_GEOM_SVG,
|
||||
slice, subslice);
|
||||
}
|
||||
} else if (GRAPHICS_VER(i915) >= 7) {
|
||||
instdone->instdone =
|
||||
|
|
|
@ -8,6 +8,7 @@
|
|||
|
||||
#include "i915_reg_defs.h"
|
||||
|
||||
#define RING_EXCC(base) _MMIO((base) + 0x28)
|
||||
#define RING_TAIL(base) _MMIO((base) + 0x30)
|
||||
#define TAIL_ADDR 0x001FFFF8
|
||||
#define RING_HEAD(base) _MMIO((base) + 0x34)
|
||||
|
@ -133,6 +134,8 @@
|
|||
(REG_FIELD_PREP(BLIT_CCTL_DST_MOCS_MASK, (dst) << 1) | \
|
||||
REG_FIELD_PREP(BLIT_CCTL_SRC_MOCS_MASK, (src) << 1))
|
||||
|
||||
#define RING_CSCMDOP(base) _MMIO((base) + 0x20c)
|
||||
|
||||
/*
|
||||
* CMD_CCTL read/write fields take a MOCS value and _not_ a table index.
|
||||
* The lsb of each can be considered a separate enabling bit for encryption.
|
||||
|
@ -149,6 +152,7 @@
|
|||
REG_FIELD_PREP(CMD_CCTL_READ_OVERRIDE_MASK, (read) << 1))
|
||||
|
||||
#define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8) /* gen12+ */
|
||||
|
||||
#define MI_PREDICATE_RESULT_2(base) _MMIO((base) + 0x3bc)
|
||||
#define LOWER_SLICE_ENABLED (1 << 0)
|
||||
#define LOWER_SLICE_DISABLED (0 << 0)
|
||||
|
@ -172,6 +176,7 @@
|
|||
#define CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT REG_BIT(2)
|
||||
#define CTX_CTRL_INHIBIT_SYN_CTX_SWITCH REG_BIT(3)
|
||||
#define GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE REG_BIT(8)
|
||||
#define RING_CTX_SR_CTL(base) _MMIO((base) + 0x244)
|
||||
#define RING_SEMA_WAIT_POLL(base) _MMIO((base) + 0x24c)
|
||||
#define GEN8_RING_PDP_UDW(base, n) _MMIO((base) + 0x270 + (n) * 8 + 4)
|
||||
#define GEN8_RING_PDP_LDW(base, n) _MMIO((base) + 0x270 + (n) * 8)
|
||||
|
@ -196,6 +201,7 @@
|
|||
#define RING_CTX_TIMESTAMP(base) _MMIO((base) + 0x3a8) /* gen8+ */
|
||||
#define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8)
|
||||
#define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base) + 0x4D0) + (i) * 4)
|
||||
#define RING_FORCE_TO_NONPRIV_DENY REG_BIT(30)
|
||||
#define RING_FORCE_TO_NONPRIV_ADDRESS_MASK REG_GENMASK(25, 2)
|
||||
#define RING_FORCE_TO_NONPRIV_ACCESS_RW (0 << 28) /* CFL+ & Gen11+ */
|
||||
#define RING_FORCE_TO_NONPRIV_ACCESS_RD (1 << 28)
|
||||
|
@ -208,7 +214,9 @@
|
|||
#define RING_FORCE_TO_NONPRIV_RANGE_64 (3 << 0)
|
||||
#define RING_FORCE_TO_NONPRIV_RANGE_MASK (3 << 0)
|
||||
#define RING_FORCE_TO_NONPRIV_MASK_VALID \
|
||||
(RING_FORCE_TO_NONPRIV_RANGE_MASK | RING_FORCE_TO_NONPRIV_ACCESS_MASK)
|
||||
(RING_FORCE_TO_NONPRIV_RANGE_MASK | \
|
||||
RING_FORCE_TO_NONPRIV_ACCESS_MASK | \
|
||||
RING_FORCE_TO_NONPRIV_DENY)
|
||||
#define RING_MAX_NONPRIV_SLOTS 12
|
||||
|
||||
#define RING_EXECLIST_SQ_CONTENTS(base) _MMIO((base) + 0x510)
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
#define OTHER_CLASS 4
|
||||
#define COMPUTE_CLASS 5
|
||||
#define MAX_ENGINE_CLASS 5
|
||||
#define MAX_ENGINE_INSTANCE 7
|
||||
#define MAX_ENGINE_INSTANCE 8
|
||||
|
||||
#define I915_MAX_SLICES 3
|
||||
#define I915_MAX_SUBSLICES 8
|
||||
|
@ -99,6 +99,7 @@ struct i915_ctx_workarounds {
|
|||
#define I915_MAX_SFC (I915_MAX_VCS / 2)
|
||||
#define I915_MAX_CCS 4
|
||||
#define I915_MAX_RCS 1
|
||||
#define I915_MAX_BCS 9
|
||||
|
||||
/*
|
||||
* Engine IDs definitions.
|
||||
|
@ -107,6 +108,15 @@ struct i915_ctx_workarounds {
|
|||
enum intel_engine_id {
|
||||
RCS0 = 0,
|
||||
BCS0,
|
||||
BCS1,
|
||||
BCS2,
|
||||
BCS3,
|
||||
BCS4,
|
||||
BCS5,
|
||||
BCS6,
|
||||
BCS7,
|
||||
BCS8,
|
||||
#define _BCS(n) (BCS0 + (n))
|
||||
VCS0,
|
||||
VCS1,
|
||||
VCS2,
|
||||
|
|
|
@ -480,9 +480,9 @@ __execlists_schedule_in(struct i915_request *rq)
|
|||
|
||||
if (unlikely(intel_context_is_closed(ce) &&
|
||||
!intel_engine_has_heartbeat(engine)))
|
||||
intel_context_set_banned(ce);
|
||||
intel_context_set_exiting(ce);
|
||||
|
||||
if (unlikely(intel_context_is_banned(ce) || bad_request(rq)))
|
||||
if (unlikely(!intel_context_is_schedulable(ce) || bad_request(rq)))
|
||||
reset_active(rq, engine);
|
||||
|
||||
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
|
||||
|
@ -661,6 +661,16 @@ static inline void execlists_schedule_out(struct i915_request *rq)
|
|||
i915_request_put(rq);
|
||||
}
|
||||
|
||||
static u32 map_i915_prio_to_lrc_desc_prio(int prio)
|
||||
{
|
||||
if (prio > I915_PRIORITY_NORMAL)
|
||||
return GEN12_CTX_PRIORITY_HIGH;
|
||||
else if (prio < I915_PRIORITY_NORMAL)
|
||||
return GEN12_CTX_PRIORITY_LOW;
|
||||
else
|
||||
return GEN12_CTX_PRIORITY_NORMAL;
|
||||
}
|
||||
|
||||
static u64 execlists_update_context(struct i915_request *rq)
|
||||
{
|
||||
struct intel_context *ce = rq->context;
|
||||
|
@ -669,7 +679,7 @@ static u64 execlists_update_context(struct i915_request *rq)
|
|||
|
||||
desc = ce->lrc.desc;
|
||||
if (rq->engine->flags & I915_ENGINE_HAS_EU_PRIORITY)
|
||||
desc |= lrc_desc_priority(rq_prio(rq));
|
||||
desc |= map_i915_prio_to_lrc_desc_prio(rq_prio(rq));
|
||||
|
||||
/*
|
||||
* WaIdleLiteRestore:bdw,skl
|
||||
|
@ -1233,7 +1243,7 @@ static unsigned long active_preempt_timeout(struct intel_engine_cs *engine,
|
|||
|
||||
/* Force a fast reset for terminated contexts (ignoring sysfs!) */
|
||||
if (unlikely(intel_context_is_banned(rq->context) || bad_request(rq)))
|
||||
return 1;
|
||||
return INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS;
|
||||
|
||||
return READ_ONCE(engine->props.preempt_timeout_ms);
|
||||
}
|
||||
|
@ -2958,6 +2968,13 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
|
|||
ring_set_paused(engine, 1);
|
||||
intel_engine_stop_cs(engine);
|
||||
|
||||
/*
|
||||
* Wa_22011802037:gen11/gen12: In addition to stopping the cs, we need
|
||||
* to wait for any pending mi force wakeups
|
||||
*/
|
||||
if (IS_GRAPHICS_VER(engine->i915, 11, 12))
|
||||
intel_engine_wait_for_pending_mi_fw(engine);
|
||||
|
||||
engine->execlists.reset_ccid = active_ccid(engine);
|
||||
}
|
||||
|
||||
|
|
|
@@ -3,16 +3,18 @@
 * Copyright © 2020 Intel Corporation
 */

#include <linux/types.h>
#include <asm/set_memory.h>
#include <asm/smp.h>
#include <linux/types.h>
#include <linux/stop_machine.h>

#include <drm/i915_drm.h>
#include <drm/intel-gtt.h>

#include "gem/i915_gem_lmem.h"

#include "intel_ggtt_gmch.h"
#include "intel_gt.h"
#include "intel_gt_gmch.h"
#include "intel_gt_regs.h"
#include "i915_drv.h"
#include "i915_scatterlist.h"

@@ -22,6 +24,13 @@
#include "intel_gtt.h"
#include "gen8_ppgtt.h"

static inline bool suspend_retains_ptes(struct i915_address_space *vm)
{
	return GRAPHICS_VER(vm->i915) >= 8 &&
		!HAS_LMEM(vm->i915) &&
		vm->is_ggtt;
}

static void i915_ggtt_color_adjust(const struct drm_mm_node *node,
				   unsigned long color,
				   u64 *start,
@@ -93,6 +102,23 @@ int i915_ggtt_init_hw(struct drm_i915_private *i915)
	return 0;
}

/*
 * Return the value of the last GGTT pte cast to a u64, if
 * the system is supposed to retain ptes across resume. 0 otherwise.
 */
static u64 read_last_pte(struct i915_address_space *vm)
{
	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
	gen8_pte_t __iomem *ptep;

	if (!suspend_retains_ptes(vm))
		return 0;

	GEM_BUG_ON(GRAPHICS_VER(vm->i915) < 8);
	ptep = (typeof(ptep))ggtt->gsm + (ggtt_total_entries(ggtt) - 1);
	return readq(ptep);
}
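read_last_pte() records the final GGTT entry so that resume can later verify whether the page tables really survived suspend. A self-contained sketch of that record-then-verify idea, modelling the table with plain memory (names and values here are illustrative only):

	#include <stdint.h>
	#include <stdio.h>
	#include <stdbool.h>

	#define NUM_PTES 8

	/* Simulated GGTT page-table backing store. */
	static uint64_t ptes[NUM_PTES];

	/* "Suspend": remember what the last entry looked like. */
	static uint64_t probe_last_pte(void)
	{
		return ptes[NUM_PTES - 1];
	}

	/* "Resume": the entries were retained only if the probe still matches. */
	static bool ptes_retained(uint64_t probed)
	{
		return ptes[NUM_PTES - 1] == probed;
	}

	int main(void)
	{
		uint64_t probed;

		ptes[NUM_PTES - 1] = 0xdeadbeefULL;
		probed = probe_last_pte();
		printf("retained: %s\n", ptes_retained(probed) ? "yes" : "no");

		ptes[NUM_PTES - 1] = 0;	/* e.g. firmware scrubbed the table */
		printf("retained: %s\n", ptes_retained(probed) ? "yes" : "no");
		return 0;
	}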

/**
 * i915_ggtt_suspend_vm - Suspend the memory mappings for a GGTT or DPT VM
 * @vm: The VM to suspend the mappings for

@@ -156,7 +182,10 @@ retry:
		i915_gem_object_unlock(obj);
	}

	vm->clear_range(vm, 0, vm->total);
	if (!suspend_retains_ptes(vm))
		vm->clear_range(vm, 0, vm->total);
	else
		i915_vm_to_ggtt(vm)->probed_pte = read_last_pte(vm);

	vm->skip_pte_rewrite = save_skip_rewrite;

@@ -181,7 +210,7 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
	spin_unlock_irq(&uncore->lock);
}

void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
{
	struct intel_uncore *uncore = ggtt->vm.gt->uncore;
@ -218,11 +247,232 @@ u64 gen8_ggtt_pte_encode(dma_addr_t addr,
|
|||
return pte;
|
||||
}
|
||||
|
||||
static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
|
||||
{
|
||||
writeq(pte, addr);
|
||||
}
|
||||
|
||||
static void gen8_ggtt_insert_page(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen8_pte_t __iomem *pte =
|
||||
(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
|
||||
|
||||
gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
|
||||
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen8_pte_t __iomem *gte;
|
||||
gen8_pte_t __iomem *end;
|
||||
struct sgt_iter iter;
|
||||
dma_addr_t addr;
|
||||
|
||||
/*
|
||||
* Note that we ignore PTE_READ_ONLY here. The caller must be careful
|
||||
* not to allow the user to override access to a read only page.
|
||||
*/
|
||||
|
||||
gte = (gen8_pte_t __iomem *)ggtt->gsm;
|
||||
gte += vma_res->start / I915_GTT_PAGE_SIZE;
|
||||
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
|
||||
|
||||
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
|
||||
gen8_set_pte(gte++, pte_encode | addr);
|
||||
GEM_BUG_ON(gte > end);
|
||||
|
||||
/* Fill the allocated but "unused" space beyond the end of the buffer */
|
||||
while (gte < end)
|
||||
gen8_set_pte(gte++, vm->scratch[0]->encode);
|
||||
|
||||
/*
|
||||
* We want to flush the TLBs only after we're certain all the PTE
|
||||
* updates have finished.
|
||||
*/
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void gen6_ggtt_insert_page(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen6_pte_t __iomem *pte =
|
||||
(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
|
||||
|
||||
iowrite32(vm->pte_encode(addr, level, flags), pte);
|
||||
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
/*
|
||||
* Binds an object into the global gtt with the specified cache level.
|
||||
* The object will be accessible to the GPU via commands whose operands
|
||||
* reference offsets within the global GTT as well as accessible by the GPU
|
||||
* through the GMADR mapped BAR (i915->mm.gtt->gtt).
|
||||
*/
|
||||
static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen6_pte_t __iomem *gte;
|
||||
gen6_pte_t __iomem *end;
|
||||
struct sgt_iter iter;
|
||||
dma_addr_t addr;
|
||||
|
||||
gte = (gen6_pte_t __iomem *)ggtt->gsm;
|
||||
gte += vma_res->start / I915_GTT_PAGE_SIZE;
|
||||
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
|
||||
|
||||
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
|
||||
iowrite32(vm->pte_encode(addr, level, flags), gte++);
|
||||
GEM_BUG_ON(gte > end);
|
||||
|
||||
/* Fill the allocated but "unused" space beyond the end of the buffer */
|
||||
while (gte < end)
|
||||
iowrite32(vm->scratch[0]->encode, gte++);
|
||||
|
||||
/*
|
||||
* We want to flush the TLBs only after we're certain all the PTE
|
||||
* updates have finished.
|
||||
*/
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void nop_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
}
|
||||
|
||||
static void gen8_ggtt_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
|
||||
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
|
||||
const gen8_pte_t scratch_pte = vm->scratch[0]->encode;
|
||||
gen8_pte_t __iomem *gtt_base =
|
||||
(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
|
||||
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
|
||||
int i;
|
||||
|
||||
if (WARN(num_entries > max_entries,
|
||||
"First entry = %d; Num entries = %d (max=%d)\n",
|
||||
first_entry, num_entries, max_entries))
|
||||
num_entries = max_entries;
|
||||
|
||||
for (i = 0; i < num_entries; i++)
|
||||
		gen8_set_pte(&gtt_base[i], scratch_pte);
|
||||
}
|
||||
|
||||
static void bxt_vtd_ggtt_wa(struct i915_address_space *vm)
|
||||
{
|
||||
/*
|
||||
* Make sure the internal GAM fifo has been cleared of all GTT
|
||||
* writes before exiting stop_machine(). This guarantees that
|
||||
* any aperture accesses waiting to start in another process
|
||||
* cannot back up behind the GTT writes causing a hang.
|
||||
* The register can be any arbitrary GAM register.
|
||||
*/
|
||||
intel_uncore_posting_read_fw(vm->gt->uncore, GFX_FLSH_CNTL_GEN6);
|
||||
}
|
||||
|
||||
struct insert_page {
|
||||
struct i915_address_space *vm;
|
||||
dma_addr_t addr;
|
||||
u64 offset;
|
||||
enum i915_cache_level level;
|
||||
};
|
||||
|
||||
static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
|
||||
{
|
||||
struct insert_page *arg = _arg;
|
||||
|
||||
gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
|
||||
bxt_vtd_ggtt_wa(arg->vm);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level level,
|
||||
u32 unused)
|
||||
{
|
||||
struct insert_page arg = { vm, addr, offset, level };
|
||||
|
||||
stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
|
||||
}
|
||||
|
||||
struct insert_entries {
|
||||
struct i915_address_space *vm;
|
||||
struct i915_vma_resource *vma_res;
|
||||
enum i915_cache_level level;
|
||||
u32 flags;
|
||||
};
|
||||
|
||||
static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
|
||||
{
|
||||
struct insert_entries *arg = _arg;
|
||||
|
||||
gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
|
||||
bxt_vtd_ggtt_wa(arg->vm);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct insert_entries arg = { vm, vma_res, level, flags };
|
||||
|
||||
stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
|
||||
}
|
||||
|
||||
static void gen6_ggtt_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
|
||||
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
|
||||
gen6_pte_t scratch_pte, __iomem *gtt_base =
|
||||
(gen6_pte_t __iomem *)ggtt->gsm + first_entry;
|
||||
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
|
||||
int i;
|
||||
|
||||
if (WARN(num_entries > max_entries,
|
||||
"First entry = %d; Num entries = %d (max=%d)\n",
|
||||
first_entry, num_entries, max_entries))
|
||||
num_entries = max_entries;
|
||||
|
||||
scratch_pte = vm->scratch[0]->encode;
|
||||
for (i = 0; i < num_entries; i++)
|
||||
		iowrite32(scratch_pte, &gtt_base[i]);
|
||||
}
|
||||
|
||||
void intel_ggtt_bind_vma(struct i915_address_space *vm,
			 struct i915_vm_pt_stash *stash,
			 struct i915_vma_resource *vma_res,
			 enum i915_cache_level cache_level,
			 u32 flags)
{
|
||||
u32 pte_flags;
|
||||
|
||||
|
@ -243,7 +493,7 @@ void intel_ggtt_bind_vma(struct i915_address_space *vm,
|
|||
}
|
||||
|
||||
void intel_ggtt_unbind_vma(struct i915_address_space *vm,
			   struct i915_vma_resource *vma_res)
{
	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
}
|
||||
|
@ -299,6 +549,8 @@ static int init_ggtt(struct i915_ggtt *ggtt)
|
|||
struct drm_mm_node *entry;
|
||||
int ret;
|
||||
|
||||
ggtt->pte_lost = true;
|
||||
|
||||
/*
|
||||
* GuC requires all resources that we're sharing with it to be placed in
|
||||
* non-WOPCM memory. If GuC is not present or not in use we still need a
|
||||
|
@ -560,12 +812,326 @@ void i915_ggtt_driver_late_release(struct drm_i915_private *i915)
|
|||
dma_resv_fini(&ggtt->vm._resv);
|
||||
}
|
||||
|
||||
struct resource intel_pci_resource(struct pci_dev *pdev, int bar)
|
||||
static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
|
||||
{
|
||||
snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
|
||||
snb_gmch_ctl &= SNB_GMCH_GGMS_MASK;
|
||||
return snb_gmch_ctl << 20;
|
||||
}
|
||||
|
||||
static unsigned int gen8_get_total_gtt_size(u16 bdw_gmch_ctl)
|
||||
{
|
||||
bdw_gmch_ctl >>= BDW_GMCH_GGMS_SHIFT;
|
||||
bdw_gmch_ctl &= BDW_GMCH_GGMS_MASK;
|
||||
if (bdw_gmch_ctl)
|
||||
bdw_gmch_ctl = 1 << bdw_gmch_ctl;
|
||||
|
||||
#ifdef CONFIG_X86_32
|
||||
/* Limit 32b platforms to a 2GB GGTT: 4 << 20 / pte size * I915_GTT_PAGE_SIZE */
|
||||
if (bdw_gmch_ctl > 4)
|
||||
bdw_gmch_ctl = 4;
|
||||
#endif
|
||||
|
||||
return bdw_gmch_ctl << 20;
|
||||
}
|
||||
|
||||
static unsigned int chv_get_total_gtt_size(u16 gmch_ctrl)
|
||||
{
|
||||
gmch_ctrl >>= SNB_GMCH_GGMS_SHIFT;
|
||||
gmch_ctrl &= SNB_GMCH_GGMS_MASK;
|
||||
|
||||
if (gmch_ctrl)
|
||||
return 1 << (20 + gmch_ctrl);
|
||||
|
||||
return 0;
|
||||
}
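All three helpers decode the GGMS field of the GMCH control word into a GGTT size in bytes; only the encoding differs per generation. A stand-alone illustration of the gen6 and gen8 decodes, with the shift/mask values repeated purely for the demonstration and a made-up register sample:

	#include <stdio.h>

	/* Field positions as used by the probe code; copied here for illustration. */
	#define SNB_GMCH_GGMS_SHIFT 8
	#define SNB_GMCH_GGMS_MASK  0x3
	#define BDW_GMCH_GGMS_SHIFT 6
	#define BDW_GMCH_GGMS_MASK  0x3

	/* gen6: GGMS encodes the GTT size directly, in MiB units. */
	static unsigned int gen6_size(unsigned short ctl)
	{
		ctl >>= SNB_GMCH_GGMS_SHIFT;
		ctl &= SNB_GMCH_GGMS_MASK;
		return ctl << 20;
	}

	/* gen8: GGMS is a power-of-two exponent, still in MiB units. */
	static unsigned int gen8_size(unsigned short ctl)
	{
		ctl >>= BDW_GMCH_GGMS_SHIFT;
		ctl &= BDW_GMCH_GGMS_MASK;
		if (ctl)
			ctl = 1 << ctl;
		return ctl << 20;
	}

	int main(void)
	{
		unsigned short sample = 0x2;	/* made-up GGMS field value */

		printf("gen6: %u MiB\n", gen6_size(sample << SNB_GMCH_GGMS_SHIFT) >> 20);
		printf("gen8: %u MiB\n", gen8_size(sample << BDW_GMCH_GGMS_SHIFT) >> 20);
		return 0;
	}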
|
||||
|
||||
static unsigned int gen6_gttmmadr_size(struct drm_i915_private *i915)
|
||||
{
|
||||
/*
|
||||
* GEN6: GTTMMADR size is 4MB and GTTADR starts at 2MB offset
|
||||
* GEN8: GTTMMADR size is 16MB and GTTADR starts at 8MB offset
|
||||
*/
|
||||
GEM_BUG_ON(GRAPHICS_VER(i915) < 6);
|
||||
return (GRAPHICS_VER(i915) < 8) ? SZ_4M : SZ_16M;
|
||||
}
|
||||
|
||||
static unsigned int gen6_gttadr_offset(struct drm_i915_private *i915)
|
||||
{
|
||||
return gen6_gttmmadr_size(i915) / 2;
|
||||
}
|
||||
|
||||
static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
|
||||
phys_addr_t phys_addr;
|
||||
u32 pte_flags;
|
||||
int ret;
|
||||
|
||||
GEM_WARN_ON(pci_resource_len(pdev, 0) != gen6_gttmmadr_size(i915));
|
||||
phys_addr = pci_resource_start(pdev, 0) + gen6_gttadr_offset(i915);
|
||||
|
||||
/*
|
||||
* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
|
||||
* will be dropped. For WC mappings in general we have 64 byte burst
|
||||
* writes when the WC buffer is flushed, so we can't use it, but have to
|
||||
* resort to an uncached mapping. The WC issue is easily caught by the
|
||||
* readback check when writing GTT PTE entries.
|
||||
*/
|
||||
if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
|
||||
ggtt->gsm = ioremap(phys_addr, size);
|
||||
else
|
||||
ggtt->gsm = ioremap_wc(phys_addr, size);
|
||||
if (!ggtt->gsm) {
|
||||
drm_err(&i915->drm, "Failed to map the ggtt page table\n");
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
kref_init(&ggtt->vm.resv_ref);
|
||||
ret = setup_scratch_page(&ggtt->vm);
|
||||
if (ret) {
|
||||
drm_err(&i915->drm, "Scratch setup failed\n");
|
||||
/* iounmap will also get called at remove, but meh */
|
||||
iounmap(ggtt->gsm);
|
||||
return ret;
|
||||
}
|
||||
|
||||
pte_flags = 0;
|
||||
if (i915_gem_object_is_lmem(ggtt->vm.scratch[0]))
|
||||
pte_flags |= PTE_LM;
|
||||
|
||||
ggtt->vm.scratch[0]->encode =
|
||||
ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
|
||||
I915_CACHE_NONE, pte_flags);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void gen6_gmch_remove(struct i915_address_space *vm)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
|
||||
iounmap(ggtt->gsm);
|
||||
free_scratch(vm);
|
||||
}
|
||||
|
||||
static struct resource pci_resource(struct pci_dev *pdev, int bar)
|
||||
{
|
||||
return (struct resource)DEFINE_RES_MEM(pci_resource_start(pdev, bar),
|
||||
pci_resource_len(pdev, bar));
|
||||
}
|
||||
|
||||
static int gen8_gmch_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
|
||||
unsigned int size;
|
||||
u16 snb_gmch_ctl;
|
||||
|
||||
if (!HAS_LMEM(i915)) {
|
||||
ggtt->gmadr = pci_resource(pdev, 2);
|
||||
ggtt->mappable_end = resource_size(&ggtt->gmadr);
|
||||
}
|
||||
|
||||
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
|
||||
if (IS_CHERRYVIEW(i915))
|
||||
size = chv_get_total_gtt_size(snb_gmch_ctl);
|
||||
else
|
||||
size = gen8_get_total_gtt_size(snb_gmch_ctl);
|
||||
|
||||
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
|
||||
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
|
||||
ggtt->vm.lmem_pt_obj_flags = I915_BO_ALLOC_PM_EARLY;
|
||||
|
||||
ggtt->vm.total = (size / sizeof(gen8_pte_t)) * I915_GTT_PAGE_SIZE;
|
||||
ggtt->vm.cleanup = gen6_gmch_remove;
|
||||
ggtt->vm.insert_page = gen8_ggtt_insert_page;
|
||||
ggtt->vm.clear_range = nop_clear_range;
|
||||
if (intel_scanout_needs_vtd_wa(i915))
|
||||
ggtt->vm.clear_range = gen8_ggtt_clear_range;
|
||||
|
||||
ggtt->vm.insert_entries = gen8_ggtt_insert_entries;
|
||||
|
||||
/*
|
||||
* Serialize GTT updates with aperture access on BXT if VT-d is on,
|
||||
* and always on CHV.
|
||||
*/
|
||||
if (intel_vm_no_concurrent_access_wa(i915)) {
|
||||
ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL;
|
||||
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
|
||||
|
||||
/*
|
||||
* Calling stop_machine() version of GGTT update function
|
||||
* at error capture/reset path will raise lockdep warning.
|
||||
* Allow calling gen8_ggtt_insert_* directly at reset path
|
||||
* which is safe from parallel GGTT updates.
|
||||
*/
|
||||
ggtt->vm.raw_insert_page = gen8_ggtt_insert_page;
|
||||
ggtt->vm.raw_insert_entries = gen8_ggtt_insert_entries;
|
||||
|
||||
ggtt->vm.bind_async_flags =
|
||||
I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
|
||||
}
|
||||
|
||||
ggtt->invalidate = gen8_ggtt_invalidate;
|
||||
|
||||
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
|
||||
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
|
||||
|
||||
ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
|
||||
|
||||
setup_private_pat(ggtt->vm.gt->uncore);
|
||||
|
||||
return ggtt_probe_common(ggtt, size);
|
||||
}
|
||||
|
||||
static u64 snb_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
switch (level) {
|
||||
case I915_CACHE_L3_LLC:
|
||||
case I915_CACHE_LLC:
|
||||
pte |= GEN6_PTE_CACHE_LLC;
|
||||
break;
|
||||
case I915_CACHE_NONE:
|
||||
pte |= GEN6_PTE_UNCACHED;
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(level);
|
||||
}
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 ivb_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
switch (level) {
|
||||
case I915_CACHE_L3_LLC:
|
||||
pte |= GEN7_PTE_CACHE_L3_LLC;
|
||||
break;
|
||||
case I915_CACHE_LLC:
|
||||
pte |= GEN6_PTE_CACHE_LLC;
|
||||
break;
|
||||
case I915_CACHE_NONE:
|
||||
pte |= GEN6_PTE_UNCACHED;
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(level);
|
||||
}
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 byt_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
if (!(flags & PTE_READ_ONLY))
|
||||
pte |= BYT_PTE_WRITEABLE;
|
||||
|
||||
if (level != I915_CACHE_NONE)
|
||||
pte |= BYT_PTE_SNOOPED_BY_CPU_CACHES;
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 hsw_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
if (level != I915_CACHE_NONE)
|
||||
pte |= HSW_WB_LLC_AGE3;
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 iris_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
switch (level) {
|
||||
case I915_CACHE_NONE:
|
||||
break;
|
||||
case I915_CACHE_WT:
|
||||
pte |= HSW_WT_ELLC_LLC_AGE3;
|
||||
break;
|
||||
default:
|
||||
pte |= HSW_WB_ELLC_LLC_AGE3;
|
||||
break;
|
||||
}
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static int gen6_gmch_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
|
||||
unsigned int size;
|
||||
u16 snb_gmch_ctl;
|
||||
|
||||
ggtt->gmadr = pci_resource(pdev, 2);
|
||||
ggtt->mappable_end = resource_size(&ggtt->gmadr);
|
||||
|
||||
/*
|
||||
* 64/512MB is the current min/max we actually know of, but this is
|
||||
* just a coarse sanity check.
|
||||
*/
|
||||
if (ggtt->mappable_end < (64 << 20) ||
|
||||
ggtt->mappable_end > (512 << 20)) {
|
||||
drm_err(&i915->drm, "Unknown GMADR size (%pa)\n",
|
||||
&ggtt->mappable_end);
|
||||
return -ENXIO;
|
||||
}
|
||||
|
||||
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
|
||||
|
||||
size = gen6_get_total_gtt_size(snb_gmch_ctl);
|
||||
ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE;
|
||||
|
||||
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
|
||||
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
|
||||
|
||||
ggtt->vm.clear_range = nop_clear_range;
|
||||
if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
|
||||
ggtt->vm.clear_range = gen6_ggtt_clear_range;
|
||||
ggtt->vm.insert_page = gen6_ggtt_insert_page;
|
||||
ggtt->vm.insert_entries = gen6_ggtt_insert_entries;
|
||||
ggtt->vm.cleanup = gen6_gmch_remove;
|
||||
|
||||
ggtt->invalidate = gen6_ggtt_invalidate;
|
||||
|
||||
if (HAS_EDRAM(i915))
|
||||
ggtt->vm.pte_encode = iris_pte_encode;
|
||||
else if (IS_HASWELL(i915))
|
||||
ggtt->vm.pte_encode = hsw_pte_encode;
|
||||
else if (IS_VALLEYVIEW(i915))
|
||||
ggtt->vm.pte_encode = byt_pte_encode;
|
||||
else if (GRAPHICS_VER(i915) >= 7)
|
||||
ggtt->vm.pte_encode = ivb_pte_encode;
|
||||
else
|
||||
ggtt->vm.pte_encode = snb_pte_encode;
|
||||
|
||||
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
|
||||
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
|
||||
|
||||
return ggtt_probe_common(ggtt, size);
|
||||
}
|
||||
|
||||
static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
|
@ -576,12 +1142,13 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt)
|
|||
ggtt->vm.dma = i915->drm.dev;
|
||||
dma_resv_init(&ggtt->vm._resv);
|
||||
|
||||
if (GRAPHICS_VER(i915) <= 5)
|
||||
ret = intel_gt_gmch_gen5_probe(ggtt);
|
||||
else if (GRAPHICS_VER(i915) < 8)
|
||||
ret = intel_gt_gmch_gen6_probe(ggtt);
|
||||
if (GRAPHICS_VER(i915) >= 8)
|
||||
ret = gen8_gmch_probe(ggtt);
|
||||
else if (GRAPHICS_VER(i915) >= 6)
|
||||
ret = gen6_gmch_probe(ggtt);
|
||||
else
|
||||
ret = intel_gt_gmch_gen8_probe(ggtt);
|
||||
ret = intel_ggtt_gmch_probe(ggtt);
|
||||
|
||||
if (ret) {
|
||||
dma_resv_fini(&ggtt->vm._resv);
|
||||
return ret;
|
||||
|
@ -635,7 +1202,10 @@ int i915_ggtt_probe_hw(struct drm_i915_private *i915)
|
|||
|
||||
int i915_ggtt_enable_hw(struct drm_i915_private *i915)
|
||||
{
|
||||
return intel_gt_gmch_gen5_enable_hw(i915);
|
||||
if (GRAPHICS_VER(i915) < 6)
|
||||
return intel_ggtt_gmch_enable_hw(i915);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void i915_ggtt_enable_guc(struct i915_ggtt *ggtt)
|
||||
|
@@ -675,11 +1245,20 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
{
	struct i915_vma *vma;
	bool write_domain_objs = false;
	bool retained_ptes;

	drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt);

	/* First fill our portion of the GTT with scratch pages */
	vm->clear_range(vm, 0, vm->total);
	/*
	 * First fill our portion of the GTT with scratch pages if
	 * they were not retained across suspend.
	 */
	retained_ptes = suspend_retains_ptes(vm) &&
		!i915_vm_to_ggtt(vm)->pte_lost &&
		!GEM_WARN_ON(i915_vm_to_ggtt(vm)->probed_pte != read_last_pte(vm));

	if (!retained_ptes)
		vm->clear_range(vm, 0, vm->total);

	/* clflush objects bound into the GGTT and rebind them. */
	list_for_each_entry(vma, &vm->bound_list, vm_link) {

@@ -688,9 +1267,10 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
			atomic_read(&vma->flags) & I915_VMA_BIND_MASK;

		GEM_BUG_ON(!was_bound);
		vma->ops->bind_vma(vm, NULL, vma->resource,
				   obj ? obj->cache_level : 0,
				   was_bound);
		if (!retained_ptes)
			vma->ops->bind_vma(vm, NULL, vma->resource,
					   obj ? obj->cache_level : 0,
					   was_bound);
		if (obj) { /* only used during resume => exclusive access */
			write_domain_objs |= fetch_and_zero(&obj->write_domain);
			obj->read_domains |= I915_GEM_DOMAIN_GTT;

@@ -718,3 +1298,8 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)

	intel_ggtt_restore_fences(ggtt);
}

void i915_ggtt_mark_pte_lost(struct drm_i915_private *i915, bool val)
{
	to_gt(i915)->ggtt->pte_lost = val;
}
|
||||
|
|
|
@ -0,0 +1,132 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2022 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "intel_ggtt_gmch.h"
|
||||
|
||||
#include <drm/intel-gtt.h>
|
||||
#include <drm/i915_drm.h>
|
||||
|
||||
#include <linux/agp_backend.h>
|
||||
|
||||
#include "i915_drv.h"
|
||||
#include "i915_utils.h"
|
||||
#include "intel_gtt.h"
|
||||
#include "intel_gt_regs.h"
|
||||
#include "intel_gt.h"
|
||||
|
||||
static void gmch_ggtt_insert_page(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 unused)
|
||||
{
|
||||
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
|
||||
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
|
||||
|
||||
intel_gmch_gtt_insert_page(addr, offset >> PAGE_SHIFT, flags);
|
||||
}
|
||||
|
||||
static void gmch_ggtt_insert_entries(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 unused)
|
||||
{
|
||||
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
|
||||
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
|
||||
|
||||
intel_gmch_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
|
||||
flags);
|
||||
}
|
||||
|
||||
static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
|
||||
{
|
||||
intel_gmch_gtt_flush();
|
||||
}
|
||||
|
||||
static void gmch_ggtt_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
intel_gmch_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
|
||||
}
|
||||
|
||||
static void gmch_ggtt_remove(struct i915_address_space *vm)
|
||||
{
|
||||
intel_gmch_remove();
|
||||
}
|
||||
|
||||
/*
|
||||
* Certain Gen5 chipsets require idling the GPU before unmapping anything from
|
||||
* the GTT when VT-d is enabled.
|
||||
*/
|
||||
static bool needs_idle_maps(struct drm_i915_private *i915)
|
||||
{
|
||||
/*
|
||||
* Query intel_iommu to see if we need the workaround. Presumably that
|
||||
* was loaded first.
|
||||
*/
|
||||
if (!i915_vtd_active(i915))
|
||||
return false;
|
||||
|
||||
if (GRAPHICS_VER(i915) == 5 && IS_MOBILE(i915))
|
||||
return true;
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
int intel_ggtt_gmch_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
phys_addr_t gmadr_base;
|
||||
int ret;
|
||||
|
||||
ret = intel_gmch_probe(i915->bridge_dev, to_pci_dev(i915->drm.dev), NULL);
|
||||
if (!ret) {
|
||||
drm_err(&i915->drm, "failed to set up gmch\n");
|
||||
return -EIO;
|
||||
}
|
||||
|
||||
intel_gmch_gtt_get(&ggtt->vm.total, &gmadr_base, &ggtt->mappable_end);
|
||||
|
||||
ggtt->gmadr =
|
||||
(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
|
||||
|
||||
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
|
||||
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
|
||||
|
||||
if (needs_idle_maps(i915)) {
|
||||
drm_notice(&i915->drm,
|
||||
"Flushing DMA requests before IOMMU unmaps; performance may be degraded\n");
|
||||
ggtt->do_idle_maps = true;
|
||||
}
|
||||
|
||||
ggtt->vm.insert_page = gmch_ggtt_insert_page;
|
||||
ggtt->vm.insert_entries = gmch_ggtt_insert_entries;
|
||||
ggtt->vm.clear_range = gmch_ggtt_clear_range;
|
||||
ggtt->vm.cleanup = gmch_ggtt_remove;
|
||||
|
||||
ggtt->invalidate = gmch_ggtt_invalidate;
|
||||
|
||||
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
|
||||
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
|
||||
|
||||
if (unlikely(ggtt->do_idle_maps))
|
||||
drm_notice(&i915->drm,
|
||||
"Applying Ironlake quirks for intel_iommu\n");
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int intel_ggtt_gmch_enable_hw(struct drm_i915_private *i915)
|
||||
{
|
||||
if (!intel_gmch_enable_gtt())
|
||||
return -EIO;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void intel_ggtt_gmch_flush(void)
|
||||
{
|
||||
intel_gmch_gtt_flush();
|
||||
}
|
|
@ -0,0 +1,27 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2022 Intel Corporation
|
||||
*/
|
||||
|
||||
#ifndef __INTEL_GGTT_GMCH_H__
|
||||
#define __INTEL_GGTT_GMCH_H__
|
||||
|
||||
#include "intel_gtt.h"
|
||||
|
||||
/* For x86 platforms */
|
||||
#if IS_ENABLED(CONFIG_X86)
|
||||
|
||||
void intel_ggtt_gmch_flush(void);
|
||||
int intel_ggtt_gmch_enable_hw(struct drm_i915_private *i915);
|
||||
int intel_ggtt_gmch_probe(struct i915_ggtt *ggtt);
|
||||
|
||||
/* Stubs for non-x86 platforms */
|
||||
#else
|
||||
|
||||
static inline void intel_ggtt_gmch_flush(void) { }
|
||||
static inline int intel_ggtt_gmch_enable_hw(struct drm_i915_private *i915) { return -ENODEV; }
|
||||
static inline int intel_ggtt_gmch_probe(struct i915_ggtt *ggtt) { return -ENODEV; }
|
||||
|
||||
#endif
|
||||
|
||||
#endif /* __INTEL_GGTT_GMCH_H__ */
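The header above uses the common pattern of real declarations on x86 and inert inline stubs elsewhere, so callers never need their own #ifdef CONFIG_X86 blocks. A generic sketch of the same pattern with invented names:

	#include <errno.h>
	#include <stdio.h>

	/*
	 * Hypothetical "feature.h" body: real declarations when the facility is
	 * built in, inert inline stubs otherwise, so callers stay #ifdef-free.
	 */
	#ifdef CONFIG_HAVE_FEATURE
	int feature_probe(void);
	void feature_flush(void);
	#else
	static inline int feature_probe(void) { return -ENODEV; }
	static inline void feature_flush(void) { }
	#endif

	/* Caller code: compiles and runs unchanged whether the feature exists. */
	int main(void)
	{
		int ret = feature_probe();

		if (ret)
			printf("feature unavailable (%d), carrying on\n", ret);
		else
			feature_flush();
		return 0;
	}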
|
|
@ -236,6 +236,28 @@
|
|||
#define XY_FAST_COLOR_BLT_DW 16
|
||||
#define XY_FAST_COLOR_BLT_MOCS_MASK GENMASK(27, 21)
|
||||
#define XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT 31
|
||||
|
||||
#define XY_FAST_COPY_BLT_D0_SRC_TILING_MASK REG_GENMASK(21, 20)
|
||||
#define XY_FAST_COPY_BLT_D0_DST_TILING_MASK REG_GENMASK(14, 13)
|
||||
#define XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(mode) \
|
||||
REG_FIELD_PREP(XY_FAST_COPY_BLT_D0_SRC_TILING_MASK, mode)
|
||||
#define XY_FAST_COPY_BLT_D0_DST_TILE_MODE(mode) \
|
||||
REG_FIELD_PREP(XY_FAST_COPY_BLT_D0_DST_TILING_MASK, mode)
|
||||
#define LINEAR 0
|
||||
#define TILE_X 0x1
|
||||
#define XMAJOR 0x1
|
||||
#define YMAJOR 0x2
|
||||
#define TILE_64 0x3
|
||||
#define XY_FAST_COPY_BLT_D1_SRC_TILE4 REG_BIT(31)
|
||||
#define XY_FAST_COPY_BLT_D1_DST_TILE4 REG_BIT(30)
|
||||
#define BLIT_CCTL_SRC_MOCS_MASK REG_GENMASK(6, 0)
|
||||
#define BLIT_CCTL_DST_MOCS_MASK REG_GENMASK(14, 8)
|
||||
/* Note: MOCS value = (index << 1) */
|
||||
#define BLIT_CCTL_SRC_MOCS(idx) \
|
||||
REG_FIELD_PREP(BLIT_CCTL_SRC_MOCS_MASK, (idx) << 1)
|
||||
#define BLIT_CCTL_DST_MOCS(idx) \
|
||||
REG_FIELD_PREP(BLIT_CCTL_DST_MOCS_MASK, (idx) << 1)
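The BLIT_CCTL macros above pack a MOCS index (shifted left by one, as the comment notes) into the source and destination fields of the blitter control dword. A stand-alone sketch of that packing, using a simplified field_prep in place of the kernel's REG_FIELD_PREP; the mask positions are copied from the definitions above and the indices are arbitrary:

	#include <stdio.h>
	#include <stdint.h>

	/* Simplified stand-ins for GENMASK()/REG_FIELD_PREP() from the kernel. */
	#define GENMASK32(h, l) (((~0u) << (l)) & (~0u >> (31 - (h))))

	static uint32_t field_prep(uint32_t mask, uint32_t val)
	{
		return (val << __builtin_ctz(mask)) & mask;
	}

	#define BLIT_CCTL_SRC_MOCS_MASK GENMASK32(6, 0)
	#define BLIT_CCTL_DST_MOCS_MASK GENMASK32(14, 8)

	int main(void)
	{
		unsigned int src_idx = 2, dst_idx = 3;	/* arbitrary MOCS table indices */
		uint32_t cctl;

		/* MOCS value = (index << 1), as noted above. */
		cctl = field_prep(BLIT_CCTL_SRC_MOCS_MASK, src_idx << 1) |
		       field_prep(BLIT_CCTL_DST_MOCS_MASK, dst_idx << 1);

		printf("BLIT_CCTL = 0x%08x\n", cctl);
		return 0;
	}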
|
||||
|
||||
#define SRC_COPY_BLT_CMD (2 << 29 | 0x43 << 22)
|
||||
#define GEN9_XY_FAST_COPY_BLT_CMD (2 << 29 | 0x42 << 22)
|
||||
#define XY_SRC_COPY_BLT_CMD (2 << 29 | 0x53 << 22)
|
||||
|
@ -288,8 +310,11 @@
|
|||
#define PIPE_CONTROL_DEPTH_CACHE_FLUSH (1<<0)
|
||||
#define PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
|
||||
|
||||
/* 3D-related flags can't be set on compute engine */
|
||||
#define PIPE_CONTROL_3D_FLAGS (\
|
||||
/*
|
||||
* 3D-related flags that can't be set on _engines_ that lack access to the 3D
|
||||
* pipeline (i.e., CCS engines).
|
||||
*/
|
||||
#define PIPE_CONTROL_3D_ENGINE_FLAGS (\
|
||||
PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH | \
|
||||
PIPE_CONTROL_DEPTH_CACHE_FLUSH | \
|
||||
PIPE_CONTROL_TILE_CACHE_FLUSH | \
|
||||
|
@ -300,6 +325,14 @@
|
|||
PIPE_CONTROL_VF_CACHE_INVALIDATE | \
|
||||
PIPE_CONTROL_GLOBAL_SNAPSHOT_RESET)
|
||||
|
||||
/* 3D-related flags that can't be set on _platforms_ that lack a 3D pipeline */
|
||||
#define PIPE_CONTROL_3D_ARCH_FLAGS ( \
|
||||
PIPE_CONTROL_3D_ENGINE_FLAGS | \
|
||||
PIPE_CONTROL_INDIRECT_STATE_DISABLE | \
|
||||
PIPE_CONTROL_FLUSH_ENABLE | \
|
||||
PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
|
||||
PIPE_CONTROL_DC_FLUSH_ENABLE)
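A flush helper that runs on both render and compute engines can use these groupings to strip the inapplicable bits before emitting the PIPE_CONTROL. A hedged sketch of that filtering step; the individual flag values are placeholders and only the masking pattern matters:

	#include <stdio.h>
	#include <stdint.h>
	#include <stdbool.h>

	/* Placeholder bit assignments, for illustration only. */
	#define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1u << 12)
	#define PIPE_CONTROL_DEPTH_CACHE_FLUSH         (1u << 0)
	#define PIPE_CONTROL_CS_STALL                  (1u << 20)

	#define PIPE_CONTROL_3D_ENGINE_FLAGS \
		(PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH | \
		 PIPE_CONTROL_DEPTH_CACHE_FLUSH)

	static uint32_t filter_flags(uint32_t flags, bool engine_has_3d)
	{
		/* Compute-only (CCS) engines must not see the 3D-only bits. */
		if (!engine_has_3d)
			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
		return flags;
	}

	int main(void)
	{
		uint32_t want = PIPE_CONTROL_CS_STALL |
				PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
				PIPE_CONTROL_DEPTH_CACHE_FLUSH;

		printf("rcs: 0x%08x\n", filter_flags(want, true));
		printf("ccs: 0x%08x\n", filter_flags(want, false));
		return 0;
	}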
|
||||
|
||||
#define MI_MATH(x) MI_INSTR(0x1a, (x) - 1)
|
||||
#define MI_MATH_INSTR(opcode, op1, op2) ((opcode) << 20 | (op1) << 10 | (op2))
|
||||
/* Opcodes for MI_MATH_INSTR */
|
||||
|
|
|
@ -4,6 +4,7 @@
|
|||
*/
|
||||
|
||||
#include <drm/drm_managed.h>
|
||||
#include <drm/intel-gtt.h>
|
||||
|
||||
#include "gem/i915_gem_internal.h"
|
||||
#include "gem/i915_gem_lmem.h"
|
||||
|
@ -12,11 +13,12 @@
|
|||
#include "i915_drv.h"
|
||||
#include "intel_context.h"
|
||||
#include "intel_engine_regs.h"
|
||||
#include "intel_ggtt_gmch.h"
|
||||
#include "intel_gt.h"
|
||||
#include "intel_gt_buffer_pool.h"
|
||||
#include "intel_gt_clock_utils.h"
|
||||
#include "intel_gt_debugfs.h"
|
||||
#include "intel_gt_gmch.h"
|
||||
#include "intel_gt_mcr.h"
|
||||
#include "intel_gt_pm.h"
|
||||
#include "intel_gt_regs.h"
|
||||
#include "intel_gt_requests.h"
|
||||
|
@ -102,78 +104,13 @@ int intel_gt_assign_ggtt(struct intel_gt *gt)
|
|||
return gt->ggtt ? 0 : -ENOMEM;
|
||||
}
|
||||
|
||||
static const char * const intel_steering_types[] = {
|
||||
"L3BANK",
|
||||
"MSLICE",
|
||||
"LNCF",
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range icl_l3bank_steering_table[] = {
|
||||
{ 0x00B100, 0x00B3FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
|
||||
{ 0x004000, 0x004AFF },
|
||||
{ 0x00C800, 0x00CFFF },
|
||||
{ 0x00DD00, 0x00DDFF },
|
||||
	{ 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
|
||||
{ 0x00B000, 0x00B0FF },
|
||||
{ 0x00D800, 0x00D8FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range dg2_lncf_steering_table[] = {
|
||||
{ 0x00B000, 0x00B0FF },
|
||||
{ 0x00D880, 0x00D8FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static u16 slicemask(struct intel_gt *gt, int count)
|
||||
{
|
||||
	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
|
||||
|
||||
return intel_slicemask_from_dssmask(dss_mask, count);
|
||||
}
|
||||
|
||||
int intel_gt_init_mmio(struct intel_gt *gt)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
|
||||
intel_gt_init_clock_frequency(gt);
|
||||
|
||||
	intel_uc_init_mmio(&gt->uc);
|
||||
intel_sseu_info_init(gt);
|
||||
|
||||
/*
|
||||
* An mslice is unavailable only if both the meml3 for the slice is
|
||||
* disabled *and* all of the DSS in the slice (quadrant) are disabled.
|
||||
*/
|
||||
if (HAS_MSLICES(i915))
|
||||
gt->info.mslice_mask =
|
||||
slicemask(gt, GEN_DSS_PER_MSLICE) |
|
||||
(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN12_MEML3_EN_MASK);
|
||||
|
||||
if (IS_DG2(i915)) {
|
||||
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
|
||||
gt->steering_table[LNCF] = dg2_lncf_steering_table;
|
||||
} else if (IS_XEHPSDV(i915)) {
|
||||
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
|
||||
gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
|
||||
} else if (GRAPHICS_VER(i915) >= 11 &&
|
||||
GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
|
||||
gt->steering_table[L3BANK] = icl_l3bank_steering_table;
|
||||
gt->info.l3bank_mask =
|
||||
~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN10_L3BANK_MASK;
|
||||
} else if (HAS_MSLICES(i915)) {
|
||||
MISSING_CASE(INTEL_INFO(i915)->platform);
|
||||
}
|
||||
intel_gt_mcr_init(gt);
|
||||
|
||||
return intel_engines_init_mmio(gt);
|
||||
}
|
||||
|
@ -451,7 +388,7 @@ void intel_gt_chipset_flush(struct intel_gt *gt)
|
|||
{
|
||||
wmb();
|
||||
if (GRAPHICS_VER(gt->i915) < 6)
|
||||
intel_gt_gmch_gen5_chipset_flush(gt);
|
||||
intel_ggtt_gmch_flush();
|
||||
}
|
||||
|
||||
void intel_gt_driver_register(struct intel_gt *gt)
|
||||
|
@ -785,6 +722,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
|
|||
{
|
||||
intel_wakeref_t wakeref;
|
||||
|
||||
intel_gt_sysfs_unregister(gt);
|
||||
	intel_rps_driver_unregister(&gt->rps);
|
||||
	intel_gsc_fini(&gt->gsc);
|
||||
|
||||
|
@ -834,200 +772,6 @@ void intel_gt_driver_late_release_all(struct drm_i915_private *i915)
|
|||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_reg_needs_read_steering - determine whether a register read
|
||||
* requires explicit steering
|
||||
* @gt: GT structure
|
||||
* @reg: the register to check steering requirements for
|
||||
* @type: type of multicast steering to check
|
||||
*
|
||||
* Determines whether @reg needs explicit steering of a specific type for
|
||||
* reads.
|
||||
*
|
||||
* Returns false if @reg does not belong to a register range of the given
|
||||
* steering type, or if the default (subslice-based) steering IDs are suitable
|
||||
* for @type steering too.
|
||||
*/
|
||||
static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
enum intel_steering_type type)
|
||||
{
|
||||
const u32 offset = i915_mmio_reg_offset(reg);
|
||||
const struct intel_mmio_range *entry;
|
||||
|
||||
if (likely(!intel_gt_needs_read_steering(gt, type)))
|
||||
return false;
|
||||
|
||||
for (entry = gt->steering_table[type]; entry->end; entry++) {
|
||||
if (offset >= entry->start && offset <= entry->end)
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
|
||||
* @gt: GT structure
|
||||
* @type: multicast register type
|
||||
* @sliceid: Slice ID returned
|
||||
* @subsliceid: Subslice ID returned
|
||||
*
|
||||
* Determines sliceid and subsliceid values that will steer reads
|
||||
* of a specific multicast register class to a valid value.
|
||||
*/
|
||||
static void intel_gt_get_valid_steering(struct intel_gt *gt,
|
||||
enum intel_steering_type type,
|
||||
u8 *sliceid, u8 *subsliceid)
|
||||
{
|
||||
switch (type) {
|
||||
case L3BANK:
|
||||
GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
|
||||
|
||||
*sliceid = 0; /* unused */
|
||||
*subsliceid = __ffs(gt->info.l3bank_mask);
|
||||
break;
|
||||
case MSLICE:
|
||||
GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
|
||||
|
||||
*sliceid = __ffs(gt->info.mslice_mask);
|
||||
*subsliceid = 0; /* unused */
|
||||
break;
|
||||
case LNCF:
|
||||
GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
|
||||
|
||||
/*
|
||||
* An LNCF is always present if its mslice is present, so we
|
||||
* can safely just steer to LNCF 0 in all cases.
|
||||
*/
|
||||
*sliceid = __ffs(gt->info.mslice_mask) << 1;
|
||||
*subsliceid = 0; /* unused */
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(type);
|
||||
*sliceid = 0;
|
||||
*subsliceid = 0;
|
||||
}
|
||||
}
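intel_gt_get_valid_steering() always steers to the lowest-numbered instance that the fuse masks report as present. A small sketch of that selection, with a local find-first-set standing in for the kernel's __ffs() and invented sample masks:

	#include <stdio.h>

	/* Local ffs on a non-zero mask; the kernel uses __ffs() for this. */
	static unsigned int first_set(unsigned int mask)
	{
		return __builtin_ctz(mask);
	}

	int main(void)
	{
		unsigned int mslice_mask = 0x6;	/* sample fuse value: mslices 1 and 2 */
		unsigned int l3bank_mask = 0x5;	/* sample fuse value: banks 0 and 2 */

		/* MSLICE steering: the first present mslice goes in the slice field. */
		printf("mslice steer: sliceid=%u subsliceid=0\n", first_set(mslice_mask));

		/* LNCF steering: an LNCF exists wherever its mslice does. */
		printf("lncf steer:   sliceid=%u subsliceid=0\n", first_set(mslice_mask) << 1);

		/* L3BANK steering: the bank index goes in the subslice field. */
		printf("l3bank steer: sliceid=0 subsliceid=%u\n", first_set(l3bank_mask));
		return 0;
	}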
|
||||
|
||||
/**
|
||||
* intel_gt_read_register_fw - reads a GT register with support for multicast
|
||||
* @gt: GT structure
|
||||
* @reg: register to read
|
||||
*
|
||||
* This function will read a GT register. If the register is a multicast
|
||||
* register, the read will be steered to a valid instance (i.e., one that
|
||||
* isn't fused off or powered down by power gating).
|
||||
*
|
||||
* Returns the value from a valid instance of @reg.
|
||||
*/
|
||||
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
|
||||
{
|
||||
int type;
|
||||
u8 sliceid, subsliceid;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
|
||||
intel_gt_get_valid_steering(gt, type, &sliceid,
|
||||
&subsliceid);
|
||||
return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
|
||||
reg,
|
||||
sliceid,
|
||||
subsliceid);
|
||||
}
|
||||
}
|
||||
|
||||
return intel_uncore_read_fw(gt->uncore, reg);
|
||||
}
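The read helpers above all follow the same shape: walk the steering types, use a steered read for the first type whose range claims the register, and otherwise fall back to a plain read. A toy model of that fall-through with invented ranges and stub reads:

	#include <stdio.h>
	#include <stdbool.h>

	/* Invented steering classes and register range, for illustration only. */
	enum steer_type { L3BANK, MSLICE, LNCF, NUM_TYPES };

	static bool reg_needs_steering(unsigned int reg, int type)
	{
		/* Pretend only one MSLICE multicast range exists, 0x4000..0x4aff. */
		return type == MSLICE && reg >= 0x4000 && reg <= 0x4aff;
	}

	static unsigned int steered_read(unsigned int reg, int type)
	{
		printf("steered read of 0x%x (type %d)\n", reg, type);
		return 0;
	}

	static unsigned int plain_read(unsigned int reg)
	{
		printf("plain read of 0x%x\n", reg);
		return 0;
	}

	static unsigned int read_register(unsigned int reg)
	{
		for (int type = 0; type < NUM_TYPES; type++)
			if (reg_needs_steering(reg, type))
				return steered_read(reg, type);
		return plain_read(reg);
	}

	int main(void)
	{
		read_register(0x4100);	/* falls inside the multicast range */
		read_register(0x2000);	/* ordinary register */
		return 0;
	}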
|
||||
|
||||
/**
|
||||
* intel_gt_get_valid_steering_for_reg - get a valid steering for a register
|
||||
* @gt: GT structure
|
||||
* @reg: register for which the steering is required
|
||||
* @sliceid: return variable for slice steering
|
||||
* @subsliceid: return variable for subslice steering
|
||||
*
|
||||
* This function returns a slice/subslice pair that is guaranteed to work for
|
||||
* read steering of the given register. Note that a value will be returned even
|
||||
* if the register is not replicated and therefore does not actually require
|
||||
* steering.
|
||||
*/
|
||||
void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
|
||||
u8 *sliceid, u8 *subsliceid)
|
||||
{
|
||||
int type;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
|
||||
intel_gt_get_valid_steering(gt, type, sliceid,
|
||||
subsliceid);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
*sliceid = gt->default_steering.groupid;
|
||||
*subsliceid = gt->default_steering.instanceid;
|
||||
}
|
||||
|
||||
u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
|
||||
{
|
||||
int type;
|
||||
u8 sliceid, subsliceid;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
|
||||
intel_gt_get_valid_steering(gt, type, &sliceid,
|
||||
&subsliceid);
|
||||
return intel_uncore_read_with_mcr_steering(gt->uncore,
|
||||
reg,
|
||||
sliceid,
|
||||
subsliceid);
|
||||
}
|
||||
}
|
||||
|
||||
return intel_uncore_read(gt->uncore, reg);
|
||||
}
|
||||
|
||||
static void report_steering_type(struct drm_printer *p,
|
||||
struct intel_gt *gt,
|
||||
enum intel_steering_type type,
|
||||
bool dump_table)
|
||||
{
|
||||
const struct intel_mmio_range *entry;
|
||||
u8 slice, subslice;
|
||||
|
||||
BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
|
||||
|
||||
if (!gt->steering_table[type]) {
|
||||
drm_printf(p, "%s steering: uses default steering\n",
|
||||
intel_steering_types[type]);
|
||||
return;
|
||||
}
|
||||
|
||||
intel_gt_get_valid_steering(gt, type, &slice, &subslice);
|
||||
drm_printf(p, "%s steering: sliceid=0x%x, subsliceid=0x%x\n",
|
||||
intel_steering_types[type], slice, subslice);
|
||||
|
||||
if (!dump_table)
|
||||
return;
|
||||
|
||||
for (entry = gt->steering_table[type]; entry->end; entry++)
|
||||
drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
|
||||
}
|
||||
|
||||
void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
|
||||
bool dump_table)
|
||||
{
|
||||
drm_printf(p, "Default steering: sliceid=0x%x, subsliceid=0x%x\n",
|
||||
gt->default_steering.groupid,
|
||||
gt->default_steering.instanceid);
|
||||
|
||||
if (HAS_MSLICES(gt->i915)) {
|
||||
report_steering_type(p, gt, MSLICE, dump_table);
|
||||
report_steering_type(p, gt, LNCF, dump_table);
|
||||
}
|
||||
}
|
||||
|
||||
static int intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
|
||||
{
|
||||
int ret;
|
||||
|
|
|
@ -13,13 +13,6 @@
|
|||
struct drm_i915_private;
|
||||
struct drm_printer;
|
||||
|
||||
struct insert_entries {
|
||||
struct i915_address_space *vm;
|
||||
struct i915_vma_resource *vma_res;
|
||||
enum i915_cache_level level;
|
||||
u32 flags;
|
||||
};
|
||||
|
||||
#define GT_TRACE(gt, fmt, ...) do { \
|
||||
const struct intel_gt *gt__ __maybe_unused = (gt); \
|
||||
GEM_TRACE("%s " fmt, dev_name(gt__->i915->drm.dev), \
|
||||
|
@ -93,21 +86,6 @@ static inline bool intel_gt_is_wedged(const struct intel_gt *gt)
|
|||
	return unlikely(test_bit(I915_WEDGED, &gt->reset.flags));
|
||||
}
|
||||
|
||||
static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
|
||||
enum intel_steering_type type)
|
||||
{
|
||||
return gt->steering_table[type];
|
||||
}
|
||||
|
||||
void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
|
||||
u8 *sliceid, u8 *subsliceid);
|
||||
|
||||
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
|
||||
u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
|
||||
|
||||
void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
|
||||
bool dump_table);
|
||||
|
||||
int intel_gt_probe_all(struct drm_i915_private *i915);
|
||||
int intel_gt_tiles_init(struct drm_i915_private *i915);
|
||||
void intel_gt_release_all(struct drm_i915_private *i915);
|
||||
|
@ -125,6 +103,4 @@ void intel_gt_watchdog_work(struct work_struct *work);
|
|||
|
||||
void intel_gt_invalidate_tlbs(struct intel_gt *gt);
|
||||
|
||||
struct resource intel_pci_resource(struct pci_dev *pdev, int bar);
|
||||
|
||||
#endif /* __INTEL_GT_H__ */
|
||||
|
|
|
@ -9,6 +9,7 @@
|
|||
#include "intel_gt.h"
|
||||
#include "intel_gt_debugfs.h"
|
||||
#include "intel_gt_engines_debugfs.h"
|
||||
#include "intel_gt_mcr.h"
|
||||
#include "intel_gt_pm_debugfs.h"
|
||||
#include "intel_sseu_debugfs.h"
|
||||
#include "pxp/intel_pxp_debugfs.h"
|
||||
|
@ -64,7 +65,7 @@ static int steering_show(struct seq_file *m, void *data)
|
|||
struct drm_printer p = drm_seq_file_printer(m);
|
||||
struct intel_gt *gt = m->private;
|
||||
|
||||
intel_gt_report_steering(&p, gt, true);
|
||||
intel_gt_mcr_report_steering(&p, gt, true);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
|
|
@ -1,654 +0,0 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2022 Intel Corporation
|
||||
*/
|
||||
|
||||
#include <drm/intel-gtt.h>
|
||||
#include <drm/i915_drm.h>
|
||||
|
||||
#include <linux/agp_backend.h>
|
||||
#include <linux/stop_machine.h>
|
||||
|
||||
#include "i915_drv.h"
|
||||
#include "intel_gt_gmch.h"
|
||||
#include "intel_gt_regs.h"
|
||||
#include "intel_gt.h"
|
||||
#include "i915_utils.h"
|
||||
|
||||
#include "gen8_ppgtt.h"
|
||||
|
||||
struct insert_page {
|
||||
struct i915_address_space *vm;
|
||||
dma_addr_t addr;
|
||||
u64 offset;
|
||||
enum i915_cache_level level;
|
||||
};
|
||||
|
||||
static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
|
||||
{
|
||||
writeq(pte, addr);
|
||||
}
|
||||
|
||||
static void nop_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
}
|
||||
|
||||
static u64 snb_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
switch (level) {
|
||||
case I915_CACHE_L3_LLC:
|
||||
case I915_CACHE_LLC:
|
||||
pte |= GEN6_PTE_CACHE_LLC;
|
||||
break;
|
||||
case I915_CACHE_NONE:
|
||||
pte |= GEN6_PTE_UNCACHED;
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(level);
|
||||
}
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 ivb_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
switch (level) {
|
||||
case I915_CACHE_L3_LLC:
|
||||
pte |= GEN7_PTE_CACHE_L3_LLC;
|
||||
break;
|
||||
case I915_CACHE_LLC:
|
||||
pte |= GEN6_PTE_CACHE_LLC;
|
||||
break;
|
||||
case I915_CACHE_NONE:
|
||||
pte |= GEN6_PTE_UNCACHED;
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(level);
|
||||
}
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 byt_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = GEN6_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
if (!(flags & PTE_READ_ONLY))
|
||||
pte |= BYT_PTE_WRITEABLE;
|
||||
|
||||
if (level != I915_CACHE_NONE)
|
||||
pte |= BYT_PTE_SNOOPED_BY_CPU_CACHES;
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 hsw_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
if (level != I915_CACHE_NONE)
|
||||
pte |= HSW_WB_LLC_AGE3;
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static u64 iris_pte_encode(dma_addr_t addr,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
gen6_pte_t pte = HSW_PTE_ADDR_ENCODE(addr) | GEN6_PTE_VALID;
|
||||
|
||||
switch (level) {
|
||||
case I915_CACHE_NONE:
|
||||
break;
|
||||
case I915_CACHE_WT:
|
||||
pte |= HSW_WT_ELLC_LLC_AGE3;
|
||||
break;
|
||||
default:
|
||||
pte |= HSW_WB_ELLC_LLC_AGE3;
|
||||
break;
|
||||
}
|
||||
|
||||
return pte;
|
||||
}
|
||||
|
||||
static void gen5_ggtt_insert_page(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 unused)
|
||||
{
|
||||
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
|
||||
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
|
||||
|
||||
intel_gtt_insert_page(addr, offset >> PAGE_SHIFT, flags);
|
||||
}
|
||||
|
||||
static void gen6_ggtt_insert_page(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen6_pte_t __iomem *pte =
|
||||
(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
|
||||
|
||||
iowrite32(vm->pte_encode(addr, level, flags), pte);
|
||||
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void gen8_ggtt_insert_page(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen8_pte_t __iomem *pte =
|
||||
(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
|
||||
|
||||
gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
|
||||
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void gen5_ggtt_insert_entries(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 unused)
|
||||
{
|
||||
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
|
||||
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
|
||||
|
||||
intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
|
||||
flags);
|
||||
}
|
||||
|
||||
/*
|
||||
* Binds an object into the global gtt with the specified cache level.
|
||||
* The object will be accessible to the GPU via commands whose operands
|
||||
* reference offsets within the global GTT as well as accessible by the GPU
|
||||
* through the GMADR mapped BAR (i915->mm.gtt->gtt).
|
||||
*/
|
||||
static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen6_pte_t __iomem *gte;
|
||||
gen6_pte_t __iomem *end;
|
||||
struct sgt_iter iter;
|
||||
dma_addr_t addr;
|
||||
|
||||
gte = (gen6_pte_t __iomem *)ggtt->gsm;
|
||||
gte += vma_res->start / I915_GTT_PAGE_SIZE;
|
||||
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
|
||||
|
||||
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
|
||||
iowrite32(vm->pte_encode(addr, level, flags), gte++);
|
||||
GEM_BUG_ON(gte > end);
|
||||
|
||||
/* Fill the allocated but "unused" space beyond the end of the buffer */
|
||||
while (gte < end)
|
||||
iowrite32(vm->scratch[0]->encode, gte++);
|
||||
|
||||
/*
|
||||
* We want to flush the TLBs only after we're certain all the PTE
|
||||
* updates have finished.
|
||||
*/
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
gen8_pte_t __iomem *gte;
|
||||
gen8_pte_t __iomem *end;
|
||||
struct sgt_iter iter;
|
||||
dma_addr_t addr;
|
||||
|
||||
/*
|
||||
* Note that we ignore PTE_READ_ONLY here. The caller must be careful
|
||||
* not to allow the user to override access to a read only page.
|
||||
*/
|
||||
|
||||
gte = (gen8_pte_t __iomem *)ggtt->gsm;
|
||||
gte += vma_res->start / I915_GTT_PAGE_SIZE;
|
||||
end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
|
||||
|
||||
for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
|
||||
gen8_set_pte(gte++, pte_encode | addr);
|
||||
GEM_BUG_ON(gte > end);
|
||||
|
||||
/* Fill the allocated but "unused" space beyond the end of the buffer */
|
||||
while (gte < end)
|
||||
gen8_set_pte(gte++, vm->scratch[0]->encode);
|
||||
|
||||
/*
|
||||
* We want to flush the TLBs only after we're certain all the PTE
|
||||
* updates have finished.
|
||||
*/
|
||||
ggtt->invalidate(ggtt);
|
||||
}
|
||||
|
||||
static void bxt_vtd_ggtt_wa(struct i915_address_space *vm)
|
||||
{
|
||||
/*
|
||||
* Make sure the internal GAM fifo has been cleared of all GTT
|
||||
* writes before exiting stop_machine(). This guarantees that
|
||||
* any aperture accesses waiting to start in another process
|
||||
* cannot back up behind the GTT writes causing a hang.
|
||||
* The register can be any arbitrary GAM register.
|
||||
*/
|
||||
intel_uncore_posting_read_fw(vm->gt->uncore, GFX_FLSH_CNTL_GEN6);
|
||||
}
|
||||
|
||||
static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
|
||||
{
|
||||
struct insert_page *arg = _arg;
|
||||
|
||||
gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
|
||||
bxt_vtd_ggtt_wa(arg->vm);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level level,
|
||||
u32 unused)
|
||||
{
|
||||
struct insert_page arg = { vm, addr, offset, level };
|
||||
|
||||
stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
|
||||
}
|
||||
|
||||
static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
|
||||
{
|
||||
struct insert_entries *arg = _arg;
|
||||
|
||||
gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
|
||||
bxt_vtd_ggtt_wa(arg->vm);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level level,
|
||||
u32 flags)
|
||||
{
|
||||
struct insert_entries arg = { vm, vma_res, level, flags };
|
||||
|
||||
stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
|
||||
}
|
||||
|
||||
void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt)
|
||||
{
|
||||
intel_gtt_chipset_flush();
|
||||
}
|
||||
|
||||
static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
|
||||
{
|
||||
intel_gtt_chipset_flush();
|
||||
}
|
||||
|
||||
static void gen5_ggtt_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
intel_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
|
||||
}
|
||||
|
||||
static void gen6_ggtt_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
|
||||
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
|
||||
gen6_pte_t scratch_pte, __iomem *gtt_base =
|
||||
(gen6_pte_t __iomem *)ggtt->gsm + first_entry;
|
||||
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
|
||||
int i;
|
||||
|
||||
if (WARN(num_entries > max_entries,
|
||||
"First entry = %d; Num entries = %d (max=%d)\n",
|
||||
first_entry, num_entries, max_entries))
|
||||
num_entries = max_entries;
|
||||
|
||||
scratch_pte = vm->scratch[0]->encode;
|
||||
for (i = 0; i < num_entries; i++)
|
||||
		iowrite32(scratch_pte, &gtt_base[i]);
|
||||
}
|
||||
|
||||
static void gen8_ggtt_clear_range(struct i915_address_space *vm,
|
||||
u64 start, u64 length)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
|
||||
unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
|
||||
const gen8_pte_t scratch_pte = vm->scratch[0]->encode;
|
||||
gen8_pte_t __iomem *gtt_base =
|
||||
(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
|
||||
const int max_entries = ggtt_total_entries(ggtt) - first_entry;
|
||||
int i;
|
||||
|
||||
if (WARN(num_entries > max_entries,
|
||||
"First entry = %d; Num entries = %d (max=%d)\n",
|
||||
first_entry, num_entries, max_entries))
|
||||
num_entries = max_entries;
|
||||
|
||||
for (i = 0; i < num_entries; i++)
|
||||
		gen8_set_pte(&gtt_base[i], scratch_pte);
|
||||
}
|
||||
|
||||
static void gen5_gmch_remove(struct i915_address_space *vm)
|
||||
{
|
||||
intel_gmch_remove();
|
||||
}
|
||||
|
||||
static void gen6_gmch_remove(struct i915_address_space *vm)
|
||||
{
|
||||
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
|
||||
|
||||
iounmap(ggtt->gsm);
|
||||
free_scratch(vm);
|
||||
}
|
||||
|
||||
/*
|
||||
* Certain Gen5 chipsets require idling the GPU before
|
||||
* unmapping anything from the GTT when VT-d is enabled.
|
||||
*/
|
||||
static bool needs_idle_maps(struct drm_i915_private *i915)
|
||||
{
|
||||
/*
|
||||
* Query intel_iommu to see if we need the workaround. Presumably that
|
||||
* was loaded first.
|
||||
*/
|
||||
if (!i915_vtd_active(i915))
|
||||
return false;
|
||||
|
||||
if (GRAPHICS_VER(i915) == 5 && IS_MOBILE(i915))
|
||||
return true;
|
||||
|
||||
if (GRAPHICS_VER(i915) == 12)
|
||||
return true; /* XXX DMAR fault reason 7 */
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
static unsigned int gen6_gttmmadr_size(struct drm_i915_private *i915)
|
||||
{
|
||||
/*
|
||||
* GEN6: GTTMMADR size is 4MB and GTTADR starts at 2MB offset
|
||||
* GEN8: GTTMMADR size is 16MB and GTTADR starts at 8MB offset
|
||||
*/
|
||||
GEM_BUG_ON(GRAPHICS_VER(i915) < 6);
|
||||
return (GRAPHICS_VER(i915) < 8) ? SZ_4M : SZ_16M;
|
||||
}
|
||||
|
||||
static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
|
||||
{
|
||||
snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
|
||||
snb_gmch_ctl &= SNB_GMCH_GGMS_MASK;
|
||||
return snb_gmch_ctl << 20;
|
||||
}
|
||||
|
||||
static unsigned int gen8_get_total_gtt_size(u16 bdw_gmch_ctl)
|
||||
{
|
||||
bdw_gmch_ctl >>= BDW_GMCH_GGMS_SHIFT;
|
||||
bdw_gmch_ctl &= BDW_GMCH_GGMS_MASK;
|
||||
if (bdw_gmch_ctl)
|
||||
bdw_gmch_ctl = 1 << bdw_gmch_ctl;
|
||||
|
||||
#ifdef CONFIG_X86_32
|
||||
/* Limit 32b platforms to a 2GB GGTT: 4 << 20 / pte size * I915_GTT_PAGE_SIZE */
|
||||
if (bdw_gmch_ctl > 4)
|
||||
bdw_gmch_ctl = 4;
|
||||
#endif
|
||||
|
||||
return bdw_gmch_ctl << 20;
|
||||
}
|
||||
|
||||
static unsigned int gen6_gttadr_offset(struct drm_i915_private *i915)
|
||||
{
|
||||
return gen6_gttmmadr_size(i915) / 2;
|
||||
}
|
||||
|
||||
static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
|
||||
phys_addr_t phys_addr;
|
||||
u32 pte_flags;
|
||||
int ret;
|
||||
|
||||
GEM_WARN_ON(pci_resource_len(pdev, 0) != gen6_gttmmadr_size(i915));
|
||||
phys_addr = pci_resource_start(pdev, 0) + gen6_gttadr_offset(i915);
|
||||
|
||||
/*
|
||||
* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
|
||||
* will be dropped. For WC mappings in general we have 64 byte burst
|
||||
* writes when the WC buffer is flushed, so we can't use it, but have to
|
||||
* resort to an uncached mapping. The WC issue is easily caught by the
|
||||
* readback check when writing GTT PTE entries.
|
||||
*/
|
||||
if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
|
||||
ggtt->gsm = ioremap(phys_addr, size);
|
||||
else
|
||||
ggtt->gsm = ioremap_wc(phys_addr, size);
|
||||
if (!ggtt->gsm) {
|
||||
drm_err(&i915->drm, "Failed to map the ggtt page table\n");
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
kref_init(&ggtt->vm.resv_ref);
|
||||
ret = setup_scratch_page(&ggtt->vm);
|
||||
if (ret) {
|
||||
drm_err(&i915->drm, "Scratch setup failed\n");
|
||||
/* iounmap will also get called at remove, but meh */
|
||||
iounmap(ggtt->gsm);
|
||||
return ret;
|
||||
}
|
||||
|
||||
pte_flags = 0;
|
||||
if (i915_gem_object_is_lmem(ggtt->vm.scratch[0]))
|
||||
pte_flags |= PTE_LM;
|
||||
|
||||
ggtt->vm.scratch[0]->encode =
|
||||
ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
|
||||
I915_CACHE_NONE, pte_flags);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
phys_addr_t gmadr_base;
|
||||
int ret;
|
||||
|
||||
ret = intel_gmch_probe(i915->bridge_dev, to_pci_dev(i915->drm.dev), NULL);
|
||||
if (!ret) {
|
||||
drm_err(&i915->drm, "failed to set up gmch\n");
|
||||
return -EIO;
|
||||
}
|
||||
|
||||
intel_gtt_get(&ggtt->vm.total, &gmadr_base, &ggtt->mappable_end);
|
||||
|
||||
ggtt->gmadr =
|
||||
(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
|
||||
|
||||
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
|
||||
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
|
||||
|
||||
if (needs_idle_maps(i915)) {
|
||||
drm_notice(&i915->drm,
|
||||
"Flushing DMA requests before IOMMU unmaps; performance may be degraded\n");
|
||||
ggtt->do_idle_maps = true;
|
||||
}
|
||||
|
||||
ggtt->vm.insert_page = gen5_ggtt_insert_page;
|
||||
ggtt->vm.insert_entries = gen5_ggtt_insert_entries;
|
||||
ggtt->vm.clear_range = gen5_ggtt_clear_range;
|
||||
ggtt->vm.cleanup = gen5_gmch_remove;
|
||||
|
||||
ggtt->invalidate = gmch_ggtt_invalidate;
|
||||
|
||||
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
|
||||
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
|
||||
|
||||
if (unlikely(ggtt->do_idle_maps))
|
||||
drm_notice(&i915->drm,
|
||||
"Applying Ironlake quirks for intel_iommu\n");
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
|
||||
unsigned int size;
|
||||
u16 snb_gmch_ctl;
|
||||
|
||||
ggtt->gmadr = intel_pci_resource(pdev, 2);
|
||||
ggtt->mappable_end = resource_size(&ggtt->gmadr);
|
||||
|
||||
/*
|
||||
* 64/512MB is the current min/max we actually know of, but this is
|
||||
* just a coarse sanity check.
|
||||
*/
|
||||
if (ggtt->mappable_end < (64<<20) || ggtt->mappable_end > (512<<20)) {
|
||||
drm_err(&i915->drm, "Unknown GMADR size (%pa)\n",
|
||||
&ggtt->mappable_end);
|
||||
return -ENXIO;
|
||||
}
|
||||
|
||||
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
|
||||
|
||||
size = gen6_get_total_gtt_size(snb_gmch_ctl);
|
||||
ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE;
|
||||
|
||||
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
|
||||
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
|
||||
|
||||
ggtt->vm.clear_range = nop_clear_range;
|
||||
if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
|
||||
ggtt->vm.clear_range = gen6_ggtt_clear_range;
|
||||
ggtt->vm.insert_page = gen6_ggtt_insert_page;
|
||||
ggtt->vm.insert_entries = gen6_ggtt_insert_entries;
|
||||
ggtt->vm.cleanup = gen6_gmch_remove;
|
||||
|
||||
ggtt->invalidate = gen6_ggtt_invalidate;
|
||||
|
||||
if (HAS_EDRAM(i915))
|
||||
ggtt->vm.pte_encode = iris_pte_encode;
|
||||
else if (IS_HASWELL(i915))
|
||||
ggtt->vm.pte_encode = hsw_pte_encode;
|
||||
else if (IS_VALLEYVIEW(i915))
|
||||
ggtt->vm.pte_encode = byt_pte_encode;
|
||||
else if (GRAPHICS_VER(i915) >= 7)
|
||||
ggtt->vm.pte_encode = ivb_pte_encode;
|
||||
else
|
||||
ggtt->vm.pte_encode = snb_pte_encode;
|
||||
|
||||
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
|
||||
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
|
||||
|
||||
return ggtt_probe_common(ggtt, size);
|
||||
}
|
||||
|
||||
static unsigned int chv_get_total_gtt_size(u16 gmch_ctrl)
|
||||
{
|
||||
gmch_ctrl >>= SNB_GMCH_GGMS_SHIFT;
|
||||
gmch_ctrl &= SNB_GMCH_GGMS_MASK;
|
||||
|
||||
if (gmch_ctrl)
|
||||
return 1 << (20 + gmch_ctrl);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
struct drm_i915_private *i915 = ggtt->vm.i915;
|
||||
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
|
||||
unsigned int size;
|
||||
u16 snb_gmch_ctl;
|
||||
|
||||
/* TODO: We're not aware of mappable constraints on gen8 yet */
|
||||
if (!HAS_LMEM(i915)) {
|
||||
ggtt->gmadr = intel_pci_resource(pdev, 2);
|
||||
ggtt->mappable_end = resource_size(&ggtt->gmadr);
|
||||
}
|
||||
|
||||
pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
|
||||
if (IS_CHERRYVIEW(i915))
|
||||
size = chv_get_total_gtt_size(snb_gmch_ctl);
|
||||
else
|
||||
size = gen8_get_total_gtt_size(snb_gmch_ctl);
|
||||
|
||||
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
|
||||
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
|
||||
ggtt->vm.lmem_pt_obj_flags = I915_BO_ALLOC_PM_EARLY;
|
||||
|
||||
ggtt->vm.total = (size / sizeof(gen8_pte_t)) * I915_GTT_PAGE_SIZE;
|
||||
ggtt->vm.cleanup = gen6_gmch_remove;
|
||||
ggtt->vm.insert_page = gen8_ggtt_insert_page;
|
||||
ggtt->vm.clear_range = nop_clear_range;
|
||||
if (intel_scanout_needs_vtd_wa(i915))
|
||||
ggtt->vm.clear_range = gen8_ggtt_clear_range;
|
||||
|
||||
ggtt->vm.insert_entries = gen8_ggtt_insert_entries;
|
||||
|
||||
/*
|
||||
* Serialize GTT updates with aperture access on BXT if VT-d is on,
|
||||
* and always on CHV.
|
||||
*/
|
||||
if (intel_vm_no_concurrent_access_wa(i915)) {
|
||||
ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL;
|
||||
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
|
||||
ggtt->vm.bind_async_flags =
|
||||
I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
|
||||
}
|
||||
|
||||
ggtt->invalidate = gen8_ggtt_invalidate;
|
||||
|
||||
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
|
||||
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
|
||||
|
||||
ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
|
||||
|
||||
setup_private_pat(ggtt->vm.gt->uncore);
|
||||
|
||||
return ggtt_probe_common(ggtt, size);
|
||||
}
|
||||
|
||||
int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915)
|
||||
{
|
||||
if (GRAPHICS_VER(i915) < 6 && !intel_enable_gtt())
|
||||
return -EIO;
|
||||
|
||||
return 0;
|
||||
}
|
|
@ -1,46 +0,0 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2022 Intel Corporation
|
||||
*/
|
||||
|
||||
#ifndef __INTEL_GT_GMCH_H__
|
||||
#define __INTEL_GT_GMCH_H__
|
||||
|
||||
#include "intel_gtt.h"
|
||||
|
||||
/* For x86 platforms */
|
||||
#if IS_ENABLED(CONFIG_X86)
|
||||
void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt);
|
||||
int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt);
|
||||
int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt);
|
||||
int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt);
|
||||
int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915);
|
||||
|
||||
/* Stubs for non-x86 platforms */
|
||||
#else
|
||||
static inline void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt)
|
||||
{
|
||||
}
|
||||
static inline int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
/* No HW should be probed for this case yet, return fail */
|
||||
return -ENODEV;
|
||||
}
|
||||
static inline int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
/* No HW should be probed for this case yet, return fail */
|
||||
return -ENODEV;
|
||||
}
|
||||
static inline int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt)
|
||||
{
|
||||
/* No HW should be probed for this case yet, return fail */
|
||||
return -ENODEV;
|
||||
}
|
||||
static inline int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915)
|
||||
{
|
||||
/* No HW should be enabled for this case yet, return fail */
|
||||
return -ENODEV;
|
||||
}
|
||||
#endif
|
||||
|
||||
#endif /* __INTEL_GT_GMCH_H__ */
|
|
@ -193,6 +193,14 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
|
|||
/* Restore masks irqs on RCS, BCS, VCS and VECS engines. */
|
||||
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~0);
|
||||
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, BCS1) || HAS_ENGINE(gt, BCS2))
|
||||
intel_uncore_write(uncore, XEHPC_BCS1_BCS2_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, BCS3) || HAS_ENGINE(gt, BCS4))
|
||||
intel_uncore_write(uncore, XEHPC_BCS3_BCS4_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, BCS5) || HAS_ENGINE(gt, BCS6))
|
||||
intel_uncore_write(uncore, XEHPC_BCS5_BCS6_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, BCS7) || HAS_ENGINE(gt, BCS8))
|
||||
intel_uncore_write(uncore, XEHPC_BCS7_BCS8_INTR_MASK, ~0);
|
||||
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0);
|
||||
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0);
|
||||
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
|
||||
|
@ -248,6 +256,14 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
|
|||
/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
|
||||
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~smask);
|
||||
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
|
||||
if (HAS_ENGINE(gt, BCS1) || HAS_ENGINE(gt, BCS2))
|
||||
intel_uncore_write(uncore, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
|
||||
if (HAS_ENGINE(gt, BCS3) || HAS_ENGINE(gt, BCS4))
|
||||
intel_uncore_write(uncore, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
|
||||
if (HAS_ENGINE(gt, BCS5) || HAS_ENGINE(gt, BCS6))
|
||||
intel_uncore_write(uncore, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
|
||||
if (HAS_ENGINE(gt, BCS7) || HAS_ENGINE(gt, BCS8))
|
||||
intel_uncore_write(uncore, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
|
||||
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
|
||||
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
|
||||
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
|
||||
|
|
|
@ -0,0 +1,497 @@
|
|||
// SPDX-License-Identifier: MIT
|
||||
/*
|
||||
* Copyright © 2022 Intel Corporation
|
||||
*/
|
||||
|
||||
#include "i915_drv.h"
|
||||
|
||||
#include "intel_gt_mcr.h"
|
||||
#include "intel_gt_regs.h"
|
||||
|
||||
/**
|
||||
* DOC: GT Multicast/Replicated (MCR) Register Support
|
||||
*
|
||||
* Some GT registers are designed as "multicast" or "replicated" registers:
|
||||
* multiple instances of the same register share a single MMIO offset. MCR
|
||||
* registers are generally used when the hardware needs to potentially track
|
||||
* independent values of a register per hardware unit (e.g., per-subslice,
|
||||
* per-L3bank, etc.). The specific types of replication that exist vary
|
||||
* per-platform.
|
||||
*
|
||||
* MMIO accesses to MCR registers are controlled according to the settings
|
||||
* programmed in the platform's MCR_SELECTOR register(s). MMIO writes to MCR
|
||||
* registers can be done in either a multicast form (i.e., a single write updates all
|
||||
* instances of the register to the same value) or unicast (a write updates only
|
||||
* one specific instance). Reads of MCR registers always operate in a unicast
|
||||
* manner regardless of how the multicast/unicast bit is set in MCR_SELECTOR.
|
||||
* Selection of a specific MCR instance for unicast operations is referred to
|
||||
* as "steering."
|
||||
*
|
||||
* If MCR register operations are steered toward a hardware unit that is
|
||||
* fused off or currently powered down due to power gating, the MMIO operation
|
||||
* is "terminated" by the hardware. Terminated read operations will return a
|
||||
* value of zero and terminated unicast write operations will be silently
|
||||
* ignored.
|
||||
*/
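
For illustration only (not part of this series), here is a minimal sketch of how driver code might use the steering helpers introduced below. The register and bit names are existing i915 definitions that also appear in the register header changes later in this merge; the calling function itself is hypothetical.

/* Hypothetical caller, shown only to illustrate the new MCR helpers. */
static void example_mcr_usage(struct intel_gt *gt)
{
	u32 val;

	/*
	 * Read one non-terminated instance of a replicated register; the
	 * steering group/instance is chosen from gt->steering_table.
	 */
	val = intel_gt_mcr_read_any(gt, XEHP_L3SCQREG7);

	/* A multicast write updates every instance with a single MMIO write. */
	intel_gt_mcr_multicast_write(gt, XEHP_L3SCQREG7,
				     val | BLEND_FILL_CACHING_OPT_DIS);

	/* A unicast write updates only the explicitly steered instance. */
	intel_gt_mcr_unicast_write(gt, XEHP_L3SCQREG7, val, 0, 0);
}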
|
||||
|
||||
#define HAS_MSLICE_STEERING(dev_priv) (INTEL_INFO(dev_priv)->has_mslice_steering)
|
||||
|
||||
static const char * const intel_steering_types[] = {
|
||||
"L3BANK",
|
||||
"MSLICE",
|
||||
"LNCF",
|
||||
"INSTANCE 0",
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range icl_l3bank_steering_table[] = {
|
||||
{ 0x00B100, 0x00B3FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
|
||||
{ 0x004000, 0x004AFF },
|
||||
{ 0x00C800, 0x00CFFF },
|
||||
{ 0x00DD00, 0x00DDFF },
|
||||
{ 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
|
||||
{ 0x00B000, 0x00B0FF },
|
||||
{ 0x00D800, 0x00D8FF },
|
||||
{},
|
||||
};
|
||||
|
||||
static const struct intel_mmio_range dg2_lncf_steering_table[] = {
|
||||
{ 0x00B000, 0x00B0FF },
|
||||
{ 0x00D880, 0x00D8FF },
|
||||
{},
|
||||
};
|
||||
|
||||
/*
|
||||
* We have several types of MCR registers on PVC where steering to (0,0)
|
||||
* will always provide us with a non-terminated value. We'll stick them
|
||||
* all in the same table for simplicity.
|
||||
*/
|
||||
static const struct intel_mmio_range pvc_instance0_steering_table[] = {
|
||||
{ 0x004000, 0x004AFF }, /* HALF-BSLICE */
|
||||
{ 0x008800, 0x00887F }, /* CC */
|
||||
{ 0x008A80, 0x008AFF }, /* TILEPSMI */
|
||||
{ 0x00B000, 0x00B0FF }, /* HALF-BSLICE */
|
||||
{ 0x00B100, 0x00B3FF }, /* L3BANK */
|
||||
{ 0x00C800, 0x00CFFF }, /* HALF-BSLICE */
|
||||
{ 0x00D800, 0x00D8FF }, /* HALF-BSLICE */
|
||||
{ 0x00DD00, 0x00DDFF }, /* BSLICE */
|
||||
{ 0x00E900, 0x00E9FF }, /* HALF-BSLICE */
|
||||
{ 0x00EC00, 0x00EEFF }, /* HALF-BSLICE */
|
||||
{ 0x00F000, 0x00FFFF }, /* HALF-BSLICE */
|
||||
{ 0x024180, 0x0241FF }, /* HALF-BSLICE */
|
||||
{},
|
||||
};
|
||||
|
||||
void intel_gt_mcr_init(struct intel_gt *gt)
|
||||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
|
||||
/*
|
||||
* An mslice is unavailable only if both the meml3 for the slice is
|
||||
* disabled *and* all of the DSS in the slice (quadrant) are disabled.
|
||||
*/
|
||||
if (HAS_MSLICE_STEERING(i915)) {
|
||||
gt->info.mslice_mask =
|
||||
intel_slicemask_from_xehp_dssmask(gt->info.sseu.subslice_mask,
|
||||
GEN_DSS_PER_MSLICE);
|
||||
gt->info.mslice_mask |=
|
||||
(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN12_MEML3_EN_MASK);
|
||||
|
||||
if (!gt->info.mslice_mask) /* should be impossible! */
|
||||
drm_warn(&i915->drm, "mslice mask all zero!\n");
|
||||
}
|
||||
|
||||
if (IS_PONTEVECCHIO(i915)) {
|
||||
gt->steering_table[INSTANCE0] = pvc_instance0_steering_table;
|
||||
} else if (IS_DG2(i915)) {
|
||||
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
|
||||
gt->steering_table[LNCF] = dg2_lncf_steering_table;
|
||||
} else if (IS_XEHPSDV(i915)) {
|
||||
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
|
||||
gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
|
||||
} else if (GRAPHICS_VER(i915) >= 11 &&
|
||||
GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
|
||||
gt->steering_table[L3BANK] = icl_l3bank_steering_table;
|
||||
gt->info.l3bank_mask =
|
||||
~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
|
||||
GEN10_L3BANK_MASK;
|
||||
if (!gt->info.l3bank_mask) /* should be impossible! */
|
||||
drm_warn(&i915->drm, "L3 bank mask is all zero!\n");
|
||||
} else if (GRAPHICS_VER(i915) >= 11) {
|
||||
/*
|
||||
* We expect all modern platforms to have at least some
|
||||
* type of steering that needs to be initialized.
|
||||
*/
|
||||
MISSING_CASE(INTEL_INFO(i915)->platform);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* rw_with_mcr_steering_fw - Access a register with specific MCR steering
|
||||
* @uncore: pointer to struct intel_uncore
|
||||
* @reg: register being accessed
|
||||
* @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
|
||||
* @group: group number (documented as "sliceid" on older platforms)
|
||||
* @instance: instance number (documented as "subsliceid" on older platforms)
|
||||
* @value: register value to be written (ignored for read)
|
||||
*
|
||||
* Return: 0 for write access, register value for read access.
|
||||
*
|
||||
* Caller needs to make sure the relevant forcewake wells are up.
|
||||
*/
|
||||
static u32 rw_with_mcr_steering_fw(struct intel_uncore *uncore,
|
||||
i915_reg_t reg, u8 rw_flag,
|
||||
int group, int instance, u32 value)
|
||||
{
|
||||
u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
|
||||
|
||||
lockdep_assert_held(&uncore->lock);
|
||||
|
||||
if (GRAPHICS_VER(uncore->i915) >= 11) {
|
||||
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
|
||||
mcr_ss = GEN11_MCR_SLICE(group) | GEN11_MCR_SUBSLICE(instance);
|
||||
|
||||
/*
|
||||
* Wa_22013088509
|
||||
*
|
||||
* The setting of the multicast/unicast bit usually wouldn't
|
||||
* matter for read operations (which always return the value
|
||||
* from a single register instance regardless of how that bit
|
||||
* is set), but some platforms have a workaround requiring us
|
||||
* to remain in multicast mode for reads. There's no real
|
||||
* downside to this, so we'll just go ahead and do so on all
|
||||
* platforms; we'll only clear the multicast bit from the mask
|
||||
* when explicitly doing a write operation.
|
||||
*/
|
||||
if (rw_flag == FW_REG_WRITE)
|
||||
mcr_mask |= GEN11_MCR_MULTICAST;
|
||||
} else {
|
||||
mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
|
||||
mcr_ss = GEN8_MCR_SLICE(group) | GEN8_MCR_SUBSLICE(instance);
|
||||
}
|
||||
|
||||
old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
|
||||
|
||||
mcr &= ~mcr_mask;
|
||||
mcr |= mcr_ss;
|
||||
intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
|
||||
|
||||
if (rw_flag == FW_REG_READ)
|
||||
val = intel_uncore_read_fw(uncore, reg);
|
||||
else
|
||||
intel_uncore_write_fw(uncore, reg, value);
|
||||
|
||||
mcr &= ~mcr_mask;
|
||||
mcr |= old_mcr & mcr_mask;
|
||||
|
||||
intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
|
||||
|
||||
return val;
|
||||
}
|
||||
|
||||
static u32 rw_with_mcr_steering(struct intel_uncore *uncore,
|
||||
i915_reg_t reg, u8 rw_flag,
|
||||
int group, int instance,
|
||||
u32 value)
|
||||
{
|
||||
enum forcewake_domains fw_domains;
|
||||
u32 val;
|
||||
|
||||
fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
|
||||
rw_flag);
|
||||
fw_domains |= intel_uncore_forcewake_for_reg(uncore,
|
||||
GEN8_MCR_SELECTOR,
|
||||
FW_REG_READ | FW_REG_WRITE);
|
||||
|
||||
spin_lock_irq(&uncore->lock);
|
||||
intel_uncore_forcewake_get__locked(uncore, fw_domains);
|
||||
|
||||
val = rw_with_mcr_steering_fw(uncore, reg, rw_flag, group, instance, value);
|
||||
|
||||
intel_uncore_forcewake_put__locked(uncore, fw_domains);
|
||||
spin_unlock_irq(&uncore->lock);
|
||||
|
||||
return val;
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_read - read a specific instance of an MCR register
|
||||
* @gt: GT structure
|
||||
* @reg: the MCR register to read
|
||||
* @group: the MCR group
|
||||
* @instance: the MCR instance
|
||||
*
|
||||
* Returns the value read from an MCR register after steering toward a specific
|
||||
* group/instance.
|
||||
*/
|
||||
u32 intel_gt_mcr_read(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
int group, int instance)
|
||||
{
|
||||
return rw_with_mcr_steering(gt->uncore, reg, FW_REG_READ, group, instance, 0);
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_unicast_write - write a specific instance of an MCR register
|
||||
* @gt: GT structure
|
||||
* @reg: the MCR register to write
|
||||
* @value: value to write
|
||||
* @group: the MCR group
|
||||
* @instance: the MCR instance
|
||||
*
|
||||
* Write an MCR register in unicast mode after steering toward a specific
|
||||
* group/instance.
|
||||
*/
|
||||
void intel_gt_mcr_unicast_write(struct intel_gt *gt, i915_reg_t reg, u32 value,
|
||||
int group, int instance)
|
||||
{
|
||||
rw_with_mcr_steering(gt->uncore, reg, FW_REG_WRITE, group, instance, value);
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_multicast_write - write a value to all instances of an MCR register
|
||||
* @gt: GT structure
|
||||
* @reg: the MCR register to write
|
||||
* @value: value to write
|
||||
*
|
||||
* Write an MCR register in multicast mode to update all instances.
|
||||
*/
|
||||
void intel_gt_mcr_multicast_write(struct intel_gt *gt,
|
||||
i915_reg_t reg, u32 value)
|
||||
{
|
||||
intel_uncore_write(gt->uncore, reg, value);
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_multicast_write_fw - write a value to all instances of an MCR register
|
||||
* @gt: GT structure
|
||||
* @reg: the MCR register to write
|
||||
* @value: value to write
|
||||
*
|
||||
* Write an MCR register in multicast mode to update all instances. This
|
||||
* function assumes the caller is already holding any necessary forcewake
|
||||
* domains; use intel_gt_mcr_multicast_write() in cases where forcewake should
|
||||
* be obtained automatically.
|
||||
*/
|
||||
void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt, i915_reg_t reg, u32 value)
|
||||
{
|
||||
intel_uncore_write_fw(gt->uncore, reg, value);
|
||||
}
|
||||
|
||||
/*
|
||||
* reg_needs_read_steering - determine whether a register read requires
|
||||
* explicit steering
|
||||
* @gt: GT structure
|
||||
* @reg: the register to check steering requirements for
|
||||
* @type: type of multicast steering to check
|
||||
*
|
||||
* Determines whether @reg needs explicit steering of a specific type for
|
||||
* reads.
|
||||
*
|
||||
* Returns false if @reg does not belong to a register range of the given
|
||||
* steering type, or if the default (subslice-based) steering IDs are suitable
|
||||
* for @type steering too.
|
||||
*/
|
||||
static bool reg_needs_read_steering(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
enum intel_steering_type type)
|
||||
{
|
||||
const u32 offset = i915_mmio_reg_offset(reg);
|
||||
const struct intel_mmio_range *entry;
|
||||
|
||||
if (likely(!gt->steering_table[type]))
|
||||
return false;
|
||||
|
||||
for (entry = gt->steering_table[type]; entry->end; entry++) {
|
||||
if (offset >= entry->start && offset <= entry->end)
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
/*
|
||||
* get_nonterminated_steering - determines valid IDs for a class of MCR steering
|
||||
* @gt: GT structure
|
||||
* @type: multicast register type
|
||||
* @group: Group ID returned
|
||||
* @instance: Instance ID returned
|
||||
*
|
||||
* Determines group and instance values that will steer reads of the specified
|
||||
* MCR class to a non-terminated instance.
|
||||
*/
|
||||
static void get_nonterminated_steering(struct intel_gt *gt,
|
||||
enum intel_steering_type type,
|
||||
u8 *group, u8 *instance)
|
||||
{
|
||||
switch (type) {
|
||||
case L3BANK:
|
||||
*group = 0; /* unused */
|
||||
*instance = __ffs(gt->info.l3bank_mask);
|
||||
break;
|
||||
case MSLICE:
|
||||
GEM_WARN_ON(!HAS_MSLICE_STEERING(gt->i915));
|
||||
*group = __ffs(gt->info.mslice_mask);
|
||||
*instance = 0; /* unused */
|
||||
break;
|
||||
case LNCF:
|
||||
/*
|
||||
* An LNCF is always present if its mslice is present, so we
|
||||
* can safely just steer to LNCF 0 in all cases.
|
||||
*/
|
||||
GEM_WARN_ON(!HAS_MSLICE_STEERING(gt->i915));
|
||||
*group = __ffs(gt->info.mslice_mask) << 1;
|
||||
*instance = 0; /* unused */
|
||||
break;
|
||||
case INSTANCE0:
|
||||
/*
|
||||
* There are a lot of MCR types for which instance (0, 0)
|
||||
* will always provide a non-terminated value.
|
||||
*/
|
||||
*group = 0;
|
||||
*instance = 0;
|
||||
break;
|
||||
default:
|
||||
MISSING_CASE(type);
|
||||
*group = 0;
|
||||
*instance = 0;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_get_nonterminated_steering - find group/instance values that
|
||||
* will steer a register to a non-terminated instance
|
||||
* @gt: GT structure
|
||||
* @reg: register for which the steering is required
|
||||
* @group: return variable for group steering
|
||||
* @instance: return variable for instance steering
|
||||
*
|
||||
* This function returns a group/instance pair that is guaranteed to work for
|
||||
* read steering of the given register. Note that a value will be returned even
|
||||
* if the register is not replicated and therefore does not actually require
|
||||
* steering.
|
||||
*/
|
||||
void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
u8 *group, u8 *instance)
|
||||
{
|
||||
int type;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (reg_needs_read_steering(gt, reg, type)) {
|
||||
get_nonterminated_steering(gt, type, group, instance);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
*group = gt->default_steering.groupid;
|
||||
*instance = gt->default_steering.instanceid;
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_read_any_fw - reads one instance of an MCR register
|
||||
* @gt: GT structure
|
||||
* @reg: register to read
|
||||
*
|
||||
* Reads a GT MCR register. The read will be steered to a non-terminated
|
||||
* instance (i.e., one that isn't fused off or powered down by power gating).
|
||||
* This function assumes the caller is already holding any necessary forcewake
|
||||
* domains; use intel_gt_mcr_read_any() in cases where forcewake should be
|
||||
* obtained automatically.
|
||||
*
|
||||
* Returns the value from a non-terminated instance of @reg.
|
||||
*/
|
||||
u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg)
|
||||
{
|
||||
int type;
|
||||
u8 group, instance;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (reg_needs_read_steering(gt, reg, type)) {
|
||||
get_nonterminated_steering(gt, type, &group, &instance);
|
||||
return rw_with_mcr_steering_fw(gt->uncore, reg,
|
||||
FW_REG_READ,
|
||||
group, instance, 0);
|
||||
}
|
||||
}
|
||||
|
||||
return intel_uncore_read_fw(gt->uncore, reg);
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_gt_mcr_read_any - reads one instance of an MCR register
|
||||
* @gt: GT structure
|
||||
* @reg: register to read
|
||||
*
|
||||
* Reads a GT MCR register. The read will be steered to a non-terminated
|
||||
* instance (i.e., one that isn't fused off or powered down by power gating).
|
||||
*
|
||||
* Returns the value from a non-terminated instance of @reg.
|
||||
*/
|
||||
u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg)
|
||||
{
|
||||
int type;
|
||||
u8 group, instance;
|
||||
|
||||
for (type = 0; type < NUM_STEERING_TYPES; type++) {
|
||||
if (reg_needs_read_steering(gt, reg, type)) {
|
||||
get_nonterminated_steering(gt, type, &group, &instance);
|
||||
return rw_with_mcr_steering(gt->uncore, reg,
|
||||
FW_REG_READ,
|
||||
group, instance, 0);
|
||||
}
|
||||
}
|
||||
|
||||
return intel_uncore_read(gt->uncore, reg);
|
||||
}
|
||||
|
||||
static void report_steering_type(struct drm_printer *p,
|
||||
struct intel_gt *gt,
|
||||
enum intel_steering_type type,
|
||||
bool dump_table)
|
||||
{
|
||||
const struct intel_mmio_range *entry;
|
||||
u8 group, instance;
|
||||
|
||||
BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
|
||||
|
||||
if (!gt->steering_table[type]) {
|
||||
drm_printf(p, "%s steering: uses default steering\n",
|
||||
intel_steering_types[type]);
|
||||
return;
|
||||
}
|
||||
|
||||
get_nonterminated_steering(gt, type, &group, &instance);
|
||||
drm_printf(p, "%s steering: group=0x%x, instance=0x%x\n",
|
||||
intel_steering_types[type], group, instance);
|
||||
|
||||
if (!dump_table)
|
||||
return;
|
||||
|
||||
for (entry = gt->steering_table[type]; entry->end; entry++)
|
||||
drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
|
||||
}
|
||||
|
||||
void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
|
||||
bool dump_table)
|
||||
{
|
||||
drm_printf(p, "Default steering: group=0x%x, instance=0x%x\n",
|
||||
gt->default_steering.groupid,
|
||||
gt->default_steering.instanceid);
|
||||
|
||||
if (IS_PONTEVECCHIO(gt->i915)) {
|
||||
report_steering_type(p, gt, INSTANCE0, dump_table);
|
||||
} else if (HAS_MSLICE_STEERING(gt->i915)) {
|
||||
report_steering_type(p, gt, MSLICE, dump_table);
|
||||
report_steering_type(p, gt, LNCF, dump_table);
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,34 @@
|
|||
/* SPDX-License-Identifier: MIT */
|
||||
/*
|
||||
* Copyright © 2022 Intel Corporation
|
||||
*/
|
||||
|
||||
#ifndef __INTEL_GT_MCR__
|
||||
#define __INTEL_GT_MCR__
|
||||
|
||||
#include "intel_gt_types.h"
|
||||
|
||||
void intel_gt_mcr_init(struct intel_gt *gt);
|
||||
|
||||
u32 intel_gt_mcr_read(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
int group, int instance);
|
||||
u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
|
||||
u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
|
||||
|
||||
void intel_gt_mcr_unicast_write(struct intel_gt *gt,
|
||||
i915_reg_t reg, u32 value,
|
||||
int group, int instance);
|
||||
void intel_gt_mcr_multicast_write(struct intel_gt *gt,
|
||||
i915_reg_t reg, u32 value);
|
||||
void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
|
||||
i915_reg_t reg, u32 value);
|
||||
|
||||
void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
|
||||
i915_reg_t reg,
|
||||
u8 *group, u8 *instance);
|
||||
|
||||
void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
|
||||
bool dump_table);
|
||||
|
||||
#endif /* __INTEL_GT_MCR__ */
|
|
@ -100,14 +100,16 @@ static int vlv_drpc(struct seq_file *m)
|
|||
{
|
||||
struct intel_gt *gt = m->private;
|
||||
struct intel_uncore *uncore = gt->uncore;
|
||||
u32 rcctl1, pw_status;
|
||||
u32 rcctl1, pw_status, mt_fwake_req;
|
||||
|
||||
mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
|
||||
pw_status = intel_uncore_read(uncore, VLV_GTLC_PW_STATUS);
|
||||
rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);
|
||||
|
||||
seq_printf(m, "RC6 Enabled: %s\n",
|
||||
str_yes_no(rcctl1 & (GEN7_RC_CTL_TO_MODE |
|
||||
GEN6_RC_CTL_EI_MODE(1))));
|
||||
seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);
|
||||
seq_printf(m, "Render Power Well: %s\n",
|
||||
(pw_status & VLV_GTLC_PW_RENDER_STATUS_MASK) ? "Up" : "Down");
|
||||
seq_printf(m, "Media Power Well: %s\n",
|
||||
|
@ -124,9 +126,10 @@ static int gen6_drpc(struct seq_file *m)
|
|||
struct intel_gt *gt = m->private;
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
struct intel_uncore *uncore = gt->uncore;
|
||||
u32 gt_core_status, rcctl1, rc6vids = 0;
|
||||
u32 gt_core_status, mt_fwake_req, rcctl1, rc6vids = 0;
|
||||
u32 gen9_powergate_enable = 0, gen9_powergate_status = 0;
|
||||
|
||||
mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
|
||||
gt_core_status = intel_uncore_read_fw(uncore, GEN6_GT_CORE_STATUS);
|
||||
|
||||
rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);
|
||||
|
@ -178,6 +181,7 @@ static int gen6_drpc(struct seq_file *m)
|
|||
|
||||
seq_printf(m, "Core Power Down: %s\n",
|
||||
str_yes_no(gt_core_status & GEN6_CORE_CPD_STATE_MASK));
|
||||
seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);
|
||||
if (GRAPHICS_VER(i915) >= 9) {
|
||||
seq_printf(m, "Render Power Well: %s\n",
|
||||
(gen9_powergate_status &
|
||||
|
|
|
@ -140,6 +140,7 @@
|
|||
#define FF_SLICE_CS_CHICKEN2 _MMIO(0x20e4)
|
||||
#define GEN9_TSG_BARRIER_ACK_DISABLE (1 << 8)
|
||||
#define GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE (1 << 10)
|
||||
#define GEN12_PERF_FIX_BALANCING_CFE_DISABLE REG_BIT(15)
|
||||
|
||||
#define GEN9_CS_DEBUG_MODE1 _MMIO(0x20ec)
|
||||
#define FF_DOP_CLOCK_GATE_DISABLE REG_BIT(1)
|
||||
|
@ -323,8 +324,11 @@
|
|||
|
||||
#define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4)
|
||||
|
||||
#define XEHPSDV_FLAT_CCS_BASE_ADDR _MMIO(0x4910)
|
||||
#define XEHPSDV_CCS_BASE_SHIFT 8
|
||||
#define XEHP_TILE0_ADDR_RANGE _MMIO(0x4900)
|
||||
#define XEHP_TILE_LMEM_RANGE_SHIFT 8
|
||||
|
||||
#define XEHP_FLAT_CCS_BASE_ADDR _MMIO(0x4910)
|
||||
#define XEHP_CCS_BASE_SHIFT 8
|
||||
|
||||
#define GAMTARBMODE _MMIO(0x4a08)
|
||||
#define ARB_MODE_BWGTLB_DISABLE (1 << 9)
|
||||
|
@ -561,6 +565,7 @@
|
|||
#define GEN11_GT_VEBOX_DISABLE_MASK (0x0f << GEN11_GT_VEBOX_DISABLE_SHIFT)
|
||||
|
||||
#define GEN12_GT_COMPUTE_DSS_ENABLE _MMIO(0x9144)
|
||||
#define XEHPC_GT_COMPUTE_DSS_ENABLE_EXT _MMIO(0x9148)
|
||||
|
||||
#define GEN6_UCGCTL1 _MMIO(0x9400)
|
||||
#define GEN6_GAMUNIT_CLOCK_GATE_DISABLE (1 << 22)
|
||||
|
@ -597,24 +602,32 @@
|
|||
/* GEN11 changed all bit defs except for FULL & RENDER */
|
||||
#define GEN11_GRDOM_FULL GEN6_GRDOM_FULL
|
||||
#define GEN11_GRDOM_RENDER GEN6_GRDOM_RENDER
|
||||
#define GEN11_GRDOM_BLT (1 << 2)
|
||||
#define GEN11_GRDOM_GUC (1 << 3)
|
||||
#define GEN11_GRDOM_MEDIA (1 << 5)
|
||||
#define GEN11_GRDOM_MEDIA2 (1 << 6)
|
||||
#define GEN11_GRDOM_MEDIA3 (1 << 7)
|
||||
#define GEN11_GRDOM_MEDIA4 (1 << 8)
|
||||
#define GEN11_GRDOM_MEDIA5 (1 << 9)
|
||||
#define GEN11_GRDOM_MEDIA6 (1 << 10)
|
||||
#define GEN11_GRDOM_MEDIA7 (1 << 11)
|
||||
#define GEN11_GRDOM_MEDIA8 (1 << 12)
|
||||
#define GEN11_GRDOM_VECS (1 << 13)
|
||||
#define GEN11_GRDOM_VECS2 (1 << 14)
|
||||
#define GEN11_GRDOM_VECS3 (1 << 15)
|
||||
#define GEN11_GRDOM_VECS4 (1 << 16)
|
||||
#define GEN11_GRDOM_SFC0 (1 << 17)
|
||||
#define GEN11_GRDOM_SFC1 (1 << 18)
|
||||
#define GEN11_GRDOM_SFC2 (1 << 19)
|
||||
#define GEN11_GRDOM_SFC3 (1 << 20)
|
||||
#define XEHPC_GRDOM_BLT8 REG_BIT(31)
|
||||
#define XEHPC_GRDOM_BLT7 REG_BIT(30)
|
||||
#define XEHPC_GRDOM_BLT6 REG_BIT(29)
|
||||
#define XEHPC_GRDOM_BLT5 REG_BIT(28)
|
||||
#define XEHPC_GRDOM_BLT4 REG_BIT(27)
|
||||
#define XEHPC_GRDOM_BLT3 REG_BIT(26)
|
||||
#define XEHPC_GRDOM_BLT2 REG_BIT(25)
|
||||
#define XEHPC_GRDOM_BLT1 REG_BIT(24)
|
||||
#define GEN11_GRDOM_SFC3 REG_BIT(20)
|
||||
#define GEN11_GRDOM_SFC2 REG_BIT(19)
|
||||
#define GEN11_GRDOM_SFC1 REG_BIT(18)
|
||||
#define GEN11_GRDOM_SFC0 REG_BIT(17)
|
||||
#define GEN11_GRDOM_VECS4 REG_BIT(16)
|
||||
#define GEN11_GRDOM_VECS3 REG_BIT(15)
|
||||
#define GEN11_GRDOM_VECS2 REG_BIT(14)
|
||||
#define GEN11_GRDOM_VECS REG_BIT(13)
|
||||
#define GEN11_GRDOM_MEDIA8 REG_BIT(12)
|
||||
#define GEN11_GRDOM_MEDIA7 REG_BIT(11)
|
||||
#define GEN11_GRDOM_MEDIA6 REG_BIT(10)
|
||||
#define GEN11_GRDOM_MEDIA5 REG_BIT(9)
|
||||
#define GEN11_GRDOM_MEDIA4 REG_BIT(8)
|
||||
#define GEN11_GRDOM_MEDIA3 REG_BIT(7)
|
||||
#define GEN11_GRDOM_MEDIA2 REG_BIT(6)
|
||||
#define GEN11_GRDOM_MEDIA REG_BIT(5)
|
||||
#define GEN11_GRDOM_GUC REG_BIT(3)
|
||||
#define GEN11_GRDOM_BLT REG_BIT(2)
|
||||
#define GEN11_VCS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << ((instance) >> 1))
|
||||
#define GEN11_VECS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << (instance))
|
||||
|
||||
|
@ -622,6 +635,7 @@
|
|||
|
||||
#define GEN7_MISCCPCTL _MMIO(0x9424)
|
||||
#define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0)
|
||||
#define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1)
|
||||
#define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2)
|
||||
#define GEN8_DOP_CLOCK_GATE_GUC_ENABLE (1 << 4)
|
||||
#define GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE (1 << 6)
|
||||
|
@ -732,6 +746,7 @@
|
|||
#define GEN6_AGGRESSIVE_TURBO (0 << 15)
|
||||
#define GEN9_SW_REQ_UNSLICE_RATIO_SHIFT 23
|
||||
#define GEN9_IGNORE_SLICE_RATIO (0 << 0)
|
||||
#define GEN12_MEDIA_FREQ_RATIO REG_BIT(13)
|
||||
|
||||
#define GEN6_RC_VIDEO_FREQ _MMIO(0xa00c)
|
||||
#define GEN6_RC_CTL_RC6pp_ENABLE (1 << 16)
|
||||
|
@ -969,6 +984,11 @@
|
|||
#define XEHP_L3SCQREG7 _MMIO(0xb188)
|
||||
#define BLEND_FILL_CACHING_OPT_DIS REG_BIT(3)
|
||||
|
||||
#define XEHPC_L3SCRUB _MMIO(0xb18c)
|
||||
#define SCRUB_CL_DWNGRADE_SHARED REG_BIT(12)
|
||||
#define SCRUB_RATE_PER_BANK_MASK REG_GENMASK(2, 0)
|
||||
#define SCRUB_RATE_4B_PER_CLK REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6)
|
||||
|
||||
#define L3SQCREG1_CCS0 _MMIO(0xb200)
|
||||
#define FLUSHALLNONCOH REG_BIT(5)
|
||||
|
||||
|
@ -1060,8 +1080,10 @@
|
|||
#define GEN9_ENABLE_GPGPU_PREEMPTION REG_BIT(2)
|
||||
|
||||
#define GEN10_CACHE_MODE_SS _MMIO(0xe420)
|
||||
#define ENABLE_PREFETCH_INTO_IC REG_BIT(3)
|
||||
#define ENABLE_EU_COUNT_FOR_TDL_FLUSH REG_BIT(10)
|
||||
#define DISABLE_ECC REG_BIT(5)
|
||||
#define FLOAT_BLEND_OPTIMIZATION_ENABLE REG_BIT(4)
|
||||
#define ENABLE_PREFETCH_INTO_IC REG_BIT(3)
|
||||
|
||||
#define EU_PERF_CNTL0 _MMIO(0xe458)
|
||||
#define EU_PERF_CNTL4 _MMIO(0xe45c)
|
||||
|
@ -1476,6 +1498,14 @@
|
|||
#define GEN11_KCR (19)
|
||||
#define GEN11_GTPM (16)
|
||||
#define GEN11_BCS (15)
|
||||
#define XEHPC_BCS1 (14)
|
||||
#define XEHPC_BCS2 (13)
|
||||
#define XEHPC_BCS3 (12)
|
||||
#define XEHPC_BCS4 (11)
|
||||
#define XEHPC_BCS5 (10)
|
||||
#define XEHPC_BCS6 (9)
|
||||
#define XEHPC_BCS7 (8)
|
||||
#define XEHPC_BCS8 (23)
|
||||
#define GEN12_CCS3 (7)
|
||||
#define GEN12_CCS2 (6)
|
||||
#define GEN12_CCS1 (5)
|
||||
|
@ -1521,6 +1551,10 @@
|
|||
#define GEN11_GUNIT_CSME_INTR_MASK _MMIO(0x1900f4)
|
||||
#define GEN12_CCS0_CCS1_INTR_MASK _MMIO(0x190100)
|
||||
#define GEN12_CCS2_CCS3_INTR_MASK _MMIO(0x190104)
|
||||
#define XEHPC_BCS1_BCS2_INTR_MASK _MMIO(0x190110)
|
||||
#define XEHPC_BCS3_BCS4_INTR_MASK _MMIO(0x190114)
|
||||
#define XEHPC_BCS5_BCS6_INTR_MASK _MMIO(0x190118)
|
||||
#define XEHPC_BCS7_BCS8_INTR_MASK _MMIO(0x19011c)
|
||||
|
||||
#define GEN12_SFC_DONE(n) _MMIO(0x1cc000 + (n) * 0x1000)
|
||||
|
||||
|
|
|
@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
|
|||
|
||||
static struct intel_gt *kobj_to_gt(struct kobject *kobj)
|
||||
{
|
||||
return container_of(kobj, struct kobj_gt, base)->gt;
|
||||
return container_of(kobj, struct intel_gt, sysfs_gt);
|
||||
}
|
||||
|
||||
struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
|
||||
|
@ -72,9 +72,9 @@ static struct attribute *id_attrs[] = {
|
|||
};
|
||||
ATTRIBUTE_GROUPS(id);
|
||||
|
||||
/* A kobject needs a release() method even if it does nothing */
|
||||
static void kobj_gt_release(struct kobject *kobj)
|
||||
{
|
||||
kfree(kobj);
|
||||
}
|
||||
|
||||
static struct kobj_type kobj_gt_type = {
|
||||
|
@ -85,8 +85,6 @@ static struct kobj_type kobj_gt_type = {
|
|||
|
||||
void intel_gt_sysfs_register(struct intel_gt *gt)
|
||||
{
|
||||
struct kobj_gt *kg;
|
||||
|
||||
/*
|
||||
* We need to make things right with the
|
||||
* ABI compatibility. The files were originally
|
||||
|
@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt *gt)
|
|||
if (gt_is_root(gt))
|
||||
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
|
||||
|
||||
kg = kzalloc(sizeof(*kg), GFP_KERNEL);
|
||||
if (!kg)
|
||||
/* init and xfer ownership to sysfs tree */
|
||||
if (kobject_init_and_add(&gt->sysfs_gt, &kobj_gt_type,
|
||||
gt->i915->sysfs_gt, "gt%d", gt->info.id))
|
||||
goto exit_fail;
|
||||
|
||||
kobject_init(&kg->base, &kobj_gt_type);
|
||||
kg->gt = gt;
|
||||
|
||||
/* xfer ownership to sysfs tree */
|
||||
if (kobject_add(&kg->base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
|
||||
goto exit_kobj_put;
|
||||
|
||||
intel_gt_sysfs_pm_init(gt, &kg->base);
|
||||
intel_gt_sysfs_pm_init(gt, &gt->sysfs_gt);
|
||||
|
||||
return;
|
||||
|
||||
exit_kobj_put:
|
||||
kobject_put(&kg->base);
|
||||
|
||||
exit_fail:
|
||||
kobject_put(&gt->sysfs_gt);
|
||||
drm_warn(&gt->i915->drm,
|
||||
"failed to initialize gt%d sysfs root\n", gt->info.id);
|
||||
}
|
||||
|
||||
void intel_gt_sysfs_unregister(struct intel_gt *gt)
|
||||
{
|
||||
kobject_put(&gt->sysfs_gt);
|
||||
}
|
||||
|
|
|
@ -13,11 +13,6 @@
|
|||
|
||||
struct intel_gt;
|
||||
|
||||
struct kobj_gt {
|
||||
struct kobject base;
|
||||
struct intel_gt *gt;
|
||||
};
|
||||
|
||||
bool is_object_gt(struct kobject *kobj);
|
||||
|
||||
struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
|
||||
|
@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
|
|||
const char *name);
|
||||
|
||||
void intel_gt_sysfs_register(struct intel_gt *gt);
|
||||
void intel_gt_sysfs_unregister(struct intel_gt *gt);
|
||||
struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
|
||||
const char *name);
|
||||
|
||||
|
|
|
@ -14,6 +14,7 @@
|
|||
#include "intel_gt_regs.h"
|
||||
#include "intel_gt_sysfs.h"
|
||||
#include "intel_gt_sysfs_pm.h"
|
||||
#include "intel_pcode.h"
|
||||
#include "intel_rc6.h"
|
||||
#include "intel_rps.h"
|
||||
|
||||
|
@ -558,6 +559,174 @@ static const struct attribute *freq_attrs[] = {
|
|||
NULL
|
||||
};
|
||||
|
||||
/*
|
||||
* Scaling for multipliers (aka frequency factors).
|
||||
* The format of the value in the register is u8.8.
|
||||
*
|
||||
* The presentation to userspace is inspired by the perf event framework.
|
||||
* See:
|
||||
* Documentation/ABI/testing/sysfs-bus-event_source-devices-events
|
||||
* for description of:
|
||||
* /sys/bus/event_source/devices/<pmu>/events/<event>.scale
|
||||
*
|
||||
* Summary: Expose two sysfs files for each multiplier.
|
||||
*
|
||||
* 1. File <attr> contains a raw hardware value.
|
||||
* 2. File <attr>.scale contains the multiplicative scale factor to be
|
||||
* used by userspace to compute the actual value.
|
||||
*
|
||||
* So userspace knows that to get the frequency_factor it multiplies the
|
||||
* provided value by the specified scale factor and vice-versa.
|
||||
*
|
||||
* That way there is no precision loss in the kernel interface and API
|
||||
* is future proof should one day the hardware register change to u16.u16,
|
||||
* on some platform. (Or any other fixed point representation.)
|
||||
*
|
||||
* Example:
|
||||
* File <attr> contains the value 2.5, represented as u8.8 0x0280, which
|
||||
* is comprised of:
|
||||
* - an integer part of 2
|
||||
* - a fractional part of 0x80 (representing 0x80 / 2^8 == 0x80 / 256).
|
||||
* File <attr>.scale contains a string representation of floating point
|
||||
* value 0.00390625 (which is (1 / 256)).
|
||||
* Userspace computes the actual value:
|
||||
* 0x0280 * 0.00390625 -> 2.5
|
||||
* or converts an actual value to the value to be written into <attr>:
|
||||
* 2.5 / 0.00390625 -> 0x0280
|
||||
*/
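
As a rough illustration (not part of the patch), the userspace sketch below combines the raw attribute with its .scale file to recover the actual factor; the sysfs path is an assumption for a single-tile card and may differ per platform.

#include <stdio.h>

/* Hypothetical userspace reader for media_freq_factor and its .scale file. */
int main(void)
{
	const char *base = "/sys/class/drm/card0/gt/gt0/"; /* assumed path */
	char path[256];
	unsigned int raw;
	double scale;
	FILE *f;

	snprintf(path, sizeof(path), "%smedia_freq_factor", base);
	f = fopen(path, "r");
	if (!f || fscanf(f, "%u", &raw) != 1)
		return 1;
	fclose(f);

	snprintf(path, sizeof(path), "%smedia_freq_factor.scale", base);
	f = fopen(path, "r");
	if (!f || fscanf(f, "%lf", &scale) != 1)
		return 1;
	fclose(f);

	/* e.g. 0x0280 * 0.00390625 == 2.5 */
	printf("media frequency factor: %g\n", raw * scale);
	return 0;
}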
|
||||
|
||||
#define U8_8_VAL_MASK 0xffff
|
||||
#define U8_8_SCALE_TO_VALUE "0.00390625"
|
||||
|
||||
static ssize_t freq_factor_scale_show(struct device *dev,
|
||||
struct device_attribute *attr,
|
||||
char *buff)
|
||||
{
|
||||
return sysfs_emit(buff, "%s\n", U8_8_SCALE_TO_VALUE);
|
||||
}
|
||||
|
||||
static u32 media_ratio_mode_to_factor(u32 mode)
|
||||
{
|
||||
/* 0 -> 0, 1 -> 256, 2 -> 128 */
|
||||
return !mode ? mode : 256 / mode;
|
||||
}
|
||||
|
||||
static ssize_t media_freq_factor_show(struct device *dev,
|
||||
struct device_attribute *attr,
|
||||
char *buff)
|
||||
{
|
||||
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
|
||||
struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
|
||||
intel_wakeref_t wakeref;
|
||||
u32 mode;
|
||||
|
||||
/*
|
||||
* Retrieve media_ratio_mode from GEN6_RPNSWREQ bit 13 set by
|
||||
* GuC. GEN6_RPNSWREQ:13 value 0 represents 1:2 and 1 represents 1:1
|
||||
*/
|
||||
if (IS_XEHPSDV(gt->i915) &&
|
||||
slpc->media_ratio_mode == SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL) {
|
||||
/*
|
||||
* For XEHPSDV dynamic mode GEN6_RPNSWREQ:13 does not contain
|
||||
* the media_ratio_mode, just return the cached media ratio
|
||||
*/
|
||||
mode = slpc->media_ratio_mode;
|
||||
} else {
|
||||
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
|
||||
mode = intel_uncore_read(gt->uncore, GEN6_RPNSWREQ);
|
||||
mode = REG_FIELD_GET(GEN12_MEDIA_FREQ_RATIO, mode) ?
|
||||
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_ONE :
|
||||
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO;
|
||||
}
|
||||
|
||||
return sysfs_emit(buff, "%u\n", media_ratio_mode_to_factor(mode));
|
||||
}
|
||||
|
||||
static ssize_t media_freq_factor_store(struct device *dev,
|
||||
struct device_attribute *attr,
|
||||
const char *buff, size_t count)
|
||||
{
|
||||
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
|
||||
struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
|
||||
u32 factor, mode;
|
||||
int err;
|
||||
|
||||
err = kstrtou32(buff, 0, &factor);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
for (mode = SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL;
|
||||
mode <= SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO; mode++)
|
||||
if (factor == media_ratio_mode_to_factor(mode))
|
||||
break;
|
||||
|
||||
if (mode > SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO)
|
||||
return -EINVAL;
|
||||
|
||||
err = intel_guc_slpc_set_media_ratio_mode(slpc, mode);
|
||||
if (!err) {
|
||||
slpc->media_ratio_mode = mode;
|
||||
DRM_DEBUG("Set slpc->media_ratio_mode to %d", mode);
|
||||
}
|
||||
return err ?: count;
|
||||
}
|
||||
|
||||
static ssize_t media_RP0_freq_mhz_show(struct device *dev,
|
||||
struct device_attribute *attr,
|
||||
char *buff)
|
||||
{
|
||||
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
|
||||
u32 val;
|
||||
int err;
|
||||
|
||||
err = snb_pcode_read_p(gt->uncore, XEHP_PCODE_FREQUENCY_CONFIG,
|
||||
PCODE_MBOX_FC_SC_READ_FUSED_P0,
|
||||
PCODE_MBOX_DOMAIN_MEDIAFF, &val);
|
||||
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
/* Fused media RP0 read from pcode is in units of 50 MHz */
|
||||
val *= GT_FREQUENCY_MULTIPLIER;
|
||||
|
||||
return sysfs_emit(buff, "%u\n", val);
|
||||
}
|
||||
|
||||
static ssize_t media_RPn_freq_mhz_show(struct device *dev,
|
||||
struct device_attribute *attr,
|
||||
char *buff)
|
||||
{
|
||||
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
|
||||
u32 val;
|
||||
int err;
|
||||
|
||||
err = snb_pcode_read_p(gt->uncore, XEHP_PCODE_FREQUENCY_CONFIG,
|
||||
PCODE_MBOX_FC_SC_READ_FUSED_PN,
|
||||
PCODE_MBOX_DOMAIN_MEDIAFF, &val);
|
||||
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
/* Fused media RPn read from pcode is in units of 50 MHz */
|
||||
val *= GT_FREQUENCY_MULTIPLIER;
|
||||
|
||||
return sysfs_emit(buff, "%u\n", val);
|
||||
}
|
||||
|
||||
static DEVICE_ATTR_RW(media_freq_factor);
|
||||
static struct device_attribute dev_attr_media_freq_factor_scale =
|
||||
__ATTR(media_freq_factor.scale, 0444, freq_factor_scale_show, NULL);
|
||||
static DEVICE_ATTR_RO(media_RP0_freq_mhz);
|
||||
static DEVICE_ATTR_RO(media_RPn_freq_mhz);
|
||||
|
||||
static const struct attribute *media_perf_power_attrs[] = {
|
||||
&dev_attr_media_freq_factor.attr,
|
||||
&dev_attr_media_freq_factor_scale.attr,
|
||||
&dev_attr_media_RP0_freq_mhz.attr,
|
||||
&dev_attr_media_RPn_freq_mhz.attr,
|
||||
NULL
|
||||
};
|
||||
|
||||
static int intel_sysfs_rps_init(struct intel_gt *gt, struct kobject *kobj,
|
||||
const struct attribute * const *attrs)
|
||||
{
|
||||
|
@ -599,4 +768,12 @@ void intel_gt_sysfs_pm_init(struct intel_gt *gt, struct kobject *kobj)
|
|||
drm_warn(&gt->i915->drm,
|
||||
"failed to create gt%u throttle sysfs files (%pe)",
|
||||
gt->info.id, ERR_PTR(ret));
|
||||
|
||||
if (HAS_MEDIA_RATIO_MODE(gt->i915) && intel_uc_uses_guc_slpc(&gt->uc)) {
|
||||
ret = sysfs_create_files(kobj, media_perf_power_attrs);
|
||||
if (ret)
|
||||
drm_warn(&gt->i915->drm,
|
||||
"failed to create gt%u media_perf_power_attrs sysfs (%pe)\n",
|
||||
gt->info.id, ERR_PTR(ret));
|
||||
}
|
||||
}
|
||||
|
|
|
@ -59,6 +59,13 @@ enum intel_steering_type {
|
|||
MSLICE,
|
||||
LNCF,
|
||||
|
||||
/*
|
||||
* On some platforms there are multiple types of MCR registers that
|
||||
* will always return a non-terminated value at instance (0, 0). We'll
|
||||
* lump those all into a single category to keep things simple.
|
||||
*/
|
||||
INSTANCE0,
|
||||
|
||||
NUM_STEERING_TYPES
|
||||
};
|
||||
|
||||
|
@ -221,9 +228,13 @@ struct intel_gt {
|
|||
|
||||
struct {
|
||||
u8 uc_index;
|
||||
u8 wb_index; /* Only used on HAS_L3_CCS_READ() platforms */
|
||||
} mocs;
|
||||
|
||||
struct intel_pxp pxp;
|
||||
|
||||
/* gt/gtN sysfs */
|
||||
struct kobject sysfs_gt;
|
||||
};
|
||||
|
||||
enum intel_gt_scratch_field {
|
||||
|
|
|
@ -306,6 +306,15 @@ struct i915_address_space {
|
|||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 flags);
|
||||
void (*raw_insert_page)(struct i915_address_space *vm,
|
||||
dma_addr_t addr,
|
||||
u64 offset,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 flags);
|
||||
void (*raw_insert_entries)(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 flags);
|
||||
void (*cleanup)(struct i915_address_space *vm);
|
||||
|
||||
void (*foreach)(struct i915_address_space *vm,
|
||||
|
@ -345,6 +354,19 @@ struct i915_ggtt {
|
|||
|
||||
bool do_idle_maps;
|
||||
|
||||
/**
|
||||
* @pte_lost: Are ptes lost on resume?
|
||||
*
|
||||
* Whether the system was recently restored from hibernate and
|
||||
* thus may have lost pte content.
|
||||
*/
|
||||
bool pte_lost;
|
||||
|
||||
/**
|
||||
* @probed_pte: Probed pte value on suspend. Re-checked on resume.
|
||||
*/
|
||||
u64 probed_pte;
|
||||
|
||||
int mtrr;
|
||||
|
||||
/** Bit 6 swizzling required for X tiling */
|
||||
|
@ -548,14 +570,13 @@ i915_page_dir_dma_addr(const struct i915_ppgtt *ppgtt, const unsigned int n)
|
|||
|
||||
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
|
||||
unsigned long lmem_pt_obj_flags);
|
||||
|
||||
void intel_ggtt_bind_vma(struct i915_address_space *vm,
|
||||
struct i915_vm_pt_stash *stash,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 flags);
|
||||
struct i915_vm_pt_stash *stash,
|
||||
struct i915_vma_resource *vma_res,
|
||||
enum i915_cache_level cache_level,
|
||||
u32 flags);
|
||||
void intel_ggtt_unbind_vma(struct i915_address_space *vm,
|
||||
struct i915_vma_resource *vma_res);
|
||||
struct i915_vma_resource *vma_res);
|
||||
|
||||
int i915_ggtt_probe_hw(struct drm_i915_private *i915);
|
||||
int i915_ggtt_init_hw(struct drm_i915_private *i915);
|
||||
|
@ -581,6 +602,17 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm);
|
|||
void i915_ggtt_suspend(struct i915_ggtt *gtt);
|
||||
void i915_ggtt_resume(struct i915_ggtt *ggtt);
|
||||
|
||||
/**
|
||||
* i915_ggtt_mark_pte_lost - Mark ggtt ptes as lost or clear such a marking
|
||||
* @i915 The device private.
|
||||
* @val whether the ptes should be marked as lost.
|
||||
*
|
||||
* In some cases pte content is retained across suspend, but typically lost
|
||||
* across hibernate. Typically they should be marked as lost on
|
||||
* hibernation restore and such marking cleared on suspend.
|
||||
*/
|
||||
void i915_ggtt_mark_pte_lost(struct drm_i915_private *i915, bool val);
|
||||
|
||||
void
|
||||
fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
|
||||
|
||||
|
@ -627,7 +659,6 @@ release_pd_entry(struct i915_page_directory * const pd,
|
|||
struct i915_page_table * const pt,
|
||||
const struct drm_i915_gem_object * const scratch);
|
||||
void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
|
||||
void gen8_ggtt_invalidate(struct i915_ggtt *ggtt);
|
||||
|
||||
void ppgtt_bind_vma(struct i915_address_space *vm,
|
||||
struct i915_vm_pt_stash *stash,
|
||||
|
|
|
@ -111,16 +111,6 @@ enum {
|
|||
#define XEHP_SW_COUNTER_SHIFT 58
|
||||
#define XEHP_SW_COUNTER_WIDTH 6
|
||||
|
||||
static inline u32 lrc_desc_priority(int prio)
|
||||
{
|
||||
if (prio > I915_PRIORITY_NORMAL)
|
||||
return GEN12_CTX_PRIORITY_HIGH;
|
||||
else if (prio < I915_PRIORITY_NORMAL)
|
||||
return GEN12_CTX_PRIORITY_LOW;
|
||||
else
|
||||
return GEN12_CTX_PRIORITY_NORMAL;
|
||||
}
|
||||
|
||||
static inline void lrc_runtime_start(struct intel_context *ce)
|
||||
{
|
||||
struct intel_context_stats *stats = &ce->stats;
|
||||
|
|
|
@ -23,6 +23,7 @@ struct drm_i915_mocs_table {
|
|||
unsigned int n_entries;
|
||||
const struct drm_i915_mocs_entry *table;
|
||||
u8 uc_index;
|
||||
u8 wb_index; /* Only used on HAS_L3_CCS_READ() platforms */
|
||||
u8 unused_entries_index;
|
||||
};
|
||||
|
||||
|
@ -47,6 +48,7 @@ struct drm_i915_mocs_table {
|
|||
|
||||
/* Helper defines */
|
||||
#define GEN9_NUM_MOCS_ENTRIES 64 /* 63-64 are reserved, but configured. */
|
||||
#define PVC_NUM_MOCS_ENTRIES 3
|
||||
|
||||
/* (e)LLC caching options */
|
||||
/*
|
||||
|
@ -394,6 +396,17 @@ static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = {
|
|||
MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
|
||||
};
|
||||
|
||||
static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
|
||||
/* Error */
|
||||
MOCS_ENTRY(0, 0, L3_3_WB),
|
||||
|
||||
/* UC */
|
||||
MOCS_ENTRY(1, 0, L3_1_UC),
|
||||
|
||||
/* WB */
|
||||
MOCS_ENTRY(2, 0, L3_3_WB),
|
||||
};
|
||||
|
||||
enum {
|
||||
HAS_GLOBAL_MOCS = BIT(0),
|
||||
HAS_ENGINE_MOCS = BIT(1),
|
||||
|
@ -423,7 +436,14 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
|
|||
memset(table, 0, sizeof(struct drm_i915_mocs_table));
|
||||
|
||||
table->unused_entries_index = I915_MOCS_PTE;
|
||||
if (IS_DG2(i915)) {
|
||||
if (IS_PONTEVECCHIO(i915)) {
|
||||
table->size = ARRAY_SIZE(pvc_mocs_table);
|
||||
table->table = pvc_mocs_table;
|
||||
table->n_entries = PVC_NUM_MOCS_ENTRIES;
|
||||
table->uc_index = 1;
|
||||
table->wb_index = 2;
|
||||
table->unused_entries_index = 2;
|
||||
} else if (IS_DG2(i915)) {
|
||||
if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
|
||||
table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax);
|
||||
table->table = dg2_mocs_table_g10_ax;
|
||||
|
@ -622,6 +642,8 @@ void intel_set_mocs_index(struct intel_gt *gt)
|
|||
|
||||
get_mocs_settings(gt->i915, &table);
|
||||
gt->mocs.uc_index = table.uc_index;
|
||||
if (HAS_L3_CCS_READ(gt->i915))
|
||||
gt->mocs.wb_index = table.wb_index;
|
||||
}
|
||||
|
||||
void intel_mocs_init(struct intel_gt *gt)
|
||||
|
|
|
@ -12,6 +12,7 @@
|
|||
#include "gem/i915_gem_region.h"
|
||||
#include "gem/i915_gem_ttm.h"
|
||||
#include "gt/intel_gt.h"
|
||||
#include "gt/intel_gt_mcr.h"
|
||||
#include "gt/intel_gt_regs.h"
|
||||
|
||||
static int
|
||||
|
@ -101,14 +102,24 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
|
|||
return ERR_PTR(-ENODEV);
|
||||
|
||||
if (HAS_FLAT_CCS(i915)) {
|
||||
resource_size_t lmem_range;
|
||||
u64 tile_stolen, flat_ccs_base;
|
||||
|
||||
lmem_size = pci_resource_len(pdev, 2);
|
||||
flat_ccs_base = intel_gt_read_register(gt, XEHPSDV_FLAT_CCS_BASE_ADDR);
|
||||
flat_ccs_base = (flat_ccs_base >> XEHPSDV_CCS_BASE_SHIFT) * SZ_64K;
|
||||
lmem_range = intel_gt_mcr_read_any(&i915->gt0, XEHP_TILE0_ADDR_RANGE) & 0xFFFF;
|
||||
lmem_size = lmem_range >> XEHP_TILE_LMEM_RANGE_SHIFT;
|
||||
lmem_size *= SZ_1G;
|
||||
|
||||
flat_ccs_base = intel_gt_mcr_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR);
|
||||
flat_ccs_base = (flat_ccs_base >> XEHP_CCS_BASE_SHIFT) * SZ_64K;
|
||||
|
||||
/* FIXME: Remove this when we have small-bar enabled */
|
||||
if (pci_resource_len(pdev, 2) < lmem_size) {
|
||||
drm_err(&i915->drm, "System requires small-BAR support, which is currently unsupported on this kernel\n");
|
||||
return ERR_PTR(-EINVAL);
|
||||
}
|
||||
|
||||
if (GEM_WARN_ON(lmem_size < flat_ccs_base))
|
||||
return ERR_PTR(-ENODEV);
|
||||
return ERR_PTR(-EIO);
|
||||
|
||||
tile_stolen = lmem_size - flat_ccs_base;
|
||||
|
||||
|
@ -131,7 +142,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
|
|||
io_start = pci_resource_start(pdev, 2);
|
||||
io_size = min(pci_resource_len(pdev, 2), lmem_size);
|
||||
if (!io_size)
|
||||
return ERR_PTR(-ENODEV);
|
||||
return ERR_PTR(-EIO);
|
||||
|
||||
min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K :
|
||||
I915_GTT_PAGE_SIZE_4K;
|
||||
|
|
|
@ -117,7 +117,9 @@ static void flush_cs_tlb(struct intel_engine_cs *engine)
|
|||
return;
|
||||
|
||||
/* ring should be idle before issuing a sync flush*/
|
||||
GEM_DEBUG_WARN_ON((ENGINE_READ(engine, RING_MI_MODE) & MODE_IDLE) == 0);
|
||||
if ((ENGINE_READ(engine, RING_MI_MODE) & MODE_IDLE) == 0)
|
||||
drm_warn(&engine->i915->drm, "%s not idle before sync flush!\n",
|
||||
engine->name);
|
||||
|
||||
ENGINE_WRITE_FW(engine, RING_INSTPM,
|
||||
_MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
|
||||
|
@ -596,8 +598,9 @@ static void ring_context_reset(struct intel_context *ce)
|
|||
clear_bit(CONTEXT_VALID_BIT, &ce->flags);
|
||||
}
|
||||
|
||||
static void ring_context_ban(struct intel_context *ce,
|
||||
struct i915_request *rq)
|
||||
static void ring_context_revoke(struct intel_context *ce,
|
||||
struct i915_request *rq,
|
||||
unsigned int preempt_timeout_ms)
|
||||
{
|
||||
struct intel_engine_cs *engine;
|
||||
|
||||
|
@ -632,7 +635,7 @@ static const struct intel_context_ops ring_context_ops = {
|
|||
|
||||
.cancel_request = ring_context_cancel_request,
|
||||
|
||||
.ban = ring_context_ban,
|
||||
.revoke = ring_context_revoke,
|
||||
|
||||
.pre_pin = ring_context_pre_pin,
|
||||
.pin = ring_context_pin,
|
||||
|
|
|
@ -1075,7 +1075,9 @@ static u32 intel_rps_read_state_cap(struct intel_rps *rps)
|
|||
struct drm_i915_private *i915 = rps_to_i915(rps);
|
||||
struct intel_uncore *uncore = rps_to_uncore(rps);
|
||||
|
||||
if (IS_XEHPSDV(i915))
|
||||
if (IS_PONTEVECCHIO(i915))
|
||||
return intel_uncore_read(uncore, PVC_RP_STATE_CAP);
|
||||
else if (IS_XEHPSDV(i915))
|
||||
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
|
||||
else if (IS_GEN9_LP(i915))
|
||||
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
|
||||
|
|
|
@ -16,11 +16,6 @@ void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
|
|||
sseu->max_slices = max_slices;
|
||||
sseu->max_subslices = max_subslices;
|
||||
sseu->max_eus_per_subslice = max_eus_per_subslice;
|
||||
|
||||
sseu->ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
|
||||
GEM_BUG_ON(sseu->ss_stride > GEN_MAX_SUBSLICE_STRIDE);
|
||||
sseu->eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
|
||||
GEM_BUG_ON(sseu->eu_stride > GEN_MAX_EU_STRIDE);
|
||||
}
|
||||
|
||||
unsigned int
|
||||
|
@ -28,152 +23,240 @@ intel_sseu_subslice_total(const struct sseu_dev_info *sseu)
|
|||
{
|
||||
unsigned int i, total = 0;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask); i++)
|
||||
total += hweight8(sseu->subslice_mask[i]);
|
||||
if (sseu->has_xehp_dss)
|
||||
return bitmap_weight(sseu->subslice_mask.xehp,
|
||||
XEHP_BITMAP_BITS(sseu->subslice_mask));
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(sseu->subslice_mask.hsw); i++)
|
||||
total += hweight8(sseu->subslice_mask.hsw[i]);
|
||||
|
||||
return total;
|
||||
}
|
||||
|
||||
static u32
|
||||
sseu_get_subslices(const struct sseu_dev_info *sseu,
|
||||
const u8 *subslice_mask, u8 slice)
|
||||
{
|
||||
int i, offset = slice * sseu->ss_stride;
|
||||
u32 mask = 0;
|
||||
|
||||
GEM_BUG_ON(slice >= sseu->max_slices);
|
||||
|
||||
for (i = 0; i < sseu->ss_stride; i++)
|
||||
mask |= (u32)subslice_mask[offset + i] << i * BITS_PER_BYTE;
|
||||
|
||||
return mask;
|
||||
}
|
||||
|
||||
u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
|
||||
{
|
||||
return sseu_get_subslices(sseu, sseu->subslice_mask, slice);
|
||||
}
|
||||
|
||||
static u32 sseu_get_geometry_subslices(const struct sseu_dev_info *sseu)
|
||||
{
|
||||
return sseu_get_subslices(sseu, sseu->geometry_subslice_mask, 0);
|
||||
}
|
||||
|
||||
u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu)
|
||||
{
|
||||
return sseu_get_subslices(sseu, sseu->compute_subslice_mask, 0);
|
||||
}
|
||||
|
||||
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
|
||||
u8 *subslice_mask, u32 ss_mask)
|
||||
{
|
||||
int offset = slice * sseu->ss_stride;
|
||||
|
||||
memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
|
||||
}
|
||||
|
||||
unsigned int
|
||||
intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice)
|
||||
intel_sseu_get_hsw_subslices(const struct sseu_dev_info *sseu, u8 slice)
|
||||
{
|
||||
return hweight32(intel_sseu_get_subslices(sseu, slice));
|
||||
}
|
||||
WARN_ON(sseu->has_xehp_dss);
|
||||
if (WARN_ON(slice >= sseu->max_slices))
|
||||
return 0;
|
||||
|
||||
static int sseu_eu_idx(const struct sseu_dev_info *sseu, int slice,
|
||||
int subslice)
|
||||
{
|
||||
int slice_stride = sseu->max_subslices * sseu->eu_stride;
|
||||
|
||||
return slice * slice_stride + subslice * sseu->eu_stride;
|
||||
return sseu->subslice_mask.hsw[slice];
|
||||
}
|
||||
|
||||
static u16 sseu_get_eus(const struct sseu_dev_info *sseu, int slice,
|
||||
int subslice)
|
||||
{
|
||||
int i, offset = sseu_eu_idx(sseu, slice, subslice);
|
||||
u16 eu_mask = 0;
|
||||
|
||||
for (i = 0; i < sseu->eu_stride; i++)
|
||||
eu_mask |=
|
||||
((u16)sseu->eu_mask[offset + i]) << (i * BITS_PER_BYTE);
|
||||
|
||||
return eu_mask;
|
||||
if (sseu->has_xehp_dss) {
|
||||
WARN_ON(slice > 0);
|
||||
return sseu->eu_mask.xehp[subslice];
|
||||
} else {
|
||||
return sseu->eu_mask.hsw[slice][subslice];
|
||||
}
|
||||
}
|
||||
|
||||
static void sseu_set_eus(struct sseu_dev_info *sseu, int slice, int subslice,
|
||||
u16 eu_mask)
|
||||
{
|
||||
int i, offset = sseu_eu_idx(sseu, slice, subslice);
|
||||
|
||||
for (i = 0; i < sseu->eu_stride; i++)
|
||||
sseu->eu_mask[offset + i] =
|
||||
(eu_mask >> (BITS_PER_BYTE * i)) & 0xff;
|
||||
GEM_WARN_ON(eu_mask && __fls(eu_mask) >= sseu->max_eus_per_subslice);
|
||||
if (sseu->has_xehp_dss) {
|
||||
GEM_WARN_ON(slice > 0);
|
||||
sseu->eu_mask.xehp[subslice] = eu_mask;
|
||||
} else {
|
||||
sseu->eu_mask.hsw[slice][subslice] = eu_mask;
|
||||
}
|
||||
}
|
||||
|
||||
static u16 compute_eu_total(const struct sseu_dev_info *sseu)
|
||||
{
|
||||
u16 i, total = 0;
|
||||
int s, ss, total = 0;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(sseu->eu_mask); i++)
|
||||
total += hweight8(sseu->eu_mask[i]);
|
||||
for (s = 0; s < sseu->max_slices; s++)
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++)
|
||||
if (sseu->has_xehp_dss)
|
||||
total += hweight16(sseu->eu_mask.xehp[ss]);
|
||||
else
|
||||
total += hweight16(sseu->eu_mask.hsw[s][ss]);
|
||||
|
||||
return total;
|
||||
}
|
||||
|
||||
static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
|
||||
/**
|
||||
* intel_sseu_copy_eumask_to_user - Copy EU mask into a userspace buffer
|
||||
* @to: Pointer to userspace buffer to copy to
|
||||
* @sseu: SSEU structure containing EU mask to copy
|
||||
*
|
||||
* Copies the EU mask to a userspace buffer in the format expected by
|
||||
* the query ioctl's topology queries.
|
||||
*
|
||||
* Returns the result of the copy_to_user() operation.
|
||||
*/
|
||||
int intel_sseu_copy_eumask_to_user(void __user *to,
|
||||
const struct sseu_dev_info *sseu)
|
||||
{
|
||||
u32 ss_mask;
|
||||
|
||||
ss_mask = ss_en >> (s * sseu->max_subslices);
|
||||
ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
|
||||
|
||||
return ss_mask;
|
||||
}
|
||||
|
||||
static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
|
||||
u32 g_ss_en, u32 c_ss_en, u16 eu_en)
|
||||
{
|
||||
int s, ss;
|
||||
|
||||
/* g_ss_en/c_ss_en represent entire subslice mask across all slices */
|
||||
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
|
||||
sizeof(g_ss_en) * BITS_PER_BYTE);
|
||||
u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE] = {};
|
||||
int eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
|
||||
int len = sseu->max_slices * sseu->max_subslices * eu_stride;
|
||||
int s, ss, i;
|
||||
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
if ((s_en & BIT(s)) == 0)
|
||||
continue;
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
int uapi_offset =
|
||||
s * sseu->max_subslices * eu_stride +
|
||||
ss * eu_stride;
|
||||
u16 mask = sseu_get_eus(sseu, s, ss);
|
||||
|
||||
sseu->slice_mask |= BIT(s);
|
||||
|
||||
/*
|
||||
* XeHP introduces the concept of compute vs geometry DSS. To
|
||||
* reduce variation between GENs around subslice usage, store a
|
||||
* mask for both the geometry and compute enabled masks since
|
||||
* userspace will need to be able to query these masks
|
||||
* independently. Also compute a total enabled subslice count
|
||||
* for the purposes of selecting subslices to use in a
|
||||
* particular GEM context.
|
||||
*/
|
||||
intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
|
||||
get_ss_stride_mask(sseu, s, c_ss_en));
|
||||
intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
|
||||
get_ss_stride_mask(sseu, s, g_ss_en));
|
||||
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
|
||||
get_ss_stride_mask(sseu, s,
|
||||
g_ss_en | c_ss_en));
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++)
|
||||
if (intel_sseu_has_subslice(sseu, s, ss))
|
||||
sseu_set_eus(sseu, s, ss, eu_en);
|
||||
for (i = 0; i < eu_stride; i++)
|
||||
eu_mask[uapi_offset + i] =
|
||||
(mask >> (BITS_PER_BYTE * i)) & 0xff;
|
||||
}
|
||||
}
|
||||
|
||||
return copy_to_user(to, eu_mask, len);
|
||||
}
|
||||
|
||||
/**
|
||||
* intel_sseu_copy_ssmask_to_user - Copy subslice mask into a userspace buffer
|
||||
* @to: Pointer to userspace buffer to copy to
|
||||
* @sseu: SSEU structure containing subslice mask to copy
|
||||
*
|
||||
* Copies the subslice mask to a userspace buffer in the format expected by
|
||||
* the query ioctl's topology queries.
|
||||
*
|
||||
* Returns the result of the copy_to_user() operation.
|
||||
*/
|
||||
int intel_sseu_copy_ssmask_to_user(void __user *to,
|
||||
const struct sseu_dev_info *sseu)
|
||||
{
|
||||
u8 ss_mask[GEN_SS_MASK_SIZE] = {};
|
||||
int ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
|
||||
int len = sseu->max_slices * ss_stride;
|
||||
int s, ss, i;
|
||||
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
i = s * ss_stride * BITS_PER_BYTE + ss;
|
||||
|
||||
if (!intel_sseu_has_subslice(sseu, s, ss))
|
||||
continue;
|
||||
|
||||
ss_mask[i / BITS_PER_BYTE] |= BIT(i % BITS_PER_BYTE);
|
||||
}
|
||||
}
|
||||
|
||||
return copy_to_user(to, ss_mask, len);
|
||||
}
|
||||
|
||||
static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
|
||||
u32 ss_en, u16 eu_en)
|
||||
{
|
||||
u32 valid_ss_mask = GENMASK(sseu->max_subslices - 1, 0);
|
||||
int ss;
|
||||
|
||||
sseu->slice_mask |= BIT(0);
|
||||
sseu->subslice_mask.hsw[0] = ss_en & valid_ss_mask;
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++)
|
||||
if (intel_sseu_has_subslice(sseu, 0, ss))
|
||||
sseu_set_eus(sseu, 0, ss, eu_en);
|
||||
|
||||
sseu->eu_per_subslice = hweight16(eu_en);
|
||||
sseu->eu_total = compute_eu_total(sseu);
|
||||
}
|
||||
|
||||
static void xehp_compute_sseu_info(struct sseu_dev_info *sseu,
|
||||
u16 eu_en)
|
||||
{
|
||||
int ss;
|
||||
|
||||
sseu->slice_mask |= BIT(0);
|
||||
|
||||
bitmap_or(sseu->subslice_mask.xehp,
|
||||
sseu->compute_subslice_mask.xehp,
|
||||
sseu->geometry_subslice_mask.xehp,
|
||||
XEHP_BITMAP_BITS(sseu->subslice_mask));
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++)
|
||||
if (intel_sseu_has_subslice(sseu, 0, ss))
|
||||
sseu_set_eus(sseu, 0, ss, eu_en);
|
||||
|
||||
sseu->eu_per_subslice = hweight16(eu_en);
|
||||
sseu->eu_total = compute_eu_total(sseu);
|
||||
}
|
||||
|
||||
static void
|
||||
xehp_load_dss_mask(struct intel_uncore *uncore,
|
||||
intel_sseu_ss_mask_t *ssmask,
|
||||
int numregs,
|
||||
...)
|
||||
{
|
||||
va_list argp;
|
||||
u32 fuse_val[I915_MAX_SS_FUSE_REGS] = {};
|
||||
int i;
|
||||
|
||||
if (WARN_ON(numregs > I915_MAX_SS_FUSE_REGS))
|
||||
numregs = I915_MAX_SS_FUSE_REGS;
|
||||
|
||||
va_start(argp, numregs);
|
||||
for (i = 0; i < numregs; i++)
|
||||
fuse_val[i] = intel_uncore_read(uncore, va_arg(argp, i915_reg_t));
|
||||
va_end(argp);
|
||||
|
||||
bitmap_from_arr32(ssmask->xehp, fuse_val, numregs * 32);
|
||||
}
|
||||
|
||||
static void xehp_sseu_info_init(struct intel_gt *gt)
|
||||
{
|
||||
struct sseu_dev_info *sseu = >->info.sseu;
|
||||
struct intel_uncore *uncore = gt->uncore;
|
||||
u16 eu_en = 0;
|
||||
u8 eu_en_fuse;
|
||||
int num_compute_regs, num_geometry_regs;
|
||||
int eu;
|
||||
|
||||
if (IS_PONTEVECCHIO(gt->i915)) {
|
||||
num_geometry_regs = 0;
|
||||
num_compute_regs = 2;
|
||||
} else {
|
||||
num_geometry_regs = 1;
|
||||
num_compute_regs = 1;
|
||||
}
|
||||
|
||||
/*
|
||||
* The concept of slice has been removed in Xe_HP. To be compatible
|
||||
* with prior generations, assume a single slice across the entire
|
||||
* device. Then calculate out the DSS for each workload type within
|
||||
* that software slice.
|
||||
*/
|
||||
intel_sseu_set_info(sseu, 1,
|
||||
32 * max(num_geometry_regs, num_compute_regs),
|
||||
HAS_ONE_EU_PER_FUSE_BIT(gt->i915) ? 8 : 16);
|
||||
sseu->has_xehp_dss = 1;
|
||||
|
||||
xehp_load_dss_mask(uncore, &sseu->geometry_subslice_mask,
|
||||
num_geometry_regs,
|
||||
GEN12_GT_GEOMETRY_DSS_ENABLE);
|
||||
xehp_load_dss_mask(uncore, &sseu->compute_subslice_mask,
|
||||
num_compute_regs,
|
||||
GEN12_GT_COMPUTE_DSS_ENABLE,
|
||||
XEHPC_GT_COMPUTE_DSS_ENABLE_EXT);
|
||||
|
||||
eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK;
|
||||
|
||||
if (HAS_ONE_EU_PER_FUSE_BIT(gt->i915))
|
||||
eu_en = eu_en_fuse;
|
||||
else
|
||||
for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
|
||||
if (eu_en_fuse & BIT(eu))
|
||||
eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
|
||||
|
||||
xehp_compute_sseu_info(sseu, eu_en);
|
||||
}
|
||||
|
||||
static void gen12_sseu_info_init(struct intel_gt *gt)
|
||||
{
|
||||
struct sseu_dev_info *sseu = >->info.sseu;
|
||||
struct intel_uncore *uncore = gt->uncore;
|
||||
u32 g_dss_en, c_dss_en = 0;
|
||||
u32 g_dss_en;
|
||||
u16 eu_en = 0;
|
||||
u8 eu_en_fuse;
|
||||
u8 s_en;
|
||||
|
@ -183,43 +266,28 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
|
|||
* Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS.
|
||||
* Instead of splitting these, provide userspace with an array
|
||||
* of DSS to more closely represent the hardware resource.
|
||||
*
|
||||
* In addition, the concept of slice has been removed in Xe_HP.
|
||||
* To be compatible with prior generations, assume a single slice
|
||||
* across the entire device. Then calculate out the DSS for each
|
||||
* workload type within that software slice.
|
||||
*/
|
||||
if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
|
||||
intel_sseu_set_info(sseu, 1, 32, 16);
|
||||
else
|
||||
intel_sseu_set_info(sseu, 1, 6, 16);
|
||||
intel_sseu_set_info(sseu, 1, 6, 16);
|
||||
|
||||
/*
|
||||
* As mentioned above, Xe_HP does not have the concept of a slice.
|
||||
* Enable one for software backwards compatibility.
|
||||
* Although gen12 architecture supported multiple slices, TGL, RKL,
|
||||
* DG1, and ADL only had a single slice.
|
||||
*/
|
||||
if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
|
||||
s_en = 0x1;
|
||||
else
|
||||
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
|
||||
GEN11_GT_S_ENA_MASK;
|
||||
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
|
||||
GEN11_GT_S_ENA_MASK;
|
||||
drm_WARN_ON(>->i915->drm, s_en != 0x1);
|
||||
|
||||
g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE);
|
||||
if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
|
||||
c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE);
|
||||
|
||||
/* one bit per pair of EUs */
|
||||
if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
|
||||
eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK;
|
||||
else
|
||||
eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
|
||||
GEN11_EU_DIS_MASK);
|
||||
eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
|
||||
GEN11_EU_DIS_MASK);
|
||||
|
||||
for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
|
||||
if (eu_en_fuse & BIT(eu))
|
||||
eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
|
||||
|
||||
gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en);
|
||||
gen11_compute_sseu_info(sseu, g_dss_en, eu_en);
|
||||
|
||||
/* TGL only supports slice-level power gating */
|
||||
sseu->has_slice_pg = 1;
|
||||
|
@ -238,14 +306,20 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
|
|||
else
|
||||
intel_sseu_set_info(sseu, 1, 8, 8);
|
||||
|
||||
/*
|
||||
* Although gen11 architecture supported multiple slices, ICL and
|
||||
* EHL/JSL only had a single slice in practice.
|
||||
*/
|
||||
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
|
||||
GEN11_GT_S_ENA_MASK;
|
||||
drm_WARN_ON(>->i915->drm, s_en != 0x1);
|
||||
|
||||
ss_en = ~intel_uncore_read(uncore, GEN11_GT_SUBSLICE_DISABLE);
|
||||
|
||||
eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
|
||||
GEN11_EU_DIS_MASK);
|
||||
|
||||
gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en);
|
||||
gen11_compute_sseu_info(sseu, ss_en, eu_en);
|
||||
|
||||
/* ICL has no power gating restrictions. */
|
||||
sseu->has_slice_pg = 1;
|
||||
|
@ -257,7 +331,6 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
|
|||
{
|
||||
struct sseu_dev_info *sseu = >->info.sseu;
|
||||
u32 fuse;
|
||||
u8 subslice_mask = 0;
|
||||
|
||||
fuse = intel_uncore_read(gt->uncore, CHV_FUSE_GT);
|
||||
|
||||
|
@ -271,8 +344,8 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
|
|||
(((fuse & CHV_FGT_EU_DIS_SS0_R1_MASK) >>
|
||||
CHV_FGT_EU_DIS_SS0_R1_SHIFT) << 4);
|
||||
|
||||
subslice_mask |= BIT(0);
|
||||
sseu_set_eus(sseu, 0, 0, ~disabled_mask);
|
||||
sseu->subslice_mask.hsw[0] |= BIT(0);
|
||||
sseu_set_eus(sseu, 0, 0, ~disabled_mask & 0xFF);
|
||||
}
|
||||
|
||||
if (!(fuse & CHV_FGT_DISABLE_SS1)) {
|
||||
|
@ -282,12 +355,10 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
|
|||
(((fuse & CHV_FGT_EU_DIS_SS1_R1_MASK) >>
|
||||
CHV_FGT_EU_DIS_SS1_R1_SHIFT) << 4);
|
||||
|
||||
subslice_mask |= BIT(1);
|
||||
sseu_set_eus(sseu, 0, 1, ~disabled_mask);
|
||||
sseu->subslice_mask.hsw[0] |= BIT(1);
|
||||
sseu_set_eus(sseu, 0, 1, ~disabled_mask & 0xFF);
|
||||
}
|
||||
|
||||
intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask);
|
||||
|
||||
sseu->eu_total = compute_eu_total(sseu);
|
||||
|
||||
/*
|
||||
|
@ -342,8 +413,7 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
|
|||
/* skip disabled slice */
|
||||
continue;
|
||||
|
||||
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
|
||||
subslice_mask);
|
||||
sseu->subslice_mask.hsw[s] = subslice_mask;
|
||||
|
||||
eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s));
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
|
@ -356,7 +426,7 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
|
|||
|
||||
eu_disabled_mask = (eu_disable >> (ss * 8)) & eu_mask;
|
||||
|
||||
sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
|
||||
sseu_set_eus(sseu, s, ss, ~eu_disabled_mask & eu_mask);
|
||||
|
||||
eu_per_ss = sseu->max_eus_per_subslice -
|
||||
hweight8(eu_disabled_mask);
|
||||
|
@ -400,8 +470,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
|
|||
sseu->has_eu_pg = sseu->eu_per_subslice > 2;
|
||||
|
||||
if (IS_GEN9_LP(i915)) {
|
||||
#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask[0] & BIT(ss)))
|
||||
info->has_pooled_eu = hweight8(sseu->subslice_mask[0]) == 3;
|
||||
#define IS_SS_DISABLED(ss) (!(sseu->subslice_mask.hsw[0] & BIT(ss)))
|
||||
info->has_pooled_eu = hweight8(sseu->subslice_mask.hsw[0]) == 3;
|
||||
|
||||
sseu->min_eu_in_pool = 0;
|
||||
if (info->has_pooled_eu) {
|
||||
|
@ -455,8 +525,7 @@ static void bdw_sseu_info_init(struct intel_gt *gt)
|
|||
/* skip disabled slice */
|
||||
continue;
|
||||
|
||||
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
|
||||
subslice_mask);
|
||||
sseu->subslice_mask.hsw[s] = subslice_mask;
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
u8 eu_disabled_mask;
|
||||
|
@ -469,7 +538,7 @@ static void bdw_sseu_info_init(struct intel_gt *gt)
|
|||
eu_disabled_mask =
|
||||
eu_disable[s] >> (ss * sseu->max_eus_per_subslice);
|
||||
|
||||
sseu_set_eus(sseu, s, ss, ~eu_disabled_mask);
|
||||
sseu_set_eus(sseu, s, ss, ~eu_disabled_mask & 0xFF);
|
||||
|
||||
n_disabled = hweight8(eu_disabled_mask);
|
||||
|
||||
|
@ -553,8 +622,7 @@ static void hsw_sseu_info_init(struct intel_gt *gt)
|
|||
sseu->eu_per_subslice);
|
||||
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
|
||||
subslice_mask);
|
||||
sseu->subslice_mask.hsw[s] = subslice_mask;
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
sseu_set_eus(sseu, s, ss,
|
||||
|
@ -574,18 +642,20 @@ void intel_sseu_info_init(struct intel_gt *gt)
|
|||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
|
||||
if (IS_HASWELL(i915))
|
||||
hsw_sseu_info_init(gt);
|
||||
else if (IS_CHERRYVIEW(i915))
|
||||
cherryview_sseu_info_init(gt);
|
||||
else if (IS_BROADWELL(i915))
|
||||
bdw_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) == 9)
|
||||
gen9_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) == 11)
|
||||
gen11_sseu_info_init(gt);
|
||||
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
|
||||
xehp_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) >= 12)
|
||||
gen12_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) >= 11)
|
||||
gen11_sseu_info_init(gt);
|
||||
else if (GRAPHICS_VER(i915) >= 9)
|
||||
gen9_sseu_info_init(gt);
|
||||
else if (IS_BROADWELL(i915))
|
||||
bdw_sseu_info_init(gt);
|
||||
else if (IS_CHERRYVIEW(i915))
|
||||
cherryview_sseu_info_init(gt);
|
||||
else if (IS_HASWELL(i915))
|
||||
hsw_sseu_info_init(gt);
|
||||
}
|
||||
|
||||
u32 intel_sseu_make_rpcs(struct intel_gt *gt,
|
||||
|
@ -641,7 +711,7 @@ u32 intel_sseu_make_rpcs(struct intel_gt *gt,
|
|||
*/
|
||||
if (GRAPHICS_VER(i915) == 11 &&
|
||||
slices == 1 &&
|
||||
subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {
|
||||
subslices > min_t(u8, 4, hweight8(sseu->subslice_mask.hsw[0]) / 2)) {
|
||||
GEM_BUG_ON(subslices & 1);
|
||||
|
||||
subslice_pg = false;
|
||||
|
@ -707,14 +777,29 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
|
|||
{
|
||||
int s;
|
||||
|
||||
drm_printf(p, "slice total: %u, mask=%04x\n",
|
||||
hweight8(sseu->slice_mask), sseu->slice_mask);
|
||||
drm_printf(p, "subslice total: %u\n", intel_sseu_subslice_total(sseu));
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
drm_printf(p, "slice%d: %u subslices, mask=%08x\n",
|
||||
s, intel_sseu_subslices_per_slice(sseu, s),
|
||||
intel_sseu_get_subslices(sseu, s));
|
||||
if (sseu->has_xehp_dss) {
|
||||
drm_printf(p, "subslice total: %u\n",
|
||||
intel_sseu_subslice_total(sseu));
|
||||
drm_printf(p, "geometry dss mask=%*pb\n",
|
||||
XEHP_BITMAP_BITS(sseu->geometry_subslice_mask),
|
||||
sseu->geometry_subslice_mask.xehp);
|
||||
drm_printf(p, "compute dss mask=%*pb\n",
|
||||
XEHP_BITMAP_BITS(sseu->compute_subslice_mask),
|
||||
sseu->compute_subslice_mask.xehp);
|
||||
} else {
|
||||
drm_printf(p, "slice total: %u, mask=%04x\n",
|
||||
hweight8(sseu->slice_mask), sseu->slice_mask);
|
||||
drm_printf(p, "subslice total: %u\n",
|
||||
intel_sseu_subslice_total(sseu));
|
||||
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
u8 ss_mask = sseu->subslice_mask.hsw[s];
|
||||
|
||||
drm_printf(p, "slice%d: %u subslices, mask=%08x\n",
|
||||
s, hweight8(ss_mask), ss_mask);
|
||||
}
|
||||
}
|
||||
|
||||
drm_printf(p, "EU total: %u\n", sseu->eu_total);
|
||||
drm_printf(p, "EU per subslice: %u\n", sseu->eu_per_subslice);
|
||||
drm_printf(p, "has slice power gating: %s\n",
|
||||
|
@ -731,9 +816,10 @@ static void sseu_print_hsw_topology(const struct sseu_dev_info *sseu,
|
|||
int s, ss;
|
||||
|
||||
for (s = 0; s < sseu->max_slices; s++) {
|
||||
u8 ss_mask = sseu->subslice_mask.hsw[s];
|
||||
|
||||
drm_printf(p, "slice%d: %u subslice(s) (0x%08x):\n",
|
||||
s, intel_sseu_subslices_per_slice(sseu, s),
|
||||
intel_sseu_get_subslices(sseu, s));
|
||||
s, hweight8(ss_mask), ss_mask);
|
||||
|
||||
for (ss = 0; ss < sseu->max_subslices; ss++) {
|
||||
u16 enabled_eus = sseu_get_eus(sseu, s, ss);
|
||||
|
@ -747,16 +833,14 @@ static void sseu_print_hsw_topology(const struct sseu_dev_info *sseu,
|
|||
static void sseu_print_xehp_topology(const struct sseu_dev_info *sseu,
|
||||
struct drm_printer *p)
|
||||
{
|
||||
u32 g_dss_mask = sseu_get_geometry_subslices(sseu);
|
||||
u32 c_dss_mask = intel_sseu_get_compute_subslices(sseu);
|
||||
int dss;
|
||||
|
||||
for (dss = 0; dss < sseu->max_subslices; dss++) {
|
||||
u16 enabled_eus = sseu_get_eus(sseu, 0, dss);
|
||||
|
||||
drm_printf(p, "DSS_%02d: G:%3s C:%3s, %2u EUs (0x%04hx)\n", dss,
|
||||
str_yes_no(g_dss_mask & BIT(dss)),
|
||||
str_yes_no(c_dss_mask & BIT(dss)),
|
||||
str_yes_no(test_bit(dss, sseu->geometry_subslice_mask.xehp)),
|
||||
str_yes_no(test_bit(dss, sseu->compute_subslice_mask.xehp)),
|
||||
hweight16(enabled_eus), enabled_eus);
|
||||
}
|
||||
}
|
||||
|
@ -774,20 +858,44 @@ void intel_sseu_print_topology(struct drm_i915_private *i915,
|
|||
}
|
||||
}
|
||||
|
||||
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
|
||||
void intel_sseu_print_ss_info(const char *type,
|
||||
const struct sseu_dev_info *sseu,
|
||||
struct seq_file *m)
|
||||
{
|
||||
u16 slice_mask = 0;
|
||||
int s;
|
||||
|
||||
if (sseu->has_xehp_dss) {
|
||||
seq_printf(m, " %s Geometry DSS: %u\n", type,
|
||||
bitmap_weight(sseu->geometry_subslice_mask.xehp,
|
||||
XEHP_BITMAP_BITS(sseu->geometry_subslice_mask)));
|
||||
seq_printf(m, " %s Compute DSS: %u\n", type,
|
||||
bitmap_weight(sseu->compute_subslice_mask.xehp,
|
||||
XEHP_BITMAP_BITS(sseu->compute_subslice_mask)));
|
||||
} else {
|
||||
for (s = 0; s < fls(sseu->slice_mask); s++)
|
||||
seq_printf(m, " %s Slice%i subslices: %u\n", type,
|
||||
s, hweight8(sseu->subslice_mask.hsw[s]));
|
||||
}
|
||||
}
|
||||
|
||||
u16 intel_slicemask_from_xehp_dssmask(intel_sseu_ss_mask_t dss_mask,
|
||||
int dss_per_slice)
|
||||
{
|
||||
intel_sseu_ss_mask_t per_slice_mask = {};
|
||||
unsigned long slice_mask = 0;
|
||||
int i;
|
||||
|
||||
WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask));
|
||||
WARN_ON(DIV_ROUND_UP(XEHP_BITMAP_BITS(dss_mask), dss_per_slice) >
|
||||
8 * sizeof(slice_mask));
|
||||
|
||||
for (i = 0; dss_mask; i++) {
|
||||
if (dss_mask & GENMASK(dss_per_slice - 1, 0))
|
||||
bitmap_fill(per_slice_mask.xehp, dss_per_slice);
|
||||
for (i = 0; !bitmap_empty(dss_mask.xehp, XEHP_BITMAP_BITS(dss_mask)); i++) {
|
||||
if (bitmap_intersects(dss_mask.xehp, per_slice_mask.xehp, dss_per_slice))
|
||||
slice_mask |= BIT(i);
|
||||
|
||||
dss_mask >>= dss_per_slice;
|
||||
bitmap_shift_right(dss_mask.xehp, dss_mask.xehp, dss_per_slice,
|
||||
XEHP_BITMAP_BITS(dss_mask));
|
||||
}
|
||||
|
||||
return slice_mask;
|
||||
}
|
||||
|
||||
|
|
|
@ -25,12 +25,16 @@ struct drm_printer;
|
|||
/*
|
||||
* Maximum number of subslices that can exist within a HSW-style slice. This
|
||||
* is only relevant to pre-Xe_HP platforms (Xe_HP and beyond use the
|
||||
* GEN_MAX_DSS value below).
|
||||
* I915_MAX_SS_FUSE_BITS value below).
|
||||
*/
|
||||
#define GEN_MAX_SS_PER_HSW_SLICE 6
|
||||
|
||||
/* Maximum number of DSS on newer platforms (Xe_HP and beyond). */
|
||||
#define GEN_MAX_DSS 32
|
||||
/*
|
||||
* Maximum number of 32-bit registers used by hardware to express the
|
||||
* enabled/disabled subslices.
|
||||
*/
|
||||
#define I915_MAX_SS_FUSE_REGS 2
|
||||
#define I915_MAX_SS_FUSE_BITS (I915_MAX_SS_FUSE_REGS * 32)
|
||||
|
||||
/* Maximum number of EUs that can exist within a subslice or DSS. */
|
||||
#define GEN_MAX_EUS_PER_SS 16
|
||||
|
@ -38,7 +42,7 @@ struct drm_printer;
|
|||
#define SSEU_MAX(a, b) ((a) > (b) ? (a) : (b))
|
||||
|
||||
/* The maximum number of bits needed to express each subslice/DSS independently */
|
||||
#define GEN_SS_MASK_SIZE SSEU_MAX(GEN_MAX_DSS, \
|
||||
#define GEN_SS_MASK_SIZE SSEU_MAX(I915_MAX_SS_FUSE_BITS, \
|
||||
GEN_MAX_HSW_SLICES * GEN_MAX_SS_PER_HSW_SLICE)
|
||||
|
||||
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
|
||||
|
@ -49,15 +53,28 @@ struct drm_printer;
|
|||
#define GEN_DSS_PER_CSLICE 8
|
||||
#define GEN_DSS_PER_MSLICE 8
|
||||
|
||||
#define GEN_MAX_GSLICES (GEN_MAX_DSS / GEN_DSS_PER_GSLICE)
|
||||
#define GEN_MAX_CSLICES (GEN_MAX_DSS / GEN_DSS_PER_CSLICE)
|
||||
#define GEN_MAX_GSLICES (I915_MAX_SS_FUSE_BITS / GEN_DSS_PER_GSLICE)
|
||||
#define GEN_MAX_CSLICES (I915_MAX_SS_FUSE_BITS / GEN_DSS_PER_CSLICE)
|
||||
|
||||
typedef union {
|
||||
u8 hsw[GEN_MAX_HSW_SLICES];
|
||||
|
||||
/* Bitmap compatible with linux/bitmap.h; may exceed size of u64 */
|
||||
unsigned long xehp[BITS_TO_LONGS(I915_MAX_SS_FUSE_BITS)];
|
||||
} intel_sseu_ss_mask_t;
|
||||
|
||||
#define XEHP_BITMAP_BITS(mask) ((int)BITS_PER_TYPE(typeof(mask.xehp)))
|
||||
|
||||
struct sseu_dev_info {
|
||||
u8 slice_mask;
|
||||
u8 subslice_mask[GEN_SS_MASK_SIZE];
|
||||
u8 geometry_subslice_mask[GEN_SS_MASK_SIZE];
|
||||
u8 compute_subslice_mask[GEN_SS_MASK_SIZE];
|
||||
u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE];
|
||||
intel_sseu_ss_mask_t subslice_mask;
|
||||
intel_sseu_ss_mask_t geometry_subslice_mask;
|
||||
intel_sseu_ss_mask_t compute_subslice_mask;
|
||||
union {
|
||||
u16 hsw[GEN_MAX_HSW_SLICES][GEN_MAX_SS_PER_HSW_SLICE];
|
||||
u16 xehp[I915_MAX_SS_FUSE_BITS];
|
||||
} eu_mask;
|
||||
|
||||
u16 eu_total;
|
||||
u8 eu_per_subslice;
|
||||
u8 min_eu_in_pool;
|
||||
|
@ -66,14 +83,16 @@ struct sseu_dev_info {
|
|||
u8 has_slice_pg:1;
|
||||
u8 has_subslice_pg:1;
|
||||
u8 has_eu_pg:1;
|
||||
/*
|
||||
* For Xe_HP and beyond, the hardware no longer has traditional slices
|
||||
* so we just report the entire DSS pool under a fake "slice 0."
|
||||
*/
|
||||
u8 has_xehp_dss:1;
|
||||
|
||||
/* Topology fields */
|
||||
u8 max_slices;
|
||||
u8 max_subslices;
|
||||
u8 max_eus_per_subslice;
|
||||
|
||||
u8 ss_stride;
|
||||
u8 eu_stride;
|
||||
};
|
||||
|
||||
/*
|
||||
|
@ -91,7 +110,7 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
|
|||
{
|
||||
struct intel_sseu value = {
|
||||
.slice_mask = sseu->slice_mask,
|
||||
.subslice_mask = sseu->subslice_mask[0],
|
||||
.subslice_mask = sseu->subslice_mask.hsw[0],
|
||||
.min_eus_per_subslice = sseu->max_eus_per_subslice,
|
||||
.max_eus_per_subslice = sseu->max_eus_per_subslice,
|
||||
};
|
||||
|
@ -103,18 +122,28 @@ static inline bool
|
|||
intel_sseu_has_subslice(const struct sseu_dev_info *sseu, int slice,
|
||||
int subslice)
|
||||
{
|
||||
u8 mask;
|
||||
int ss_idx = subslice / BITS_PER_BYTE;
|
||||
|
||||
if (slice >= sseu->max_slices ||
|
||||
subslice >= sseu->max_subslices)
|
||||
return false;
|
||||
|
||||
GEM_BUG_ON(ss_idx >= sseu->ss_stride);
|
||||
if (sseu->has_xehp_dss)
|
||||
return test_bit(subslice, sseu->subslice_mask.xehp);
|
||||
else
|
||||
return sseu->subslice_mask.hsw[slice] & BIT(subslice);
|
||||
}
|
||||
|
||||
mask = sseu->subslice_mask[slice * sseu->ss_stride + ss_idx];
|
||||
|
||||
return mask & BIT(subslice % BITS_PER_BYTE);
|
||||
/*
|
||||
* Used to obtain the index of the first DSS. Can start searching from the
|
||||
* beginning of a specific dss group (e.g., gslice, cslice, etc.) if
|
||||
* groupsize and groupnum are non-zero.
|
||||
*/
|
||||
static inline unsigned int
|
||||
intel_sseu_find_first_xehp_dss(const struct sseu_dev_info *sseu, int groupsize,
|
||||
int groupnum)
|
||||
{
|
||||
return find_next_bit(sseu->subslice_mask.xehp,
|
||||
XEHP_BITMAP_BITS(sseu->subslice_mask),
|
||||
groupnum * groupsize);
|
||||
}
|
||||
|
||||
void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
|
||||
|
@ -124,14 +153,10 @@ unsigned int
|
|||
intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
|
||||
|
||||
unsigned int
|
||||
intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
|
||||
intel_sseu_get_hsw_subslices(const struct sseu_dev_info *sseu, u8 slice);
|
||||
|
||||
u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
|
||||
|
||||
u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu);
|
||||
|
||||
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
|
||||
u8 *subslice_mask, u32 ss_mask);
|
||||
intel_sseu_ss_mask_t
|
||||
intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu);
|
||||
|
||||
void intel_sseu_info_init(struct intel_gt *gt);
|
||||
|
||||
|
@ -143,6 +168,15 @@ void intel_sseu_print_topology(struct drm_i915_private *i915,
|
|||
const struct sseu_dev_info *sseu,
|
||||
struct drm_printer *p);
|
||||
|
||||
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
|
||||
u16 intel_slicemask_from_xehp_dssmask(intel_sseu_ss_mask_t dss_mask, int dss_per_slice);
|
||||
|
||||
int intel_sseu_copy_eumask_to_user(void __user *to,
|
||||
const struct sseu_dev_info *sseu);
|
||||
int intel_sseu_copy_ssmask_to_user(void __user *to,
|
||||
const struct sseu_dev_info *sseu);
|
||||
|
||||
void intel_sseu_print_ss_info(const char *type,
|
||||
const struct sseu_dev_info *sseu,
|
||||
struct seq_file *m);
|
||||
|
||||
#endif /* __INTEL_SSEU_H__ */
|
||||
|
|
|
@ -4,6 +4,7 @@
|
|||
* Copyright © 2020 Intel Corporation
|
||||
*/
|
||||
|
||||
#include <linux/bitmap.h>
|
||||
#include <linux/string_helpers.h>
|
||||
|
||||
#include "i915_drv.h"
|
||||
|
@ -11,14 +12,6 @@
|
|||
#include "intel_gt_regs.h"
|
||||
#include "intel_sseu_debugfs.h"
|
||||
|
||||
static void sseu_copy_subslices(const struct sseu_dev_info *sseu,
|
||||
int slice, u8 *to_mask)
|
||||
{
|
||||
int offset = slice * sseu->ss_stride;
|
||||
|
||||
memcpy(&to_mask[offset], &sseu->subslice_mask[offset], sseu->ss_stride);
|
||||
}
|
||||
|
||||
static void cherryview_sseu_device_status(struct intel_gt *gt,
|
||||
struct sseu_dev_info *sseu)
|
||||
{
|
||||
|
@ -41,7 +34,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
|
|||
continue;
|
||||
|
||||
sseu->slice_mask = BIT(0);
|
||||
sseu->subslice_mask[0] |= BIT(ss);
|
||||
sseu->subslice_mask.hsw[0] |= BIT(ss);
|
||||
eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
|
||||
((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
|
||||
((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
|
||||
|
@ -92,7 +85,7 @@ static void gen11_sseu_device_status(struct intel_gt *gt,
|
|||
continue;
|
||||
|
||||
sseu->slice_mask |= BIT(s);
|
||||
sseu_copy_subslices(&info->sseu, s, sseu->subslice_mask);
|
||||
sseu->subslice_mask.hsw[s] = info->sseu.subslice_mask.hsw[s];
|
||||
|
||||
for (ss = 0; ss < info->sseu.max_subslices; ss++) {
|
||||
unsigned int eu_cnt;
|
||||
|
@ -147,21 +140,17 @@ static void gen9_sseu_device_status(struct intel_gt *gt,
|
|||
sseu->slice_mask |= BIT(s);
|
||||
|
||||
if (IS_GEN9_BC(gt->i915))
|
||||
sseu_copy_subslices(&info->sseu, s,
|
||||
sseu->subslice_mask);
|
||||
sseu->subslice_mask.hsw[s] = info->sseu.subslice_mask.hsw[s];
|
||||
|
||||
for (ss = 0; ss < info->sseu.max_subslices; ss++) {
|
||||
unsigned int eu_cnt;
|
||||
u8 ss_idx = s * info->sseu.ss_stride +
|
||||
ss / BITS_PER_BYTE;
|
||||
|
||||
if (IS_GEN9_LP(gt->i915)) {
|
||||
if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
|
||||
/* skip disabled subslice */
|
||||
continue;
|
||||
|
||||
sseu->subslice_mask[ss_idx] |=
|
||||
BIT(ss % BITS_PER_BYTE);
|
||||
sseu->subslice_mask.hsw[s] |= BIT(ss);
|
||||
}
|
||||
|
||||
eu_cnt = eu_reg[2 * s + ss / 2] & eu_mask[ss % 2];
|
||||
|
@ -188,8 +177,7 @@ static void bdw_sseu_device_status(struct intel_gt *gt,
|
|||
if (sseu->slice_mask) {
|
||||
sseu->eu_per_subslice = info->sseu.eu_per_subslice;
|
||||
for (s = 0; s < fls(sseu->slice_mask); s++)
|
||||
sseu_copy_subslices(&info->sseu, s,
|
||||
sseu->subslice_mask);
|
||||
sseu->subslice_mask.hsw[s] = info->sseu.subslice_mask.hsw[s];
|
||||
sseu->eu_total = sseu->eu_per_subslice *
|
||||
intel_sseu_subslice_total(sseu);
|
||||
|
||||
|
@ -208,7 +196,6 @@ static void i915_print_sseu_info(struct seq_file *m,
|
|||
const struct sseu_dev_info *sseu)
|
||||
{
|
||||
const char *type = is_available_info ? "Available" : "Enabled";
|
||||
int s;
|
||||
|
||||
seq_printf(m, " %s Slice Mask: %04x\n", type,
|
||||
sseu->slice_mask);
|
||||
|
@ -216,10 +203,7 @@ static void i915_print_sseu_info(struct seq_file *m,
|
|||
hweight8(sseu->slice_mask));
|
||||
seq_printf(m, " %s Subslice Total: %u\n", type,
|
||||
intel_sseu_subslice_total(sseu));
|
||||
for (s = 0; s < fls(sseu->slice_mask); s++) {
|
||||
seq_printf(m, " %s Slice%i subslices: %u\n", type,
|
||||
s, intel_sseu_subslices_per_slice(sseu, s));
|
||||
}
|
||||
intel_sseu_print_ss_info(type, sseu, m);
|
||||
seq_printf(m, " %s EU Total: %u\n", type,
|
||||
sseu->eu_total);
|
||||
seq_printf(m, " %s EU Per Subslice: %u\n", type,
|
||||
|
|
|
@ -9,6 +9,7 @@
|
|||
#include "intel_engine_regs.h"
|
||||
#include "intel_gpu_commands.h"
|
||||
#include "intel_gt.h"
|
||||
#include "intel_gt_mcr.h"
|
||||
#include "intel_gt_regs.h"
|
||||
#include "intel_ring.h"
|
||||
#include "intel_workarounds.h"
|
||||
|
@ -776,7 +777,9 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
|
|||
if (engine->class != RENDER_CLASS)
|
||||
goto done;
|
||||
|
||||
if (IS_DG2(i915))
|
||||
if (IS_PONTEVECCHIO(i915))
|
||||
; /* noop; none at this time */
|
||||
else if (IS_DG2(i915))
|
||||
dg2_ctx_workarounds_init(engine, wal);
|
||||
else if (IS_XEHPSDV(i915))
|
||||
; /* noop; none at this time */
|
||||
|
@ -948,8 +951,8 @@ gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
|
|||
* on s/ss combo, the read should be done with read_subslice_reg.
|
||||
*/
|
||||
slice = ffs(sseu->slice_mask) - 1;
|
||||
GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
|
||||
subslice = ffs(intel_sseu_get_subslices(sseu, slice));
|
||||
GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask.hsw));
|
||||
subslice = ffs(intel_sseu_get_hsw_subslices(sseu, slice));
|
||||
GEM_BUG_ON(!subslice);
|
||||
subslice--;
|
||||
|
||||
|
@ -1080,18 +1083,17 @@ static void __add_mcr_wa(struct intel_gt *gt, struct i915_wa_list *wal,
|
|||
gt->default_steering.instanceid = subslice;
|
||||
|
||||
if (drm_debug_enabled(DRM_UT_DRIVER))
|
||||
intel_gt_report_steering(&p, gt, false);
|
||||
intel_gt_mcr_report_steering(&p, gt, false);
|
||||
}
|
||||
|
||||
static void
|
||||
icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
||||
{
|
||||
const struct sseu_dev_info *sseu = >->info.sseu;
|
||||
unsigned int slice, subslice;
|
||||
unsigned int subslice;
|
||||
|
||||
GEM_BUG_ON(GRAPHICS_VER(gt->i915) < 11);
|
||||
GEM_BUG_ON(hweight8(sseu->slice_mask) > 1);
|
||||
slice = 0;
|
||||
|
||||
/*
|
||||
* Although a platform may have subslices, we need to always steer
|
||||
|
@ -1102,7 +1104,7 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
* one of the higher subslices, we run the risk of reading back 0's or
|
||||
* random garbage.
|
||||
*/
|
||||
subslice = __ffs(intel_sseu_get_subslices(sseu, slice));
|
||||
subslice = __ffs(intel_sseu_get_hsw_subslices(sseu, 0));
|
||||
|
||||
/*
|
||||
* If the subslice we picked above also steers us to a valid L3 bank,
|
||||
|
@ -1112,7 +1114,7 @@ icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
if (gt->info.l3bank_mask & BIT(subslice))
|
||||
gt->steering_table[L3BANK] = NULL;
|
||||
|
||||
__add_mcr_wa(gt, wal, slice, subslice);
|
||||
__add_mcr_wa(gt, wal, 0, subslice);
|
||||
}
|
||||
|
||||
static void
|
||||
|
@ -1120,7 +1122,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
{
|
||||
const struct sseu_dev_info *sseu = >->info.sseu;
|
||||
unsigned long slice, subslice = 0, slice_mask = 0;
|
||||
u64 dss_mask = 0;
|
||||
u32 lncf_mask = 0;
|
||||
int i;
|
||||
|
||||
|
@ -1151,8 +1152,8 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
*/
|
||||
|
||||
/* Find the potential gslice candidates */
|
||||
dss_mask = intel_sseu_get_subslices(sseu, 0);
|
||||
slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
|
||||
slice_mask = intel_slicemask_from_xehp_dssmask(sseu->subslice_mask,
|
||||
GEN_DSS_PER_GSLICE);
|
||||
|
||||
/*
|
||||
* Find the potential LNCF candidates. Either LNCF within a valid
|
||||
|
@ -1177,9 +1178,8 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
}
|
||||
|
||||
slice = __ffs(slice_mask);
|
||||
subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
|
||||
WARN_ON(subslice > GEN_DSS_PER_GSLICE);
|
||||
WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
|
||||
subslice = intel_sseu_find_first_xehp_dss(sseu, GEN_DSS_PER_GSLICE, slice) %
|
||||
GEN_DSS_PER_GSLICE;
|
||||
|
||||
__add_mcr_wa(gt, wal, slice, subslice);
|
||||
|
||||
|
@ -1196,6 +1196,20 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
|
||||
}
|
||||
|
||||
static void
|
||||
pvc_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
|
||||
{
|
||||
unsigned int dss;
|
||||
|
||||
/*
|
||||
* Setup implicit steering for COMPUTE and DSS ranges to the first
|
||||
* non-fused-off DSS. All other types of MCR registers will be
|
||||
* explicitly steered.
|
||||
*/
|
||||
dss = intel_sseu_find_first_xehp_dss(>->info.sseu, 0, 0);
|
||||
__add_mcr_wa(gt, wal, dss / GEN_DSS_PER_CSLICE, dss % GEN_DSS_PER_CSLICE);
|
||||
}
|
||||
|
||||
static void
|
||||
icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
|
||||
{
|
||||
|
@ -1487,6 +1501,18 @@ dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
* performance guide section.
|
||||
*/
|
||||
wa_write_or(wal, GEN12_SQCM, EN_32B_ACCESS);
|
||||
|
||||
/* Wa_14015795083 */
|
||||
wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
|
||||
}
|
||||
|
||||
static void
|
||||
pvc_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
|
||||
{
|
||||
pvc_init_mcr(gt, wal);
|
||||
|
||||
/* Wa_14015795083 */
|
||||
wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE);
|
||||
}
|
||||
|
||||
static void
|
||||
|
@ -1494,7 +1520,9 @@ gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
|
|||
{
|
||||
struct drm_i915_private *i915 = gt->i915;
|
||||
|
||||
if (IS_DG2(i915))
|
||||
if (IS_PONTEVECCHIO(i915))
|
||||
pvc_gt_workarounds_init(gt, wal);
|
||||
else if (IS_DG2(i915))
|
||||
dg2_gt_workarounds_init(gt, wal);
|
||||
else if (IS_XEHPSDV(i915))
|
||||
xehpsdv_gt_workarounds_init(gt, wal);
|
||||
|
@ -1596,13 +1624,13 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
|
|||
u32 val, old = 0;
|
||||
|
||||
/* open-coded rmw due to steering */
|
||||
old = wa->clr ? intel_gt_read_register_fw(gt, wa->reg) : 0;
|
||||
old = wa->clr ? intel_gt_mcr_read_any_fw(gt, wa->reg) : 0;
|
||||
val = (old & ~wa->clr) | wa->set;
|
||||
if (val != old || !wa->clr)
|
||||
intel_uncore_write_fw(uncore, wa->reg, val);
|
||||
|
||||
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
|
||||
wa_verify(wa, intel_gt_read_register_fw(gt, wa->reg),
|
||||
wa_verify(wa, intel_gt_mcr_read_any_fw(gt, wa->reg),
|
||||
wal->name, "application");
|
||||
}
|
||||
|
||||
|
@ -1633,7 +1661,7 @@ static bool wa_list_verify(struct intel_gt *gt,
|
|||
|
||||
for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
|
||||
ok &= wa_verify(wa,
|
||||
intel_gt_read_register_fw(gt, wa->reg),
|
||||
intel_gt_mcr_read_any_fw(gt, wa->reg),
|
||||
wal->name, from);
|
||||
|
||||
intel_uncore_forcewake_put__locked(uncore, fw);
|
||||
|
@ -1924,6 +1952,32 @@ static void dg2_whitelist_build(struct intel_engine_cs *engine)
|
|||
}
|
||||
}
|
||||
|
||||
static void blacklist_trtt(struct intel_engine_cs *engine)
|
||||
{
|
||||
struct i915_wa_list *w = &engine->whitelist;
|
||||
|
||||
/*
|
||||
* Prevent read/write access to [0x4400, 0x4600) which covers
|
||||
* the TRTT range across all engines. Note that normally userspace
|
||||
* cannot access the other engines' trtt control, but for simplicity
|
||||
* we cover the entire range on each engine.
|
||||
*/
|
||||
whitelist_reg_ext(w, _MMIO(0x4400),
|
||||
RING_FORCE_TO_NONPRIV_DENY |
|
||||
RING_FORCE_TO_NONPRIV_RANGE_64);
|
||||
whitelist_reg_ext(w, _MMIO(0x4500),
|
||||
RING_FORCE_TO_NONPRIV_DENY |
|
||||
RING_FORCE_TO_NONPRIV_RANGE_64);
|
||||
}
|
||||
|
||||
static void pvc_whitelist_build(struct intel_engine_cs *engine)
|
||||
{
|
||||
allow_read_ctx_timestamp(engine);
|
||||
|
||||
/* Wa_16014440446:pvc */
|
||||
blacklist_trtt(engine);
|
||||
}
|
||||
|
||||
void intel_engine_init_whitelist(struct intel_engine_cs *engine)
|
||||
{
|
||||
struct drm_i915_private *i915 = engine->i915;
|
||||
|
@ -1931,7 +1985,9 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
|
|||
|
||||
wa_init_start(w, "whitelist", engine->name);
|
||||
|
||||
if (IS_DG2(i915))
|
||||
if (IS_PONTEVECCHIO(i915))
|
||||
pvc_whitelist_build(engine);
|
||||
else if (IS_DG2(i915))
|
||||
dg2_whitelist_build(engine);
|
||||
else if (IS_XEHPSDV(i915))
|
||||
xehpsdv_whitelist_build(engine);
|
||||
|
@ -1994,27 +2050,44 @@ void intel_engine_apply_whitelist(struct intel_engine_cs *engine)
|
|||
static void
|
||||
engine_fake_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
|
||||
{
|
||||
u8 mocs;
|
||||
u8 mocs_w, mocs_r;
|
||||
|
||||
/*
|
||||
* RING_CMD_CCTL are need to be programed to un-cached
|
||||
* for memory writes and reads outputted by Command
|
||||
* Streamers on Gen12 onward platforms.
|
||||
* RING_CMD_CCTL specifies the default MOCS entry that will be used
|
||||
* by the command streamer when executing commands that don't have
|
||||
* a way to explicitly specify a MOCS setting. The default should
|
||||
* usually reference whichever MOCS entry corresponds to uncached
|
||||
* behavior, although use of a WB cached entry is recommended by the
|
||||
* spec in certain circumstances on specific platforms.
|
||||
*/
|
||||
if (GRAPHICS_VER(engine->i915) >= 12) {
|
||||
mocs = engine->gt->mocs.uc_index;
|
||||
mocs_r = engine->gt->mocs.uc_index;
|
||||
mocs_w = engine->gt->mocs.uc_index;
|
||||
|
||||
if (HAS_L3_CCS_READ(engine->i915) &&
|
||||
engine->class == COMPUTE_CLASS) {
|
||||
mocs_r = engine->gt->mocs.wb_index;
|
||||
|
||||
/*
|
||||
* Even on the few platforms where MOCS 0 is a
|
||||
* legitimate table entry, it's never the correct
|
||||
* setting to use here; we can assume the MOCS init
|
||||
* just forgot to initialize wb_index.
|
||||
*/
|
||||
drm_WARN_ON(&engine->i915->drm, mocs_r == 0);
|
||||
}
|
||||
|
||||
wa_masked_field_set(wal,
|
||||
RING_CMD_CCTL(engine->mmio_base),
|
||||
CMD_CCTL_MOCS_MASK,
|
||||
CMD_CCTL_MOCS_OVERRIDE(mocs, mocs));
|
||||
CMD_CCTL_MOCS_OVERRIDE(mocs_w, mocs_r));
|
||||
}
|
||||
}
|
||||
|
||||
static bool needs_wa_1308578152(struct intel_engine_cs *engine)
|
||||
{
|
||||
u64 dss_mask = intel_sseu_get_subslices(&engine->gt->info.sseu, 0);
|
||||
|
||||
return (dss_mask & GENMASK(GEN_DSS_PER_GSLICE - 1, 0)) == 0;
|
||||
return intel_sseu_find_first_xehp_dss(&engine->gt->info.sseu, 0, 0) >=
|
||||
GEN_DSS_PER_GSLICE;
|
||||
}
|
||||
|
||||
static void
|
||||
|
@ -2023,9 +2096,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
|
|||
struct drm_i915_private *i915 = engine->i915;
|
||||
|
||||
if (IS_DG2(i915)) {
|
||||
/* Wa_14015227452:dg2 */
|
||||
wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
|
||||
|
||||
/* Wa_1509235366:dg2 */
|
||||
wa_write_or(wal, GEN12_GAMCNTRL_CTRL, INVALIDATION_BROADCAST_MODE_DIS |
|
||||
GLOBAL_INVALIDATION_MODE);
|
||||
|
@ -2036,12 +2106,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
|
|||
* performance guide section.
|
||||
*/
|
||||
wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS);
|
||||
|
||||
/* Wa_18018781329:dg2 */
|
||||
wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
}
|
||||
|
||||
if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_A0, STEP_B0)) {
|
||||
|
@ -2160,6 +2224,16 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
|
|||
wa_write_or(wal, GEN12_MERT_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
}
|
||||
|
||||
if (IS_DG2_GRAPHICS_STEP(i915, G11, STEP_B0, STEP_FOREVER) ||
|
||||
IS_DG2_G10(i915)) {
|
||||
/* Wa_22014600077:dg2 */
|
||||
wa_add(wal, GEN10_CACHE_MODE_SS, 0,
|
||||
_MASKED_BIT_ENABLE(ENABLE_EU_COUNT_FOR_TDL_FLUSH),
|
||||
0 /* Wa_14012342262 :write-only reg, so skip
|
||||
verification */,
|
||||
true);
|
||||
}
|
||||
|
||||
if (IS_DG1_GRAPHICS_STEP(i915, STEP_A0, STEP_B0) ||
|
||||
IS_TGL_UY_GRAPHICS_STEP(i915, STEP_A0, STEP_B0)) {
|
||||
/*
|
||||
|
@ -2583,6 +2657,15 @@ xcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
|
|||
}
|
||||
}
|
||||
|
||||
static void
|
||||
ccs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
|
||||
{
|
||||
if (IS_PVC_CT_STEP(engine->i915, STEP_A0, STEP_C0)) {
|
||||
/* Wa_14014999345:pvc */
|
||||
wa_masked_en(wal, GEN10_CACHE_MODE_SS, DISABLE_ECC);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* The workarounds in this function apply to shared registers in
|
||||
* the general render reset domain that aren't tied to a
|
||||
|
@ -2597,6 +2680,15 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
|
|||
{
|
||||
struct drm_i915_private *i915 = engine->i915;
|
||||
|
||||
if (IS_PONTEVECCHIO(i915)) {
|
||||
/*
|
||||
* The following is not actually a "workaround" but rather
|
||||
* a recommended tuning setting documented in the bspec's
|
||||
* performance guide section.
|
||||
*/
|
||||
wa_write(wal, XEHPC_L3SCRUB, SCRUB_CL_DWNGRADE_SHARED | SCRUB_RATE_4B_PER_CLK);
|
||||
}
|
||||
|
||||
if (IS_XEHPSDV(i915)) {
|
||||
/* Wa_1409954639 */
|
||||
wa_masked_en(wal,
|
||||
|
@ -2629,9 +2721,21 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
|
|||
GLOBAL_INVALIDATION_MODE);
|
||||
}
|
||||
|
||||
if (IS_DG2(i915)) {
|
||||
/* Wa_22014226127:dg2 */
|
||||
if (IS_DG2(i915) || IS_PONTEVECCHIO(i915)) {
|
||||
/* Wa_14015227452:dg2,pvc */
|
||||
wa_masked_en(wal, GEN9_ROW_CHICKEN4, XEHP_DIS_BBL_SYSPIPE);
|
||||
|
||||
/* Wa_22014226127:dg2,pvc */
|
||||
wa_write_or(wal, LSC_CHICKEN_BIT_0, DISABLE_D8_D16_COASLESCE);
|
||||
|
||||
/* Wa_16015675438:dg2,pvc */
|
||||
wa_masked_en(wal, FF_SLICE_CS_CHICKEN2, GEN12_PERF_FIX_BALANCING_CFE_DISABLE);
|
||||
|
||||
/* Wa_18018781329:dg2,pvc */
|
||||
wa_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
wa_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
wa_write_or(wal, VDBX_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
wa_write_or(wal, VEBX_MOD_CTRL, FORCE_MISS_FTLB);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -2651,7 +2755,9 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
|
|||
if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
|
||||
general_render_compute_wa_init(engine, wal);
|
||||
|
||||
if (engine->class == RENDER_CLASS)
|
||||
if (engine->class == COMPUTE_CLASS)
|
||||
ccs_engine_wa_init(engine, wal);
|
||||
else if (engine->class == RENDER_CLASS)
|
||||
rcs_engine_wa_init(engine, wal);
|
||||
else
|
||||
xcs_engine_wa_init(engine, wal);
|
||||
|
|
|
@ -976,6 +976,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
|
|||
{
|
||||
struct i915_gpu_error *global = >->i915->gpu_error;
|
||||
struct intel_engine_cs *engine, *other;
|
||||
struct active_engine *threads;
|
||||
enum intel_engine_id id, tmp;
|
||||
struct hang h;
|
||||
int err = 0;
|
||||
|
@ -996,8 +997,11 @@ static int __igt_reset_engines(struct intel_gt *gt,
|
|||
h.ctx->sched.priority = 1024;
|
||||
}
|
||||
|
||||
threads = kmalloc_array(I915_NUM_ENGINES, sizeof(*threads), GFP_KERNEL);
|
||||
if (!threads)
|
||||
return -ENOMEM;
|
||||
|
||||
for_each_engine(engine, gt, id) {
|
||||
struct active_engine threads[I915_NUM_ENGINES] = {};
|
||||
unsigned long device = i915_reset_count(global);
|
||||
unsigned long count = 0, reported;
|
||||
bool using_guc = intel_engine_uses_guc(engine);
|
||||
|
@ -1016,7 +1020,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
|
|||
break;
|
||||
}
|
||||
|
||||
memset(threads, 0, sizeof(threads));
|
||||
memset(threads, 0, sizeof(*threads) * I915_NUM_ENGINES);
|
||||
for_each_engine(other, gt, tmp) {
|
||||
struct task_struct *tsk;
|
||||
|
||||
|
@ -1236,6 +1240,7 @@ unwind:
|
|||
break;
|
||||
}
|
||||
}
|
||||
kfree(threads);
|
||||
|
||||
if (intel_gt_is_wedged(gt))
|
||||
err = -EIO;
|
||||
|
|
|
@ -122,6 +122,12 @@ enum slpc_param_id {
|
|||
SLPC_MAX_PARAM = 32,
|
||||
};
|
||||
|
||||
enum slpc_media_ratio_mode {
|
||||
SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL = 0,
|
||||
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_ONE = 1,
|
||||
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO = 2,
|
||||
};
|
||||
|
||||
enum slpc_event_id {
|
||||
SLPC_EVENT_RESET = 0,
|
||||
SLPC_EVENT_SHUTDOWN = 1,
|
||||
|
|
|
@ -310,8 +310,8 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
|
|||
if (IS_DG2(gt->i915))
|
||||
flags |= GUC_WA_DUAL_QUEUE;
|
||||
|
||||
/* Wa_22011802037: graphics version 12 */
|
||||
if (GRAPHICS_VER(gt->i915) == 12)
|
||||
/* Wa_22011802037: graphics version 11/12 */
|
||||
if (IS_GRAPHICS_VER(gt->i915, 11, 12))
|
||||
flags |= GUC_WA_PRE_PARSER;
|
||||
|
||||
/* Wa_16011777198:dg2 */
|
||||
|
@ -327,6 +327,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
|
|||
IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_FOREVER))
|
||||
flags |= GUC_WA_CONTEXT_ISOLATION;
|
||||
|
||||
/* Wa_16015675438 */
|
||||
if (!RCS_MASK(gt))
|
||||
flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST;
|
||||
|
||||
return flags;
|
||||
}
|
||||
|
||||
|
|
|
@ -230,6 +230,14 @@ struct intel_guc {
|
|||
* @shift: Right shift value for the gpm timestamp
|
||||
*/
|
||||
u32 shift;
|
||||
|
||||
/**
|
||||
* @last_stat_jiffies: jiffies at last actual stats collection time
|
||||
* We use this timestamp to ensure we don't oversample the
|
||||
* stats because runtime power management events can trigger
|
||||
* stats collection at much higher rates than required.
|
||||
*/
|
||||
unsigned long last_stat_jiffies;
|
||||
} timestamp;
|
||||
|
||||
#ifdef CONFIG_DRM_I915_SELFTEST
|
||||
|
|
|
@ -7,6 +7,7 @@
|
|||
|
||||
#include "gt/intel_engine_regs.h"
|
||||
#include "gt/intel_gt.h"
|
||||
#include "gt/intel_gt_mcr.h"
|
||||
#include "gt/intel_gt_regs.h"
|
||||
#include "gt/intel_lrc.h"
|
||||
#include "gt/shmem_utils.h"
|
||||
|
@ -313,7 +314,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
|
|||
* tracking, it is easier to just program the default steering for all
|
||||
* regs that don't need a non-default one.
|
||||
*/
|
||||
intel_gt_get_valid_steering_for_reg(gt, reg, &group, &inst);
|
||||
intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
|
||||
entry.flags |= GUC_REGSET_STEERING(group, inst);
|
||||
|
||||
slot = __mmio_reg_add(regset, &entry);
|
||||
|
@ -457,7 +458,7 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
|
|||
{
|
||||
info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], RCS_MASK(gt));
|
||||
info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], CCS_MASK(gt));
|
||||
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
|
||||
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], BCS_MASK(gt));
|
||||
info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt));
|
||||
info_map_write(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS], VEBOX_MASK(gt));
|
||||
}
|
||||
|
|
|
@@ -420,72 +420,6 @@ guc_capture_get_device_reglist(struct intel_guc *guc)
return default_lists;
}

static const char *
__stringify_owner(u32 owner)
{
switch (owner) {
case GUC_CAPTURE_LIST_INDEX_PF:
return "PF";
case GUC_CAPTURE_LIST_INDEX_VF:
return "VF";
default:
return "unknown";
}

return "";
}

static const char *
__stringify_type(u32 type)
{
switch (type) {
case GUC_CAPTURE_LIST_TYPE_GLOBAL:
return "Global";
case GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS:
return "Class";
case GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE:
return "Instance";
default:
return "unknown";
}

return "";
}

static const char *
__stringify_engclass(u32 class)
{
switch (class) {
case GUC_RENDER_CLASS:
return "Render";
case GUC_VIDEO_CLASS:
return "Video";
case GUC_VIDEOENHANCE_CLASS:
return "VideoEnhance";
case GUC_BLITTER_CLASS:
return "Blitter";
case GUC_COMPUTE_CLASS:
return "Compute";
default:
return "unknown";
}

return "";
}

static void
guc_capture_warn_with_list_info(struct drm_i915_private *i915, char *msg,
u32 owner, u32 type, u32 classid)
{
if (type == GUC_CAPTURE_LIST_TYPE_GLOBAL)
drm_dbg(&i915->drm, "GuC-capture: %s for %s %s-Registers.\n", msg,
__stringify_owner(owner), __stringify_type(type));
else
drm_dbg(&i915->drm, "GuC-capture: %s for %s %s-Registers on %s-Engine\n", msg,
__stringify_owner(owner), __stringify_type(type),
__stringify_engclass(classid));
}

static int
guc_capture_list_init(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
struct guc_mmio_reg *ptr, u16 num_entries)

@@ -501,11 +435,8 @@ guc_capture_list_init(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
return -ENODEV;

match = guc_capture_get_one_list(reglists, owner, type, classid);
if (!match) {
guc_capture_warn_with_list_info(i915, "Missing register list init", owner, type,
classid);
if (!match)
return -ENODATA;
}

for (i = 0; i < num_entries && i < match->num_regs; ++i) {
ptr[i].offset = match->list[i].reg.reg;

@@ -556,7 +487,6 @@ int
intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
size_t *size)
{
struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
struct intel_guc_state_capture *gc = guc->capture;
struct __guc_capture_ads_cache *cache = &gc->ads_cache[owner][type][classid];
int num_regs;

@@ -570,11 +500,8 @@ intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 cl
}

num_regs = guc_cap_list_num_regs(gc, owner, type, classid);
if (!num_regs) {
guc_capture_warn_with_list_info(i915, "Missing register list size",
owner, type, classid);
if (!num_regs)
return -ENODATA;
}

*size = PAGE_ALIGN((sizeof(struct guc_debug_capture_list)) +
(num_regs * sizeof(struct guc_mmio_reg)));
@@ -105,6 +105,7 @@
#define GUC_WA_PRE_PARSER BIT(14)
#define GUC_WA_HOLD_CCS_SWITCHOUT BIT(17)
#define GUC_WA_POLLCS BIT(18)
#define GUC_WA_RCS_REGS_IN_CCS_REGS_LIST BIT(21)

#define GUC_CTL_FEATURE 2
#define GUC_CTL_ENABLE_SLPC BIT(2)
@@ -94,9 +94,9 @@ static int guc_hwconfig_fill_buffer(struct intel_guc *guc, struct intel_hwconfig

static bool has_table(struct drm_i915_private *i915)
{
if (IS_ALDERLAKE_P(i915))
if (IS_ALDERLAKE_P(i915) && !IS_ADLP_N(i915))
return true;
if (IS_DG2(i915))
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
return true;

return false;
@@ -49,7 +49,6 @@ static int guc_action_control_gucrc(struct intel_guc *guc, bool enable)
static int __guc_rc_control(struct intel_guc *guc, bool enable)
{
struct intel_gt *gt = guc_to_gt(guc);
struct drm_device *drm = &guc_to_gt(guc)->i915->drm;
int ret;

if (!intel_uc_uses_guc_rc(&gt->uc))

@@ -60,8 +59,8 @@ static int __guc_rc_control(struct intel_guc *guc, bool enable)

ret = guc_action_control_gucrc(guc, enable);
if (ret) {
drm_err(drm, "Failed to %s GuC RC (%pe)\n",
str_enable_disable(enable), ERR_PTR(ret));
i915_probe_error(guc_to_gt(guc)->i915, "Failed to %s GuC RC (%pe)\n",
str_enable_disable(enable), ERR_PTR(ret));
return ret;
}
@@ -96,6 +96,7 @@

#define GUC_SHIM_CONTROL2 _MMIO(0xc068)
#define GUC_IS_PRIVILEGED (1<<29)
#define GSC_LOADS_HUC (1<<30)

#define GUC_SEND_INTERRUPT _MMIO(0xc4c8)
#define GUC_SEND_TRIGGER (1<<0)
@@ -98,6 +98,30 @@ static u32 slpc_get_state(struct intel_guc_slpc *slpc)
return data->header.global_state;
}

static int guc_action_slpc_set_param_nb(struct intel_guc *guc, u8 id, u32 value)
{
u32 request[] = {
GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2),
id,
value,
};
int ret;

ret = intel_guc_send_nb(guc, request, ARRAY_SIZE(request), 0);

return ret > 0 ? -EPROTO : ret;
}

static int slpc_set_param_nb(struct intel_guc_slpc *slpc, u8 id, u32 value)
{
struct intel_guc *guc = slpc_to_guc(slpc);

GEM_BUG_ON(id >= SLPC_MAX_PARAM);

return guc_action_slpc_set_param_nb(guc, id, value);
}

static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value)
{
u32 request[] = {

@@ -208,12 +232,14 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
*/

with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
ret = slpc_set_param(slpc,
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
freq);
/* Non-blocking request will avoid stalls */
ret = slpc_set_param_nb(slpc,
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
freq);
if (ret)
i915_probe_error(i915, "Unable to force min freq to %u: %d",
freq, ret);
drm_notice(&i915->drm,
"Failed to send set_param for min freq(%d): (%d)\n",
freq, ret);
}

return ret;

@@ -222,6 +248,7 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
static void slpc_boost_work(struct work_struct *work)
{
struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work);
int err;

/*
* Raise min freq to boost. It's possible that

@@ -231,8 +258,9 @@ static void slpc_boost_work(struct work_struct *work)
*/
mutex_lock(&slpc->lock);
if (atomic_read(&slpc->num_waiters)) {
slpc_force_min_freq(slpc, slpc->boost_freq);
slpc->num_boosts++;
err = slpc_force_min_freq(slpc, slpc->boost_freq);
if (!err)
slpc->num_boosts++;
}
mutex_unlock(&slpc->lock);
}

@@ -260,6 +288,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
slpc->boost_freq = 0;
atomic_set(&slpc->num_waiters, 0);
slpc->num_boosts = 0;
slpc->media_ratio_mode = SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL;

mutex_init(&slpc->lock);
INIT_WORK(&slpc->boost_work, slpc_boost_work);

@@ -506,6 +535,22 @@ int intel_guc_slpc_get_min_freq(struct intel_guc_slpc *slpc, u32 *val)
return ret;
}

int intel_guc_slpc_set_media_ratio_mode(struct intel_guc_slpc *slpc, u32 val)
{
struct drm_i915_private *i915 = slpc_to_i915(slpc);
intel_wakeref_t wakeref;
int ret = 0;

if (!HAS_MEDIA_RATIO_MODE(i915))
return -ENODEV;

with_intel_runtime_pm(&i915->runtime_pm, wakeref)
ret = slpc_set_param(slpc,
SLPC_PARAM_MEDIA_FF_RATIO_MODE,
val);
return ret;
}

void intel_guc_pm_intrmsk_enable(struct intel_gt *gt)
{
u32 pm_intrmsk_mbz = 0;

@@ -654,6 +699,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
return ret;
}

/* Set cached media freq ratio mode */
intel_guc_slpc_set_media_ratio_mode(slpc, slpc->media_ratio_mode);

return 0;
}
@@ -38,6 +38,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val);
int intel_guc_slpc_get_max_freq(struct intel_guc_slpc *slpc, u32 *val);
int intel_guc_slpc_get_min_freq(struct intel_guc_slpc *slpc, u32 *val);
int intel_guc_slpc_print_info(struct intel_guc_slpc *slpc, struct drm_printer *p);
int intel_guc_slpc_set_media_ratio_mode(struct intel_guc_slpc *slpc, u32 val);
void intel_guc_pm_intrmsk_enable(struct intel_gt *gt);
void intel_guc_slpc_boost(struct intel_guc_slpc *slpc);
void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc);
@@ -29,6 +29,9 @@ struct intel_guc_slpc {
u32 min_freq_softlimit;
u32 max_freq_softlimit;

/* cached media ratio mode */
u32 media_ratio_mode;

/* Protects set/reset of boost freq
* and value of num_waiters
*/
@@ -1314,6 +1314,8 @@ static void __update_guc_busyness_stats(struct intel_guc *guc)
unsigned long flags;
ktime_t unused;

guc->timestamp.last_stat_jiffies = jiffies;

spin_lock_irqsave(&guc->timestamp.lock, flags);

guc_update_pm_timestamp(guc, &unused);

@@ -1386,6 +1388,17 @@ void intel_guc_busyness_park(struct intel_gt *gt)
return;

cancel_delayed_work(&guc->timestamp.work);

/*
* Before parking, we should sample engine busyness stats if we need to.
* We can skip it if we are less than half a ping from the last time we
* sampled the busyness stats.
*/
if (guc->timestamp.last_stat_jiffies &&
!time_after(jiffies, guc->timestamp.last_stat_jiffies +
(guc->timestamp.ping_delay / 2)))
return;

__update_guc_busyness_stats(guc);
}

@@ -1527,87 +1540,18 @@ static void guc_reset_state(struct intel_context *ce, u32 head, bool scrub)
lrc_update_regs(ce, engine, head);
}

static u32 __cs_pending_mi_force_wakes(struct intel_engine_cs *engine)
{
static const i915_reg_t _reg[I915_NUM_ENGINES] = {
[RCS0] = MSG_IDLE_CS,
[BCS0] = MSG_IDLE_BCS,
[VCS0] = MSG_IDLE_VCS0,
[VCS1] = MSG_IDLE_VCS1,
[VCS2] = MSG_IDLE_VCS2,
[VCS3] = MSG_IDLE_VCS3,
[VCS4] = MSG_IDLE_VCS4,
[VCS5] = MSG_IDLE_VCS5,
[VCS6] = MSG_IDLE_VCS6,
[VCS7] = MSG_IDLE_VCS7,
[VECS0] = MSG_IDLE_VECS0,
[VECS1] = MSG_IDLE_VECS1,
[VECS2] = MSG_IDLE_VECS2,
[VECS3] = MSG_IDLE_VECS3,
[CCS0] = MSG_IDLE_CS,
[CCS1] = MSG_IDLE_CS,
[CCS2] = MSG_IDLE_CS,
[CCS3] = MSG_IDLE_CS,
};
u32 val;

if (!_reg[engine->id].reg)
return 0;

val = intel_uncore_read(engine->uncore, _reg[engine->id]);

/* bits[29:25] & bits[13:9] >> shift */
return (val & (val >> 16) & MSG_IDLE_FW_MASK) >> MSG_IDLE_FW_SHIFT;
}

static void __gpm_wait_for_fw_complete(struct intel_gt *gt, u32 fw_mask)
{
int ret;

/* Ensure GPM receives fw up/down after CS is stopped */
udelay(1);

/* Wait for forcewake request to complete in GPM */
ret = __intel_wait_for_register_fw(gt->uncore,
GEN9_PWRGT_DOMAIN_STATUS,
fw_mask, fw_mask, 5000, 0, NULL);

/* Ensure CS receives fw ack from GPM */
udelay(1);

if (ret)
GT_TRACE(gt, "Failed to complete pending forcewake %d\n", ret);
}

/*
* Wa_22011802037:gen12: In addition to stopping the cs, we need to wait for any
* pending MI_FORCE_WAKEUP requests that the CS has initiated to complete. The
* pending status is indicated by bits[13:9] (masked by bits[29:25]) in the
* MSG_IDLE register. There's one MSG_IDLE register per reset domain. Since we
* are concerned only with the gt reset here, we use a logical OR of pending
* forcewakeups from all reset domains and then wait for them to complete by
* querying PWRGT_DOMAIN_STATUS.
*/
static void guc_engine_reset_prepare(struct intel_engine_cs *engine)
{
u32 fw_pending;

if (GRAPHICS_VER(engine->i915) != 12)
if (!IS_GRAPHICS_VER(engine->i915, 11, 12))
return;

/*
* Wa_22011802037
* TODO: Occasionally trying to stop the cs times out, but does not
* adversely affect functionality. The timeout is set as a config
* parameter that defaults to 100ms. Assuming that this timeout is
* sufficient for any pending MI_FORCEWAKEs to complete, ignore the
* timeout returned here until it is root caused.
*/
intel_engine_stop_cs(engine);

fw_pending = __cs_pending_mi_force_wakes(engine);
if (fw_pending)
__gpm_wait_for_fw_complete(engine->gt, fw_pending);
/*
* Wa_22011802037:gen11/gen12: In addition to stopping the cs, we need
* to wait for any pending mi force wakeups
*/
intel_engine_wait_for_pending_mi_fw(engine);
}

static void guc_reset_nop(struct intel_engine_cs *engine)

@@ -2394,6 +2338,26 @@ static int guc_context_policy_init(struct intel_context *ce, bool loop)
return ret;
}

static u32 map_guc_prio_to_lrc_desc_prio(u8 prio)
{
/*
* this matches the mapping we do in map_i915_prio_to_guc_prio()
* (e.g. prio < I915_PRIORITY_NORMAL maps to GUC_CLIENT_PRIORITY_NORMAL)
*/
switch (prio) {
default:
MISSING_CASE(prio);
fallthrough;
case GUC_CLIENT_PRIORITY_KMD_NORMAL:
return GEN12_CTX_PRIORITY_NORMAL;
case GUC_CLIENT_PRIORITY_NORMAL:
return GEN12_CTX_PRIORITY_LOW;
case GUC_CLIENT_PRIORITY_HIGH:
case GUC_CLIENT_PRIORITY_KMD_HIGH:
return GEN12_CTX_PRIORITY_HIGH;
}
}

static void prepare_context_registration_info(struct intel_context *ce,
struct guc_ctxt_registration_info *info)
{

@@ -2420,6 +2384,8 @@ static void prepare_context_registration_info(struct intel_context *ce,
*/
info->hwlrca_lo = lower_32_bits(ce->lrc.lrca);
info->hwlrca_hi = upper_32_bits(ce->lrc.lrca);
if (engine->flags & I915_ENGINE_HAS_EU_PRIORITY)
info->hwlrca_lo |= map_guc_prio_to_lrc_desc_prio(ce->guc_state.prio);
info->flags = CONTEXT_REGISTRATION_FLAG_KMD;

/*

@@ -2768,7 +2734,9 @@ static void __guc_context_set_preemption_timeout(struct intel_guc *guc,
__guc_context_set_context_policies(guc, &policy, true);
}

static void guc_context_ban(struct intel_context *ce, struct i915_request *rq)
static void
guc_context_revoke(struct intel_context *ce, struct i915_request *rq,
unsigned int preempt_timeout_ms)
{
struct intel_guc *guc = ce_to_guc(ce);
struct intel_runtime_pm *runtime_pm =

@@ -2807,7 +2775,8 @@ static void guc_context_ban(struct intel_context *ce, struct i915_request *rq)
* gets kicked off the HW ASAP.
*/
with_intel_runtime_pm(runtime_pm, wakeref) {
__guc_context_set_preemption_timeout(guc, guc_id, 1);
__guc_context_set_preemption_timeout(guc, guc_id,
preempt_timeout_ms);
__guc_context_sched_disable(guc, ce, guc_id);
}
} else {

@@ -2815,7 +2784,7 @@ static void guc_context_ban(struct intel_context *ce, struct i915_request *rq)
with_intel_runtime_pm(runtime_pm, wakeref)
__guc_context_set_preemption_timeout(guc,
ce->guc_id.id,
1);
preempt_timeout_ms);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
}
}

@@ -3168,7 +3137,7 @@ static const struct intel_context_ops guc_context_ops = {
.unpin = guc_context_unpin,
.post_unpin = guc_context_post_unpin,

.ban = guc_context_ban,
.revoke = guc_context_revoke,

.cancel_request = guc_context_cancel_request,

@@ -3417,7 +3386,7 @@ static const struct intel_context_ops virtual_guc_context_ops = {
.unpin = guc_virtual_context_unpin,
.post_unpin = guc_context_post_unpin,

.ban = guc_context_ban,
.revoke = guc_context_revoke,

.cancel_request = guc_context_cancel_request,

@@ -3506,7 +3475,7 @@ static const struct intel_context_ops virtual_parent_context_ops = {
.unpin = guc_parent_context_unpin,
.post_unpin = guc_context_post_unpin,

.ban = guc_context_ban,
.revoke = guc_context_revoke,

.cancel_request = guc_context_cancel_request,
@@ -6,6 +6,7 @@
#include <linux/types.h>

#include "gt/intel_gt.h"
#include "intel_guc_reg.h"
#include "intel_huc.h"
#include "i915_drv.h"

@@ -17,11 +18,15 @@
* capabilities by adding HuC specific commands to batch buffers.
*
* The kernel driver is only responsible for loading the HuC firmware and
* triggering its security authentication, which is performed by the GuC. For
* the GuC to correctly perform the authentication, the HuC binary must be
* loaded before the GuC one. Loading the HuC is optional; however, not using
* the HuC might negatively impact power usage and/or performance of media
* workloads, depending on the use-cases.
* triggering its security authentication, which is performed by the GuC on
* older platforms and by the GSC on newer ones. For the GuC to correctly
* perform the authentication, the HuC binary must be loaded before the GuC one.
* Loading the HuC is optional; however, not using the HuC might negatively
* impact power usage and/or performance of media workloads, depending on the
* use-cases.
* HuC must be reloaded on events that cause the WOPCM to lose its contents
* (S3/S4, FLR); GuC-authenticated HuC must also be reloaded on GuC/GT reset,
* while GSC-managed HuC will survive that.
*
* See https://github.com/intel/media-driver for the latest details on HuC
* functionality.

@@ -54,11 +59,51 @@ void intel_huc_init_early(struct intel_huc *huc)
}
}

#define HUC_LOAD_MODE_STRING(x) (x ? "GSC" : "legacy")
static int check_huc_loading_mode(struct intel_huc *huc)
{
struct intel_gt *gt = huc_to_gt(huc);
bool fw_needs_gsc = intel_huc_is_loaded_by_gsc(huc);
bool hw_uses_gsc = false;

/*
* The fuse for HuC load via GSC is only valid on platforms that have
* GuC deprivilege.
*/
if (HAS_GUC_DEPRIVILEGE(gt->i915))
hw_uses_gsc = intel_uncore_read(gt->uncore, GUC_SHIM_CONTROL2) &
GSC_LOADS_HUC;

if (fw_needs_gsc != hw_uses_gsc) {
drm_err(&gt->i915->drm,
"mismatch between HuC FW (%s) and HW (%s) load modes\n",
HUC_LOAD_MODE_STRING(fw_needs_gsc),
HUC_LOAD_MODE_STRING(hw_uses_gsc));
return -ENOEXEC;
}

/* make sure we can access the GSC via the mei driver if we need it */
if (!(IS_ENABLED(CONFIG_INTEL_MEI_PXP) && IS_ENABLED(CONFIG_INTEL_MEI_GSC)) &&
fw_needs_gsc) {
drm_info(&gt->i915->drm,
"Can't load HuC due to missing MEI modules\n");
return -EIO;
}

drm_dbg(&gt->i915->drm, "GSC loads huc=%s\n", str_yes_no(fw_needs_gsc));

return 0;
}

int intel_huc_init(struct intel_huc *huc)
{
struct drm_i915_private *i915 = huc_to_gt(huc)->i915;
int err;

err = check_huc_loading_mode(huc);
if (err)
goto out;

err = intel_uc_fw_init(&huc->fw);
if (err)
goto out;

@@ -68,7 +113,7 @@ int intel_huc_init(struct intel_huc *huc)
return 0;

out:
i915_probe_error(i915, "failed with %d\n", err);
drm_info(&i915->drm, "HuC init failed with %d\n", err);
return err;
}

@@ -96,17 +141,20 @@ int intel_huc_auth(struct intel_huc *huc)
struct intel_guc *guc = &gt->uc.guc;
int ret;

GEM_BUG_ON(intel_huc_is_authenticated(huc));

if (!intel_uc_fw_is_loaded(&huc->fw))
return -ENOEXEC;

/* GSC will do the auth */
if (intel_huc_is_loaded_by_gsc(huc))
return -ENODEV;

ret = i915_inject_probe_error(gt->i915, -ENXIO);
if (ret)
goto fail;

ret = intel_guc_auth_huc(guc,
intel_guc_ggtt_offset(guc, huc->fw.rsa_data));
GEM_BUG_ON(intel_uc_fw_is_running(&huc->fw));

ret = intel_guc_auth_huc(guc, intel_guc_ggtt_offset(guc, huc->fw.rsa_data));
if (ret) {
DRM_ERROR("HuC: GuC did not ack Auth request %d\n", ret);
goto fail;

@@ -133,6 +181,18 @@ fail:
return ret;
}

static bool huc_is_authenticated(struct intel_huc *huc)
{
struct intel_gt *gt = huc_to_gt(huc);
intel_wakeref_t wakeref;
u32 status = 0;

with_intel_runtime_pm(gt->uncore->rpm, wakeref)
status = intel_uncore_read(gt->uncore, huc->status.reg);

return (status & huc->status.mask) == huc->status.value;
}

/**
* intel_huc_check_status() - check HuC status
* @huc: intel_huc structure

@@ -150,10 +210,6 @@ fail:
*/
int intel_huc_check_status(struct intel_huc *huc)
{
struct intel_gt *gt = huc_to_gt(huc);
intel_wakeref_t wakeref;
u32 status = 0;

switch (__intel_uc_fw_status(&huc->fw)) {
case INTEL_UC_FIRMWARE_NOT_SUPPORTED:
return -ENODEV;

@@ -167,10 +223,17 @@ int intel_huc_check_status(struct intel_huc *huc)
break;
}

with_intel_runtime_pm(gt->uncore->rpm, wakeref)
status = intel_uncore_read(gt->uncore, huc->status.reg);
return huc_is_authenticated(huc);
}

return (status & huc->status.mask) == huc->status.value;
void intel_huc_update_auth_status(struct intel_huc *huc)
{
if (!intel_uc_fw_is_loadable(&huc->fw))
return;

if (huc_is_authenticated(huc))
intel_uc_fw_change_status(&huc->fw,
INTEL_UC_FIRMWARE_RUNNING);
}

/**
@@ -27,6 +27,7 @@ int intel_huc_init(struct intel_huc *huc);
void intel_huc_fini(struct intel_huc *huc);
int intel_huc_auth(struct intel_huc *huc);
int intel_huc_check_status(struct intel_huc *huc);
void intel_huc_update_auth_status(struct intel_huc *huc);

static inline int intel_huc_sanitize(struct intel_huc *huc)
{

@@ -50,9 +51,9 @@ static inline bool intel_huc_is_used(struct intel_huc *huc)
return intel_uc_fw_is_available(&huc->fw);
}

static inline bool intel_huc_is_authenticated(struct intel_huc *huc)
static inline bool intel_huc_is_loaded_by_gsc(const struct intel_huc *huc)
{
return intel_uc_fw_is_running(&huc->fw);
return huc->fw.loaded_via_gsc;
}

void intel_huc_load_status(struct intel_huc *huc, struct drm_printer *p);
@@ -8,7 +8,7 @@
#include "i915_drv.h"

/**
* intel_huc_fw_upload() - load HuC uCode to device
* intel_huc_fw_upload() - load HuC uCode to device via DMA transfer
* @huc: intel_huc structure
*
* Called from intel_uc_init_hw() during driver load, resume from sleep and

@@ -21,6 +21,9 @@
*/
int intel_huc_fw_upload(struct intel_huc *huc)
{
if (intel_huc_is_loaded_by_gsc(huc))
return -ENODEV;

/* HW doesn't look at destination address for HuC, so set it to 0 */
return intel_uc_fw_upload(&huc->fw, 0, HUC_UKERNEL);
}
@@ -45,6 +45,10 @@ static void uc_expand_default_options(struct intel_uc *uc)

/* Default: enable HuC authentication and GuC submission */
i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | ENABLE_GUC_SUBMISSION;

/* XEHPSDV and PVC do not use HuC */
if (IS_XEHPSDV(i915) || IS_PONTEVECCHIO(i915))
i915->params.enable_guc &= ~ENABLE_GUC_LOAD_HUC;
}

/* Reset GuC providing us with fresh state for both GuC and HuC.

@@ -323,17 +327,10 @@ static int __uc_init(struct intel_uc *uc)
if (ret)
return ret;

if (intel_uc_uses_huc(uc)) {
ret = intel_huc_init(huc);
if (ret)
goto out_guc;
}
if (intel_uc_uses_huc(uc))
intel_huc_init(huc);

return 0;

out_guc:
intel_guc_fini(guc);
return ret;
}

static void __uc_fini(struct intel_uc *uc)

@@ -509,7 +506,16 @@ static int __uc_init_hw(struct intel_uc *uc)
if (ret)
goto err_log_capture;

intel_huc_auth(huc);
/*
* GSC-loaded HuC is authenticated by the GSC, so we don't need to
* trigger the auth here. However, given that the HuC loaded this way
* survive GT reset, we still need to update our SW bookkeeping to make
* sure it reflects the correct HW status.
*/
if (intel_huc_is_loaded_by_gsc(huc))
intel_huc_update_auth_status(huc);
else
intel_huc_auth(huc);

if (intel_uc_uses_guc_submission(uc))
intel_guc_submission_enable(guc);
@@ -156,7 +156,7 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct intel_uc_fw *uc_fw)
[INTEL_UC_FW_TYPE_GUC] = { blobs_guc, ARRAY_SIZE(blobs_guc) },
[INTEL_UC_FW_TYPE_HUC] = { blobs_huc, ARRAY_SIZE(blobs_huc) },
};
static const struct uc_fw_platform_requirement *fw_blobs;
const struct uc_fw_platform_requirement *fw_blobs;
enum intel_platform p = INTEL_INFO(i915)->platform;
u32 fw_count;
u8 rev = INTEL_REVID(i915);

@@ -301,6 +301,82 @@ static void __force_fw_fetch_failures(struct intel_uc_fw *uc_fw, int e)
}
}

static int check_gsc_manifest(const struct firmware *fw,
struct intel_uc_fw *uc_fw)
{
u32 *dw = (u32 *)fw->data;
u32 version = dw[HUC_GSC_VERSION_DW];

uc_fw->major_ver_found = FIELD_GET(HUC_GSC_MAJOR_VER_MASK, version);
uc_fw->minor_ver_found = FIELD_GET(HUC_GSC_MINOR_VER_MASK, version);

return 0;
}

static int check_ccs_header(struct drm_i915_private *i915,
const struct firmware *fw,
struct intel_uc_fw *uc_fw)
{
struct uc_css_header *css;
size_t size;

/* Check the size of the blob before examining buffer contents */
if (unlikely(fw->size < sizeof(struct uc_css_header))) {
drm_warn(&i915->drm, "%s firmware %s: invalid size: %zu < %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
fw->size, sizeof(struct uc_css_header));
return -ENODATA;
}

css = (struct uc_css_header *)fw->data;

/* Check integrity of size values inside CSS header */
size = (css->header_size_dw - css->key_size_dw - css->modulus_size_dw -
css->exponent_size_dw) * sizeof(u32);
if (unlikely(size != sizeof(struct uc_css_header))) {
drm_warn(&i915->drm,
"%s firmware %s: unexpected header size: %zu != %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
fw->size, sizeof(struct uc_css_header));
return -EPROTO;
}

/* uCode size must calculated from other sizes */
uc_fw->ucode_size = (css->size_dw - css->header_size_dw) * sizeof(u32);

/* now RSA */
uc_fw->rsa_size = css->key_size_dw * sizeof(u32);

/* At least, it should have header, uCode and RSA. Size of all three. */
size = sizeof(struct uc_css_header) + uc_fw->ucode_size + uc_fw->rsa_size;
if (unlikely(fw->size < size)) {
drm_warn(&i915->drm, "%s firmware %s: invalid size: %zu < %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
fw->size, size);
return -ENOEXEC;
}

/* Sanity check whether this fw is not larger than whole WOPCM memory */
size = __intel_uc_fw_get_upload_size(uc_fw);
if (unlikely(size >= i915->wopcm.size)) {
drm_warn(&i915->drm, "%s firmware %s: invalid size: %zu > %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
size, (size_t)i915->wopcm.size);
return -E2BIG;
}

/* Get version numbers from the CSS header */
uc_fw->major_ver_found = FIELD_GET(CSS_SW_VERSION_UC_MAJOR,
css->sw_version);
uc_fw->minor_ver_found = FIELD_GET(CSS_SW_VERSION_UC_MINOR,
css->sw_version);

if (uc_fw->type == INTEL_UC_FW_TYPE_GUC)
uc_fw->private_data_size = css->private_data_size;

return 0;
}

/**
* intel_uc_fw_fetch - fetch uC firmware
* @uc_fw: uC firmware

@@ -315,8 +391,6 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
struct device *dev = i915->drm.dev;
struct drm_i915_gem_object *obj;
const struct firmware *fw = NULL;
struct uc_css_header *css;
size_t size;
int err;

GEM_BUG_ON(!i915->wopcm.size);

@@ -333,60 +407,12 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
if (err)
goto fail;

/* Check the size of the blob before examining buffer contents */
if (unlikely(fw->size < sizeof(struct uc_css_header))) {
drm_warn(&i915->drm, "%s firmware %s: invalid size: %zu < %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
fw->size, sizeof(struct uc_css_header));
err = -ENODATA;
if (uc_fw->loaded_via_gsc)
err = check_gsc_manifest(fw, uc_fw);
else
err = check_ccs_header(i915, fw, uc_fw);
if (err)
goto fail;
}

css = (struct uc_css_header *)fw->data;

/* Check integrity of size values inside CSS header */
size = (css->header_size_dw - css->key_size_dw - css->modulus_size_dw -
css->exponent_size_dw) * sizeof(u32);
if (unlikely(size != sizeof(struct uc_css_header))) {
drm_warn(&i915->drm,
"%s firmware %s: unexpected header size: %zu != %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
fw->size, sizeof(struct uc_css_header));
err = -EPROTO;
goto fail;
}

/* uCode size must calculated from other sizes */
uc_fw->ucode_size = (css->size_dw - css->header_size_dw) * sizeof(u32);

/* now RSA */
uc_fw->rsa_size = css->key_size_dw * sizeof(u32);

/* At least, it should have header, uCode and RSA. Size of all three. */
size = sizeof(struct uc_css_header) + uc_fw->ucode_size + uc_fw->rsa_size;
if (unlikely(fw->size < size)) {
drm_warn(&i915->drm, "%s firmware %s: invalid size: %zu < %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
fw->size, size);
err = -ENOEXEC;
goto fail;
}

/* Sanity check whether this fw is not larger than whole WOPCM memory */
size = __intel_uc_fw_get_upload_size(uc_fw);
if (unlikely(size >= i915->wopcm.size)) {
drm_warn(&i915->drm, "%s firmware %s: invalid size: %zu > %zu\n",
intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
size, (size_t)i915->wopcm.size);
err = -E2BIG;
goto fail;
}

/* Get version numbers from the CSS header */
uc_fw->major_ver_found = FIELD_GET(CSS_SW_VERSION_UC_MAJOR,
css->sw_version);
uc_fw->minor_ver_found = FIELD_GET(CSS_SW_VERSION_UC_MINOR,
css->sw_version);

if (uc_fw->major_ver_found != uc_fw->major_ver_wanted ||
uc_fw->minor_ver_found < uc_fw->minor_ver_wanted) {

@@ -400,9 +426,6 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
}
}

if (uc_fw->type == INTEL_UC_FW_TYPE_GUC)
uc_fw->private_data_size = css->private_data_size;

if (HAS_LMEM(i915)) {
obj = i915_gem_object_create_lmem_from_data(i915, fw->data, fw->size);
if (!IS_ERR(obj))

@@ -470,7 +493,10 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
if (i915_gem_object_is_lmem(obj))
pte_flags |= PTE_LM;

ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
if (ggtt->vm.raw_insert_entries)
ggtt->vm.raw_insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
else
ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
}

static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
@@ -102,6 +102,8 @@ struct intel_uc_fw {
u32 ucode_size;

u32 private_data_size;

bool loaded_via_gsc;
};

#ifdef CONFIG_DRM_I915_DEBUG_GUC
@@ -39,6 +39,11 @@
* 3. Length info of each component can be found in header, in dwords.
* 4. Modulus and exponent key are not required by driver. They may not appear
* in fw. So driver will load a truncated firmware in this case.
*
* Starting from DG2, the HuC is loaded by the GSC instead of i915. The GSC
* firmware performs all the required integrity checks, we just need to check
* the version. Note that the header for GSC-managed blobs is different from the
* CSS used for dma-loaded firmwares.
*/

struct uc_css_header {

@@ -78,4 +83,8 @@ struct uc_css_header {
} __packed;
static_assert(sizeof(struct uc_css_header) == 128);

#define HUC_GSC_VERSION_DW 44
#define HUC_GSC_MAJOR_VER_MASK (0xFF << 0)
#define HUC_GSC_MINOR_VER_MASK (0xFF << 16)

#endif /* _INTEL_UC_FW_ABI_H */
@@ -428,7 +428,7 @@ struct cmd_info {
#define R_VECS BIT(VECS0)
#define R_ALL (R_RCS | R_VCS | R_BCS | R_VECS)
/* rings that support this cmd: BLT/RCS/VCS/VECS */
u16 rings;
intel_engine_mask_t rings;

/* devices that support this cmd: SNB/IVB/HSW/... */
u16 devices;
@@ -100,6 +100,9 @@
#include "intel_region_ttm.h"
#include "vlv_suspend.h"

/* Intel Rapid Start Technology ACPI device name */
static const char irst_name[] = "INT3392";

static const struct drm_driver i915_drm_driver;

static int i915_get_bridge_dev(struct drm_i915_private *dev_priv)

@@ -520,6 +523,22 @@ mask_err:
return ret;
}

static int i915_pcode_init(struct drm_i915_private *i915)
{
struct intel_gt *gt;
int id, ret;

for_each_gt(gt, i915, id) {
ret = intel_pcode_init(gt->uncore);
if (ret) {
drm_err(&gt->i915->drm, "gt%d: intel_pcode_init failed %d\n", id, ret);
return ret;
}
}

return 0;
}

/**
* i915_driver_hw_probe - setup state requiring device access
* @dev_priv: device private

@@ -629,7 +648,7 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)

intel_opregion_setup(dev_priv);

ret = intel_pcode_init(&dev_priv->uncore);
ret = i915_pcode_init(dev_priv);
if (ret)
goto err_msi;

@@ -1251,7 +1270,7 @@ static int i915_drm_resume(struct drm_device *dev)

disable_rpm_wakeref_asserts(&dev_priv->runtime_pm);

ret = intel_pcode_init(&dev_priv->uncore);
ret = i915_pcode_init(dev_priv);
if (ret)
return ret;

@@ -1425,6 +1444,8 @@ static int i915_pm_suspend(struct device *kdev)
return -ENODEV;
}

i915_ggtt_mark_pte_lost(i915, false);

if (i915->drm.switch_power_state == DRM_SWITCH_POWER_OFF)
return 0;

@@ -1477,6 +1498,14 @@ static int i915_pm_resume(struct device *kdev)
if (i915->drm.switch_power_state == DRM_SWITCH_POWER_OFF)
return 0;

/*
* If IRST is enabled, or if we can't detect whether it's enabled,
* then we must assume we lost the GGTT page table entries, since
* they are not retained if IRST decided to enter S4.
*/
if (!IS_ENABLED(CONFIG_ACPI) || acpi_dev_present(irst_name, NULL, -1))
i915_ggtt_mark_pte_lost(i915, true);

return i915_drm_resume(&i915->drm);
}

@@ -1536,6 +1565,9 @@ static int i915_pm_restore_early(struct device *kdev)

static int i915_pm_restore(struct device *kdev)
{
struct drm_i915_private *i915 = kdev_to_i915(kdev);

i915_ggtt_mark_pte_lost(i915, true);
return i915_pm_resume(kdev);
}
@@ -116,8 +116,9 @@ show_client_class(struct seq_file *m,
total += busy_add(ctx, class);
rcu_read_unlock();

seq_printf(m, "drm-engine-%s:\t%llu ns\n",
uabi_class_names[class], total);
if (capacity)
seq_printf(m, "drm-engine-%s:\t%llu ns\n",
uabi_class_names[class], total);

if (capacity > 1)
seq_printf(m, "drm-engine-capacity-%s:\t%u\n",
@@ -11,7 +11,7 @@
#include <linux/spinlock.h>
#include <linux/xarray.h>

#include "gt/intel_engine_types.h"
#include <uapi/drm/i915_drm.h>

#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_COMPUTE
@@ -879,6 +879,7 @@ static inline struct intel_gt *to_gt(struct drm_i915_private *i915)
#define INTEL_DISPLAY_STEP(__i915) (RUNTIME_INFO(__i915)->step.display_step)
#define INTEL_GRAPHICS_STEP(__i915) (RUNTIME_INFO(__i915)->step.graphics_step)
#define INTEL_MEDIA_STEP(__i915) (RUNTIME_INFO(__i915)->step.media_step)
#define INTEL_BASEDIE_STEP(__i915) (RUNTIME_INFO(__i915)->step.basedie_step)

#define IS_DISPLAY_STEP(__i915, since, until) \
(drm_WARN_ON(&(__i915)->drm, INTEL_DISPLAY_STEP(__i915) == STEP_NONE), \

@@ -892,6 +893,10 @@ static inline struct intel_gt *to_gt(struct drm_i915_private *i915)
(drm_WARN_ON(&(__i915)->drm, INTEL_MEDIA_STEP(__i915) == STEP_NONE), \
INTEL_MEDIA_STEP(__i915) >= (since) && INTEL_MEDIA_STEP(__i915) < (until))

#define IS_BASEDIE_STEP(__i915, since, until) \
(drm_WARN_ON(&(__i915)->drm, INTEL_BASEDIE_STEP(__i915) == STEP_NONE), \
INTEL_BASEDIE_STEP(__i915) >= (since) && INTEL_BASEDIE_STEP(__i915) < (until))

static __always_inline unsigned int
__platform_mask_index(const struct intel_runtime_info *info,
enum intel_platform p)

@@ -1144,6 +1149,14 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
(IS_DG2(__i915) && \
IS_DISPLAY_STEP(__i915, since, until))

#define IS_PVC_BD_STEP(__i915, since, until) \
(IS_PONTEVECCHIO(__i915) && \
IS_BASEDIE_STEP(__i915, since, until))

#define IS_PVC_CT_STEP(__i915, since, until) \
(IS_PONTEVECCHIO(__i915) && \
IS_GRAPHICS_STEP(__i915, since, until))

#define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp)
#define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv))
#define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv))

@@ -1159,6 +1172,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
})
#define RCS_MASK(gt) \
ENGINE_INSTANCES_MASK(gt, RCS0, I915_MAX_RCS)
#define BCS_MASK(gt) \
ENGINE_INSTANCES_MASK(gt, BCS0, I915_MAX_BCS)
#define VDBOX_MASK(gt) \
ENGINE_INSTANCES_MASK(gt, VCS0, I915_MAX_VCS)
#define VEBOX_MASK(gt) \

@@ -1267,9 +1282,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
#define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm)
#define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc)

#define HAS_MSLICES(dev_priv) \
(INTEL_INFO(dev_priv)->has_mslices)

/*
* Set this flag, when platform requires 64K GTT page sizes or larger for
* device local memory access.

@@ -1308,6 +1320,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,

#define HAS_LSPCON(dev_priv) (IS_DISPLAY_VER(dev_priv, 9, 10))

#define HAS_L3_CCS_READ(i915) (INTEL_INFO(i915)->has_l3_ccs_read)

/* DPF == dynamic parity feature */
#define HAS_L3_DPF(dev_priv) (INTEL_INFO(dev_priv)->has_l3_dpf)
#define NUM_L3_SLICES(dev_priv) (IS_HSW_GT3(dev_priv) ? \

@@ -1341,6 +1355,10 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,

#define HAS_MBUS_JOINING(i915) (IS_ALDERLAKE_P(i915))

#define HAS_3D_PIPELINE(i915) (INTEL_INFO(i915)->has_3d_pipeline)

#define HAS_ONE_EU_PER_FUSE_BIT(i915) (INTEL_INFO(i915)->has_one_eu_per_fuse_bit)

/* i915_gem.c */
void i915_gem_init_early(struct drm_i915_private *dev_priv);
void i915_gem_cleanup_early(struct drm_i915_private *dev_priv);
@@ -148,14 +148,21 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
value = intel_engines_has_context_isolation(i915);
break;
case I915_PARAM_SLICE_MASK:
/* Not supported from Xe_HP onward; use topology queries */
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
return -EINVAL;

value = sseu->slice_mask;
if (!value)
return -ENODEV;
break;
case I915_PARAM_SUBSLICE_MASK:
/* Not supported from Xe_HP onward; use topology queries */
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
return -EINVAL;

/* Only copy bits from the first slice */
memcpy(&value, sseu->subslice_mask,
min(sseu->ss_stride, (u8)sizeof(value)));
value = intel_sseu_get_hsw_subslices(sseu, 0);
if (!value)
return -ENODEV;
break;
@@ -581,6 +581,15 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
err_printf(m, " RC PSMI: 0x%08x\n", ee->rc_psmi);
err_printf(m, " FAULT_REG: 0x%08x\n", ee->fault_reg);
}
if (GRAPHICS_VER(m->i915) >= 11) {
err_printf(m, " NOPID: 0x%08x\n", ee->nopid);
err_printf(m, " EXCC: 0x%08x\n", ee->excc);
err_printf(m, " CMD_CCTL: 0x%08x\n", ee->cmd_cctl);
err_printf(m, " CSCMDOP: 0x%08x\n", ee->cscmdop);
err_printf(m, " CTX_SR_CTL: 0x%08x\n", ee->ctx_sr_ctl);
err_printf(m, " DMA_FADDR_HI: 0x%08x\n", ee->dma_faddr_hi);
err_printf(m, " DMA_FADDR_LO: 0x%08x\n", ee->dma_faddr_lo);
}
if (HAS_PPGTT(m->i915)) {
err_printf(m, " GFX_MODE: 0x%08x\n", ee->vm_info.gfx_mode);

@@ -1095,8 +1104,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,

for_each_sgt_daddr(dma, iter, vma_res->bi.pages) {
mutex_lock(&ggtt->error_mutex);
ggtt->vm.insert_page(&ggtt->vm, dma, slot,
I915_CACHE_NONE, 0);
if (ggtt->vm.raw_insert_page)
ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot,
I915_CACHE_NONE, 0);
else
ggtt->vm.insert_page(&ggtt->vm, dma, slot,
I915_CACHE_NONE, 0);
mb();

s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);

@@ -1224,6 +1237,16 @@ static void engine_record_registers(struct intel_engine_coredump *ee)
ee->ipehr = ENGINE_READ(engine, IPEHR);
}

if (GRAPHICS_VER(i915) >= 11) {
ee->cmd_cctl = ENGINE_READ(engine, RING_CMD_CCTL);
ee->cscmdop = ENGINE_READ(engine, RING_CSCMDOP);
ee->ctx_sr_ctl = ENGINE_READ(engine, RING_CTX_SR_CTL);
ee->dma_faddr_hi = ENGINE_READ(engine, RING_DMA_FADD_UDW);
ee->dma_faddr_lo = ENGINE_READ(engine, RING_DMA_FADD);
ee->nopid = ENGINE_READ(engine, RING_NOPID);
ee->excc = ENGINE_READ(engine, RING_EXCC);
}

intel_engine_get_instdone(engine, &ee->instdone);

ee->instpm = ENGINE_READ(engine, RING_INSTPM);
@@ -84,6 +84,13 @@ struct intel_engine_coredump {
u32 fault_reg;
u64 faddr;
u32 rc_psmi; /* sleep state */
u32 nopid;
u32 excc;
u32 cmd_cctl;
u32 cscmdop;
u32 ctx_sr_ctl;
u32 dma_faddr_hi;
u32 dma_faddr_lo;
struct intel_instdone instdone;

/* GuC matched capture-lists info */
@@ -171,6 +171,7 @@
.display.overlay_needs_physical = 1, \
.display.has_gmch = 1, \
.gpu_reset_clobbers_display = true, \
.has_3d_pipeline = 1, \
.hws_needs_physical = 1, \
.unfenced_needs_alignment = 1, \
.platform_engine_mask = BIT(RCS0), \

@@ -190,6 +191,7 @@
.display.has_overlay = 1, \
.display.overlay_needs_physical = 1, \
.display.has_gmch = 1, \
.has_3d_pipeline = 1, \
.gpu_reset_clobbers_display = true, \
.hws_needs_physical = 1, \
.unfenced_needs_alignment = 1, \

@@ -232,6 +234,7 @@ static const struct intel_device_info i865g_info = {
.display.has_gmch = 1, \
.gpu_reset_clobbers_display = true, \
.platform_engine_mask = BIT(RCS0), \
.has_3d_pipeline = 1, \
.has_snoop = true, \
.has_coherent_ggtt = true, \
.dma_mask_size = 32, \

@@ -323,6 +326,7 @@ static const struct intel_device_info pnv_m_info = {
.display.has_gmch = 1, \
.gpu_reset_clobbers_display = true, \
.platform_engine_mask = BIT(RCS0), \
.has_3d_pipeline = 1, \
.has_snoop = true, \
.has_coherent_ggtt = true, \
.dma_mask_size = 36, \

@@ -374,6 +378,7 @@ static const struct intel_device_info gm45_info = {
.display.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B), \
.display.has_hotplug = 1, \
.platform_engine_mask = BIT(RCS0) | BIT(VCS0), \
.has_3d_pipeline = 1, \
.has_snoop = true, \
.has_coherent_ggtt = true, \
/* ilk does support rc6, but we do not implement [power] contexts */ \

@@ -405,6 +410,7 @@ static const struct intel_device_info ilk_m_info = {
.display.has_hotplug = 1, \
.display.fbc_mask = BIT(INTEL_FBC_A), \
.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0), \
.has_3d_pipeline = 1, \
.has_coherent_ggtt = true, \
.has_llc = 1, \
.has_rc6 = 1, \

@@ -456,6 +462,7 @@ static const struct intel_device_info snb_m_gt2_info = {
.display.has_hotplug = 1, \
.display.fbc_mask = BIT(INTEL_FBC_A), \
.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0), \
.has_3d_pipeline = 1, \
.has_coherent_ggtt = true, \
.has_llc = 1, \
.has_rc6 = 1, \

@@ -692,6 +699,7 @@ static const struct intel_device_info skl_gt4_info = {
.display.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) | \
BIT(TRANSCODER_DSI_A) | BIT(TRANSCODER_DSI_C), \
.has_3d_pipeline = 1, \
.has_64bit_reloc = 1, \
.display.has_ddi = 1, \
.display.has_fpga_dbg = 1, \

@@ -1005,6 +1013,7 @@ static const struct intel_device_info adl_p_info = {
.graphics.rel = 50, \
XE_HP_PAGE_SIZES, \
.dma_mask_size = 46, \
.has_3d_pipeline = 1, \
.has_64bit_reloc = 1, \
.has_flat_ccs = 1, \
.has_global_mocs = 1, \

@@ -1012,7 +1021,7 @@ static const struct intel_device_info adl_p_info = {
.has_llc = 1, \
.has_logical_ring_contexts = 1, \
.has_logical_ring_elsq = 1, \
.has_mslices = 1, \
.has_mslice_steering = 1, \
.has_rc6 = 1, \
.has_reset_engine = 1, \
.has_rps = 1, \

@@ -1079,7 +1088,12 @@ static const struct intel_device_info ats_m_info = {

#define XE_HPC_FEATURES \
XE_HP_FEATURES, \
.dma_mask_size = 52
.dma_mask_size = 52, \
.has_3d_pipeline = 0, \
.has_guc_deprivilege = 1, \
.has_l3_ccs_read = 1, \
.has_mslice_steering = 0, \
.has_one_eu_per_fuse_bit = 1

__maybe_unused
static const struct intel_device_info pvc_info = {
@@ -31,10 +31,12 @@ static int copy_query_item(void *query_hdr, size_t query_sz,

static int fill_topology_info(const struct sseu_dev_info *sseu,
struct drm_i915_query_item *query_item,
const u8 *subslice_mask)
intel_sseu_ss_mask_t subslice_mask)
{
struct drm_i915_query_topology_info topo;
u32 slice_length, subslice_length, eu_length, total_length;
int ss_stride = GEN_SSEU_STRIDE(sseu->max_subslices);
int eu_stride = GEN_SSEU_STRIDE(sseu->max_eus_per_subslice);
int ret;

BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));

@@ -43,8 +45,8 @@ static int fill_topology_info(const struct sseu_dev_info *sseu,
return -ENODEV;

slice_length = sizeof(sseu->slice_mask);
subslice_length = sseu->max_slices * sseu->ss_stride;
eu_length = sseu->max_slices * sseu->max_subslices * sseu->eu_stride;
subslice_length = sseu->max_slices * ss_stride;
eu_length = sseu->max_slices * sseu->max_subslices * eu_stride;
total_length = sizeof(topo) + slice_length + subslice_length +
eu_length;

@@ -59,9 +61,9 @@ static int fill_topology_info(const struct sseu_dev_info *sseu,
topo.max_eus_per_subslice = sseu->max_eus_per_subslice;

topo.subslice_offset = slice_length;
topo.subslice_stride = sseu->ss_stride;
topo.subslice_stride = ss_stride;
topo.eu_offset = slice_length + subslice_length;
topo.eu_stride = sseu->eu_stride;
topo.eu_stride = eu_stride;

if (copy_to_user(u64_to_user_ptr(query_item->data_ptr),
&topo, sizeof(topo)))

@@ -71,15 +73,15 @@ static int fill_topology_info(const struct sseu_dev_info *sseu,
&sseu->slice_mask, slice_length))
return -EFAULT;

if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
sizeof(topo) + slice_length),
subslice_mask, subslice_length))
if (intel_sseu_copy_ssmask_to_user(u64_to_user_ptr(query_item->data_ptr +
sizeof(topo) + slice_length),
sseu))
return -EFAULT;

if (copy_to_user(u64_to_user_ptr(query_item->data_ptr +
sizeof(topo) +
slice_length + subslice_length),
sseu->eu_mask, eu_length))
if (intel_sseu_copy_eumask_to_user(u64_to_user_ptr(query_item->data_ptr +
sizeof(topo) +
slice_length + subslice_length),
sseu))
return -EFAULT;

return total_length;
@@ -976,6 +976,14 @@
#define GEN12_COMPUTE2_RING_BASE 0x1e000
#define GEN12_COMPUTE3_RING_BASE 0x26000
#define BLT_RING_BASE 0x22000
#define XEHPC_BCS1_RING_BASE 0x3e0000
#define XEHPC_BCS2_RING_BASE 0x3e2000
#define XEHPC_BCS3_RING_BASE 0x3e4000
#define XEHPC_BCS4_RING_BASE 0x3e6000
#define XEHPC_BCS5_RING_BASE 0x3e8000
#define XEHPC_BCS6_RING_BASE 0x3ea000
#define XEHPC_BCS7_RING_BASE 0x3ec000
#define XEHPC_BCS8_RING_BASE 0x3ee000
#define DG1_GSC_HECI1_BASE 0x00258000
#define DG1_GSC_HECI2_BASE 0x00259000
#define DG2_GSC_HECI1_BASE 0x00373000

@@ -1846,6 +1854,7 @@
#define BXT_RP_STATE_CAP _MMIO(0x138170)
#define GEN9_RP_STATE_LIMITS _MMIO(0x138148)
#define XEHPSDV_RP_STATE_CAP _MMIO(0x250014)
#define PVC_RP_STATE_CAP _MMIO(0x281014)

#define GT0_PERF_LIMIT_REASONS _MMIO(0x1381a8)
#define GT0_PERF_LIMIT_REASONS_MASK 0xde3

@@ -6758,6 +6767,14 @@
#define DG1_UNCORE_GET_INIT_STATUS 0x0
#define DG1_UNCORE_INIT_STATUS_COMPLETE 0x1
#define GEN12_PCODE_READ_SAGV_BLOCK_TIME_US 0x23
#define XEHP_PCODE_FREQUENCY_CONFIG 0x6e /* xehpsdv, pvc */
/* XEHP_PCODE_FREQUENCY_CONFIG sub-commands (param1) */
#define PCODE_MBOX_FC_SC_READ_FUSED_P0 0x0
#define PCODE_MBOX_FC_SC_READ_FUSED_PN 0x1
/* PCODE_MBOX_DOMAIN_* - mailbox domain IDs */
/* XEHP_PCODE_FREQUENCY_CONFIG param2 */
#define PCODE_MBOX_DOMAIN_NONE 0x0
#define PCODE_MBOX_DOMAIN_MEDIAFF 0x3
#define GEN6_PCODE_DATA _MMIO(0x138128)
#define GEN6_PCODE_FREQ_IA_RATIO_SHIFT 8
#define GEN6_PCODE_FREQ_RING_RATIO_SHIFT 16

@@ -8328,23 +8345,6 @@ enum skl_power_gate {
#define SGGI_DIS REG_BIT(15)
#define SGR_DIS REG_BIT(13)

#define XEHPSDV_TILE0_ADDR_RANGE _MMIO(0x4900)
#define XEHPSDV_TILE_LMEM_RANGE_SHIFT 8

#define XEHPSDV_FLAT_CCS_BASE_ADDR _MMIO(0x4910)
#define XEHPSDV_CCS_BASE_SHIFT 8

/* gamt regs */
#define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
#define GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW 0x67F1427F /* max/min for LRA1/2 */
#define GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_CHV 0x5FF101FF /* max/min for LRA1/2 */
#define GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_SKL 0x67F1427F /* " " */
#define GEN9_L3_LRA_1_GPGPU_DEFAULT_VALUE_BXT 0x5FF101FF /* " " */

#define MMCD_MISC_CTRL _MMIO(0x4ddc) /* skl+ */
#define MMCD_PCLA (1 << 31)
#define MMCD_HOTSPOT_EN (1 << 27)

#define _ICL_PHY_MISC_A 0x64C00
#define _ICL_PHY_MISC_B 0x64C04
#define _DG2_PHY_MISC_TC1 0x64C14 /* TC1="PHY E" but offset as if "PHY F" */
@ -60,7 +60,7 @@ static struct kmem_cache *slab_execute_cbs;
|
|||
|
||||
static const char *i915_fence_get_driver_name(struct dma_fence *fence)
|
||||
{
|
||||
return dev_name(to_request(fence)->engine->i915->drm.dev);
|
||||
return dev_name(to_request(fence)->i915->drm.dev);
}
static const char *i915_fence_get_timeline_name(struct dma_fence *fence)
@@ -134,17 +134,42 @@ static void i915_fence_release(struct dma_fence *fence)
i915_sw_fence_fini(&rq->semaphore);
/*
* Keep one request on each engine for reserved use under mempressure,
* Keep one request on each engine for reserved use under mempressure
* do not use with virtual engines as this really is only needed for
* kernel contexts.
*
* We do not hold a reference to the engine here and so have to be
* very careful in what rq->engine we poke. The virtual engine is
* referenced via the rq->context and we released that ref during
* i915_request_retire(), ergo we must not dereference a virtual
* engine here. Not that we would want to, as the only consumer of
* the reserved engine->request_pool is the power management parking,
* which must-not-fail, and that is only run on the physical engines.
*
* Since the request must have been executed to be have completed,
* we know that it will have been processed by the HW and will
* not be unsubmitted again, so rq->engine and rq->execution_mask
* at this point is stable. rq->execution_mask will be a single
* bit if the last and _only_ engine it could execution on was a
* physical engine, if it's multiple bits then it started on and
* could still be on a virtual engine. Thus if the mask is not a
* power-of-two we assume that rq->engine may still be a virtual
* engine and so a dangling invalid pointer that we cannot dereference
*
* For example, consider the flow of a bonded request through a virtual
* engine. The request is created with a wide engine mask (all engines
* that we might execute on). On processing the bond, the request mask
* is reduced to one or more engines. If the request is subsequently
* bound to a single engine, it will then be constrained to only
* execute on that engine and never returned to the virtual engine
* after timeslicing away, see __unwind_incomplete_requests(). Thus we
* know that if the rq->execution_mask is a single bit, rq->engine
* can be a physical engine with the exact corresponding mask.
*/
if (!intel_engine_is_virtual(rq->engine) &&
!cmpxchg(&rq->engine->request_pool, NULL, rq)) {
intel_context_put(rq->context);
is_power_of_2(rq->execution_mask) &&
!cmpxchg(&rq->engine->request_pool, NULL, rq))
return;
}
intel_context_put(rq->context);
kmem_cache_free(slab_requests, rq);
}
@@ -611,7 +636,7 @@ bool __i915_request_submit(struct i915_request *request)
goto active;
}
if (unlikely(intel_context_is_banned(request->context)))
if (unlikely(!intel_context_is_schedulable(request->context)))
i915_request_set_error_once(request, -EIO);
if (unlikely(fatal_error(request->fence.error)))
@@ -921,22 +946,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
}
}
/*
* Hold a reference to the intel_context over life of an i915_request.
* Without this an i915_request can exist after the context has been
* destroyed (e.g. request retired, context closed, but user space holds
* a reference to the request from an out fence). In the case of GuC
* submission + virtual engine, the engine that the request references
* is also destroyed which can trigger bad pointer dref in fence ops
* (e.g. i915_fence_get_driver_name). We could likely change these
* functions to avoid touching the engine but let's just be safe and
* hold the intel_context reference. In execlist mode the request always
* eventually points to a physical engine so this isn't an issue.
*/
rq->context = intel_context_get(ce);
rq->context = ce;
rq->engine = ce->engine;
rq->ring = ce->ring;
rq->execution_mask = ce->engine->mask;
rq->i915 = ce->engine->i915;
ret = intel_timeline_get_seqno(tl, rq, &seqno);
if (ret)
@@ -1008,7 +1022,6 @@ err_unwind:
GEM_BUG_ON(!list_empty(&rq->sched.waiters_list));
err_free:
intel_context_put(ce);
kmem_cache_free(slab_requests, rq);
err_unreserve:
intel_context_unpin(ce);

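For illustration only (not part of the change): the hunk above relies on the fact that a single-bit execution_mask can only name one physical engine, while a multi-bit mask may still belong to a virtual engine. A minimal standalone sketch of that test, with made-up masks:

#include <stdbool.h>
#include <stdio.h>

/* Same predicate the kernel's is_power_of_2() provides. */
static bool is_power_of_2(unsigned long mask)
{
	return mask != 0 && (mask & (mask - 1)) == 0;
}

int main(void)
{
	/* hypothetical engine masks: bit N = physical engine N */
	unsigned long single = 0x4;	/* only engine 2: rq->engine is safe to poke */
	unsigned long multi  = 0x6;	/* engine 1 or 2: may still be a virtual engine */

	printf("0x%lx power-of-2: %d\n", single, is_power_of_2(single)); /* prints 1 */
	printf("0x%lx power-of-2: %d\n", multi, is_power_of_2(multi));   /* prints 0 */
	return 0;
}
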
@@ -196,6 +196,8 @@ struct i915_request {
struct dma_fence fence;
spinlock_t lock;
struct drm_i915_private *i915;
/**
* Context and ring buffer related to this request
* Contexts are refcounted, so when this request is associated with a

@@ -166,7 +166,14 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
struct device *kdev = kobj_to_dev(kobj);
struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
struct i915_gpu_coredump *gpu;
ssize_t ret;
ssize_t ret = 0;
/*
* FIXME: Concurrent clients triggering resets and reading + clearing
* dumps can cause inconsistent sysfs reads when a user calls in with a
* non-zero offset to complete a prior partial read but the
* gpu_coredump has been cleared or replaced.
*/
gpu = i915_first_error_state(i915);
if (IS_ERR(gpu)) {
@@ -178,8 +185,10 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
const char *str = "No error state collected\n";
size_t len = strlen(str);
ret = min_t(size_t, count, len - off);
memcpy(buf, str + off, ret);
if (off < len) {
ret = min_t(size_t, count, len - off);
memcpy(buf, str + off, ret);
}
}
return ret;
@@ -259,4 +268,6 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
device_remove_bin_file(kdev, &dpf_attrs_1);
device_remove_bin_file(kdev, &dpf_attrs);
kobject_put(dev_priv->sysfs_gt);
}

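For illustration only (not part of the change): the error_state_read fix guards the "no error state" string copy with off < len, so a follow-up read past the end of the string returns 0 (EOF) instead of computing len - off with off beyond len. A standalone userspace sketch of the fixed behaviour:

#include <stdio.h>
#include <string.h>

/* Mimics the fixed path: clamp the copy and report EOF once the read
 * offset has passed the end of the string. */
static long read_no_error_state(char *buf, size_t count, size_t off)
{
	const char *str = "No error state collected\n";
	size_t len = strlen(str);
	long ret = 0;

	if (off < len) {
		ret = count < len - off ? count : len - off;
		memcpy(buf, str + off, ret);
	}
	return ret;
}

int main(void)
{
	char buf[64];
	long n = read_no_error_state(buf, sizeof(buf), 0);

	printf("first read: %ld bytes\n", n);				/* 25 */
	printf("second read: %ld bytes\n",
	       read_no_error_state(buf, sizeof(buf), n));		/* 0 -> EOF */
	return 0;
}
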
@@ -23,6 +23,7 @@
*/
#include <linux/sched/mm.h>
#include <linux/dma-fence-array.h>
#include <drm/drm_gem.h>
#include "display/intel_frontbuffer.h"
@@ -550,13 +551,6 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
if (WARN_ON_ONCE(vma->obj->flags & I915_BO_ALLOC_GPU_ONLY))
return IOMEM_ERR_PTR(-EINVAL);
if (!i915_gem_object_is_lmem(vma->obj)) {
if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
err = -ENODEV;
goto err;
}
}
GEM_BUG_ON(!i915_vma_is_ggtt(vma));
GEM_BUG_ON(!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND));
GEM_BUG_ON(i915_vma_verify_bind_complete(vma));
@@ -569,20 +563,33 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
* of pages, that way we can also drop the
* I915_BO_ALLOC_CONTIGUOUS when allocating the object.
*/
if (i915_gem_object_is_lmem(vma->obj))
if (i915_gem_object_is_lmem(vma->obj)) {
ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
vma->obj->base.size);
else
} else if (i915_vma_is_map_and_fenceable(vma)) {
ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
vma->node.start,
vma->node.size);
} else {
ptr = (void __iomem *)
i915_gem_object_pin_map(vma->obj, I915_MAP_WC);
if (IS_ERR(ptr)) {
err = PTR_ERR(ptr);
goto err;
}
ptr = page_pack_bits(ptr, 1);
}
if (ptr == NULL) {
err = -ENOMEM;
goto err;
}
if (unlikely(cmpxchg(&vma->iomap, NULL, ptr))) {
io_mapping_unmap(ptr);
if (page_unmask_bits(ptr))
__i915_gem_object_release_map(vma->obj);
else
io_mapping_unmap(ptr);
ptr = vma->iomap;
}
}
@@ -596,7 +603,7 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
i915_vma_set_ggtt_write(vma);
/* NB Access through the GTT requires the device to be awake. */
return ptr;
return page_mask_bits(ptr);
err_unpin:
__i915_vma_unpin(vma);
@@ -614,6 +621,8 @@ void i915_vma_unpin_iomap(struct i915_vma *vma)
{
GEM_BUG_ON(vma->iomap == NULL);
/* XXX We keep the mapping until __i915_vma_unbind()/evict() */
i915_vma_flush_writes(vma);
i915_vma_unpin_fence(vma);
@@ -1762,7 +1771,10 @@ static void __i915_vma_iounmap(struct i915_vma *vma)
if (vma->iomap == NULL)
return;
io_mapping_unmap(vma->iomap);
if (page_unmask_bits(vma->iomap))
__i915_gem_object_release_map(vma->obj);
else
io_mapping_unmap(vma->iomap);
vma->iomap = NULL;
}

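For illustration only (not part of the change): the new i915_vma_pin_iomap() fallback tags the mapping pointer with a low bit (page_pack_bits(ptr, 1)) so the unmap paths above can tell an object pin_map mapping apart from an io_mapping. A standalone sketch of that pointer-tagging idea, not the kernel helpers themselves:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stash a small tag in the unused low bits of an aligned pointer, in the
 * spirit of page_pack_bits()/page_unmask_bits()/page_mask_bits(). */
#define PTR_TAG_MASK 0x3UL

static void *pack_ptr(void *ptr, unsigned long tag)
{
	assert(((uintptr_t)ptr & PTR_TAG_MASK) == 0 && tag <= PTR_TAG_MASK);
	return (void *)((uintptr_t)ptr | tag);
}

static unsigned long ptr_tag(void *packed)
{
	return (uintptr_t)packed & PTR_TAG_MASK;
}

static void *ptr_addr(void *packed)
{
	return (void *)((uintptr_t)packed & ~PTR_TAG_MASK);
}

int main(void)
{
	void *map = aligned_alloc(4096, 4096);	/* stands in for a WC mapping */
	void *packed = pack_ptr(map, 1);	/* tag 1: "came from pin_map" */

	if (ptr_tag(packed))
		printf("release via the object unmap path\n");
	else
		printf("release via the io_mapping_unmap path\n");

	free(ptr_addr(packed));
	return 0;
}
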
@@ -1823,6 +1835,21 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
if (unlikely(err))
return err;
/*
* Reserve fences slot early to prevent an allocation after preparing
* the workload and associating fences with dma_resv.
*/
if (fence && !(flags & __EXEC_OBJECT_NO_RESERVE)) {
struct dma_fence *curr;
int idx;
dma_fence_array_for_each(curr, idx, fence)
;
err = dma_resv_reserve_fences(vma->obj->base.resv, idx);
if (unlikely(err))
return err;
}
if (flags & EXEC_OBJECT_WRITE) {
struct intel_frontbuffer *front;
@@ -1832,31 +1859,23 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
i915_active_add_request(&front->write, rq);
intel_frontbuffer_put(front);
}
}
if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
if (unlikely(err))
return err;
}
if (fence) {
struct dma_fence *curr;
enum dma_resv_usage usage;
int idx;
if (fence) {
dma_resv_add_fence(vma->obj->base.resv, fence,
DMA_RESV_USAGE_WRITE);
obj->read_domains = 0;
if (flags & EXEC_OBJECT_WRITE) {
usage = DMA_RESV_USAGE_WRITE;
obj->write_domain = I915_GEM_DOMAIN_RENDER;
obj->read_domains = 0;
}
} else {
if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
if (unlikely(err))
return err;
} else {
usage = DMA_RESV_USAGE_READ;
}
if (fence) {
dma_resv_add_fence(vma->obj->base.resv, fence,
DMA_RESV_USAGE_READ);
obj->write_domain = 0;
}
dma_fence_array_for_each(curr, idx, fence)
dma_resv_add_fence(vma->obj->base.resv, curr, usage);
}
if (flags & EXEC_OBJECT_NEEDS_FENCE && vma->fence)
@@ -1899,9 +1918,11 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
/* release the fence reg _after_ flushing */
i915_vma_revoke_fence(vma);
__i915_vma_iounmap(vma);
clear_bit(I915_VMA_CAN_FENCE_BIT, __i915_vma_flags(vma));
}
__i915_vma_iounmap(vma);
GEM_BUG_ON(vma->fence);
GEM_BUG_ON(i915_vma_has_userfault(vma));

@@ -143,6 +143,7 @@ enum intel_ppgtt_type {
func(needs_compact_pt); \
func(gpu_reset_clobbers_display); \
func(has_reset_engine); \
func(has_3d_pipeline); \
func(has_4tile); \
func(has_flat_ccs); \
func(has_global_mocs); \
@@ -150,12 +151,14 @@ enum intel_ppgtt_type {
func(has_heci_pxp); \
func(has_heci_gscfi); \
func(has_guc_deprivilege); \
func(has_l3_ccs_read); \
func(has_l3_dpf); \
func(has_llc); \
func(has_logical_ring_contexts); \
func(has_logical_ring_elsq); \
func(has_media_ratio_mode); \
func(has_mslices); \
func(has_mslice_steering); \
func(has_one_eu_per_fuse_bit); \
func(has_pooled_eu); \
func(has_pxp); \
func(has_rc6); \

@@ -7634,10 +7634,9 @@ static void xehpsdv_init_clock_gating(struct drm_i915_private *dev_priv)
static void dg2_init_clock_gating(struct drm_i915_private *i915)
{
/* Wa_22010954014:dg2_g10 */
if (IS_DG2_G10(i915))
intel_uncore_rmw(&i915->uncore, XEHP_CLOCK_GATE_DIS, 0,
SGSI_SIDECLK_DIS);
/* Wa_22010954014:dg2 */
intel_uncore_rmw(&i915->uncore, XEHP_CLOCK_GATE_DIS, 0,
SGSI_SIDECLK_DIS);
/*
* Wa_14010733611:dg2_g10
@@ -7648,6 +7647,17 @@ static void dg2_init_clock_gating(struct drm_i915_private *i915)
SGR_DIS | SGGI_DIS);
}
static void pvc_init_clock_gating(struct drm_i915_private *dev_priv)
{
/* Wa_14012385139:pvc */
if (IS_PVC_BD_STEP(dev_priv, STEP_A0, STEP_B0))
intel_uncore_rmw(&dev_priv->uncore, XEHP_CLOCK_GATE_DIS, 0, SGR_DIS);
/* Wa_22010954014:pvc */
if (IS_PVC_BD_STEP(dev_priv, STEP_A0, STEP_B0))
intel_uncore_rmw(&dev_priv->uncore, XEHP_CLOCK_GATE_DIS, 0, SGSI_SIDECLK_DIS);
}
static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
{
if (!HAS_PCH_CNP(dev_priv))
@@ -8064,6 +8074,7 @@ static const struct drm_i915_clock_gating_funcs platform##_clock_gating_funcs =
.init_clock_gating = platform##_init_clock_gating, \
}
CG_FUNCS(pvc);
CG_FUNCS(dg2);
CG_FUNCS(xehpsdv);
CG_FUNCS(adlp);
@@ -8102,7 +8113,9 @@ CG_FUNCS(nop);
*/
void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
{
if (IS_DG2(dev_priv))
if (IS_PONTEVECCHIO(dev_priv))
dev_priv->clock_gating_funcs = &pvc_clock_gating_funcs;
else if (IS_DG2(dev_priv))
dev_priv->clock_gating_funcs = &dg2_clock_gating_funcs;
else if (IS_XEHPSDV(dev_priv))
dev_priv->clock_gating_funcs = &xehpsdv_clock_gating_funcs;

@@ -135,6 +135,8 @@ static const struct intel_step_info adlp_n_revids[] = {
[0x0] = { COMMON_GT_MEDIA_STEP(A0), .display_step = STEP_D0 },
};
static void pvc_step_init(struct drm_i915_private *i915, int pci_revid);
void intel_step_init(struct drm_i915_private *i915)
{
const struct intel_step_info *revids = NULL;
@@ -142,7 +144,10 @@ void intel_step_init(struct drm_i915_private *i915)
int revid = INTEL_REVID(i915);
struct intel_step_info step = {};
if (IS_DG2_G10(i915)) {
if (IS_PONTEVECCHIO(i915)) {
pvc_step_init(i915, revid);
return;
} else if (IS_DG2_G10(i915)) {
revids = dg2_g10_revid_step_tbl;
size = ARRAY_SIZE(dg2_g10_revid_step_tbl);
} else if (IS_DG2_G11(i915)) {
@@ -235,6 +240,69 @@ void intel_step_init(struct drm_i915_private *i915)
RUNTIME_INFO(i915)->step = step;
}
#define PVC_BD_REVID GENMASK(5, 3)
#define PVC_CT_REVID GENMASK(2, 0)
static const int pvc_bd_subids[] = {
[0x0] = STEP_A0,
[0x3] = STEP_B0,
[0x4] = STEP_B1,
[0x5] = STEP_B3,
};
static const int pvc_ct_subids[] = {
[0x3] = STEP_A0,
[0x5] = STEP_B0,
[0x6] = STEP_B1,
[0x7] = STEP_C0,
};
static int
pvc_step_lookup(struct drm_i915_private *i915, const char *type,
const int *table, int size, int subid)
{
if (subid < size && table[subid] != STEP_NONE)
return table[subid];
drm_warn(&i915->drm, "Unknown %s id 0x%02x\n", type, subid);
/*
* As on other platforms, try to use the next higher ID if we land on a
* gap in the table.
*/
while (subid < size && table[subid] == STEP_NONE)
subid++;
if (subid < size) {
drm_dbg(&i915->drm, "Using steppings for %s id 0x%02x\n",
type, subid);
return table[subid];
}
drm_dbg(&i915->drm, "Using future steppings\n");
return STEP_FUTURE;
}
/*
* PVC needs special handling since we don't lookup the
* revid in a table, but rather specific bitfields within
* the revid for various components.
*/
static void pvc_step_init(struct drm_i915_private *i915, int pci_revid)
{
int ct_subid, bd_subid;
bd_subid = FIELD_GET(PVC_BD_REVID, pci_revid);
ct_subid = FIELD_GET(PVC_CT_REVID, pci_revid);
RUNTIME_INFO(i915)->step.basedie_step =
pvc_step_lookup(i915, "Base Die", pvc_bd_subids,
ARRAY_SIZE(pvc_bd_subids), bd_subid);
RUNTIME_INFO(i915)->step.graphics_step =
pvc_step_lookup(i915, "Compute Tile", pvc_ct_subids,
ARRAY_SIZE(pvc_ct_subids), ct_subid);
}
#define STEP_NAME_CASE(name) \
case STEP_##name: \
return #name;

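For illustration only (not part of the change): on Ponte Vecchio the PCI revid is not looked up in a single table; bits 5:3 select the base-die stepping and bits 2:0 the compute-tile stepping, as the hunk above does with FIELD_GET(). A standalone worked decode of a made-up revid:

#include <stdio.h>

int main(void)
{
	unsigned int revid = 0x2b;			/* hypothetical revid, 0b101011 */
	unsigned int bd_subid = (revid >> 3) & 0x7;	/* PVC_BD_REVID field -> 5 */
	unsigned int ct_subid = revid & 0x7;		/* PVC_CT_REVID field -> 3 */

	/* With the tables above: pvc_bd_subids[5] == STEP_B3,
	 * pvc_ct_subids[3] == STEP_A0. */
	printf("base die subid %u, compute tile subid %u\n", bd_subid, ct_subid);
	return 0;
}
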
@@ -11,9 +11,10 @@
struct drm_i915_private;
struct intel_step_info {
u8 graphics_step;
u8 graphics_step; /* Represents the compute tile on Xe_HPC */
u8 display_step;
u8 media_step;
u8 basedie_step;
};
#define STEP_ENUM_VAL(name) STEP_##name,
@@ -25,6 +26,7 @@ struct intel_step_info {
func(B0) \
func(B1) \
func(B2) \
func(B3) \
func(C0) \
func(C1) \
func(D0) \

@@ -938,36 +938,32 @@ find_fw_domain(struct intel_uncore *uncore, u32 offset)
return entry->domains;
}
#define GEN_FW_RANGE(s, e, d) \
{ .start = (s), .end = (e), .domains = (d) }
/*
* Shadowed register tables describe special register ranges that i915 is
* allowed to write to without acquiring forcewake. If these registers' power
* wells are down, the hardware will save values written by i915 to a shadow
* copy and automatically transfer them into the real register the next time
* the power well is woken up. Shadowing only applies to writes; forcewake
* must still be acquired when reading from registers in these ranges.
*
* The documentation for shadowed registers is somewhat spotty on older
* platforms. However missing registers from these lists is non-fatal; it just
* means we'll wake up the hardware for some register accesses where we didn't
* really need to.
*
* The ranges listed in these tables must be sorted by offset.
*
* When adding new tables here, please also add them to
* intel_shadow_table_check() in selftests/intel_uncore.c so that they will be
* scanned for obvious mistakes or typos by the selftests.
*/
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __vlv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x5000, 0x7fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0xb000, 0x11fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_MEDIA),
GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
};
#define __fwtable_reg_read_fw_domains(uncore, offset) \
({ \
enum forcewake_domains __fwd = 0; \
if (NEEDS_FORCE_WAKE((offset))) \
__fwd = find_fw_domain(uncore, offset); \
__fwd; \
})
/* *Must* be sorted by offset! See intel_shadow_table_check(). */
static const struct i915_range gen8_shadowed_regs[] = {
{ .start = 0x2030, .end = 0x2030 },
{ .start = 0xA008, .end = 0xA00C },
{ .start = 0x12030, .end = 0x12030 },
{ .start = 0x1a030, .end = 0x1a030 },
{ .start = 0x22030, .end = 0x22030 },
/* TODO: Other registers are not yet used */
};
static const struct i915_range gen11_shadowed_regs[] = {
@@ -1080,6 +1076,45 @@ static const struct i915_range dg2_shadowed_regs[] = {
{ .start = 0x1F8510, .end = 0x1F8550 },
};
static const struct i915_range pvc_shadowed_regs[] = {
{ .start = 0x2030, .end = 0x2030 },
{ .start = 0x2510, .end = 0x2550 },
{ .start = 0xA008, .end = 0xA00C },
{ .start = 0xA188, .end = 0xA188 },
{ .start = 0xA278, .end = 0xA278 },
{ .start = 0xA540, .end = 0xA56C },
{ .start = 0xC4C8, .end = 0xC4C8 },
{ .start = 0xC4E0, .end = 0xC4E0 },
{ .start = 0xC600, .end = 0xC600 },
{ .start = 0xC658, .end = 0xC658 },
{ .start = 0x22030, .end = 0x22030 },
{ .start = 0x22510, .end = 0x22550 },
{ .start = 0x1C0030, .end = 0x1C0030 },
{ .start = 0x1C0510, .end = 0x1C0550 },
{ .start = 0x1C4030, .end = 0x1C4030 },
{ .start = 0x1C4510, .end = 0x1C4550 },
{ .start = 0x1C8030, .end = 0x1C8030 },
{ .start = 0x1C8510, .end = 0x1C8550 },
{ .start = 0x1D0030, .end = 0x1D0030 },
{ .start = 0x1D0510, .end = 0x1D0550 },
{ .start = 0x1D4030, .end = 0x1D4030 },
{ .start = 0x1D4510, .end = 0x1D4550 },
{ .start = 0x1D8030, .end = 0x1D8030 },
{ .start = 0x1D8510, .end = 0x1D8550 },
{ .start = 0x1E0030, .end = 0x1E0030 },
{ .start = 0x1E0510, .end = 0x1E0550 },
{ .start = 0x1E4030, .end = 0x1E4030 },
{ .start = 0x1E4510, .end = 0x1E4550 },
{ .start = 0x1E8030, .end = 0x1E8030 },
{ .start = 0x1E8510, .end = 0x1E8550 },
{ .start = 0x1F0030, .end = 0x1F0030 },
{ .start = 0x1F0510, .end = 0x1F0550 },
{ .start = 0x1F4030, .end = 0x1F4030 },
{ .start = 0x1F4510, .end = 0x1F4550 },
{ .start = 0x1F8030, .end = 0x1F8030 },
{ .start = 0x1F8510, .end = 0x1F8550 },
};
static int mmio_range_cmp(u32 key, const struct i915_range *range)
{
if (key < range->start)

@@ -1107,11 +1142,70 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg)
return FORCEWAKE_RENDER;
}
#define __fwtable_reg_read_fw_domains(uncore, offset) \
({ \
enum forcewake_domains __fwd = 0; \
if (NEEDS_FORCE_WAKE((offset))) \
__fwd = find_fw_domain(uncore, offset); \
__fwd; \
})
#define __fwtable_reg_write_fw_domains(uncore, offset) \
({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
})
#define GEN_FW_RANGE(s, e, d) \
{ .start = (s), .end = (e), .domains = (d) }
/*
* All platforms' forcewake tables below must be sorted by offset ranges.
* Furthermore, new forcewake tables added should be "watertight" and have
* no gaps between ranges.
*
* When there are multiple consecutive ranges listed in the bspec with
* the same forcewake domain, it is customary to combine them into a single
* row in the tables below to keep the tables small and lookups fast.
* Likewise, reserved/unused ranges may be combined with the preceding and/or
* following ranges since the driver will never be making MMIO accesses in
* those ranges.
*
* For example, if the bspec were to list:
*
* ...
* 0x1000 - 0x1fff: GT
* 0x2000 - 0x2cff: GT
* 0x2d00 - 0x2fff: unused/reserved
* 0x3000 - 0xffff: GT
* ...
*
* these could all be represented by a single line in the code:
*
* GEN_FW_RANGE(0x1000, 0xffff, FORCEWAKE_GT)
*
* When adding new forcewake tables here, please also add them to
* intel_uncore_mock_selftests in selftests/intel_uncore.c so that they will be
* scanned for obvious mistakes or typos by the selftests.
*/
static const struct intel_forcewake_range __gen6_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER),
};
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __vlv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x5000, 0x7fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0xb000, 0x11fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_MEDIA),
GEN_FW_RANGE(0x2e000, 0x2ffff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
};
static const struct intel_forcewake_range __chv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x4000, 0x4fff, FORCEWAKE_RENDER | FORCEWAKE_MEDIA),
@@ -1131,16 +1225,6 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
GEN_FW_RANGE(0x30000, 0x37fff, FORCEWAKE_MEDIA),
};
#define __fwtable_reg_write_fw_domains(uncore, offset) \
({ \
enum forcewake_domains __fwd = 0; \
const u32 __offset = (offset); \
if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
__fwd; \
})
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_GT),
GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
@@ -1176,7 +1260,6 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
};
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __gen11_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0x1fff, 0), /* uncore range */
GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
@@ -1215,14 +1298,6 @@ static const struct intel_forcewake_range __gen11_fw_ranges[] = {
GEN_FW_RANGE(0x1d4000, 0x1dbfff, 0)
};
/*
* *Must* be sorted by offset ranges! See intel_fw_table_check().
*
* Note that the spec lists several reserved/unused ranges that don't
* actually contain any registers. In the table below we'll combine those
* reserved ranges with either the preceding or following range to keep the
* table small and lookups fast.
*/
static const struct intel_forcewake_range __gen12_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0x1fff, 0), /*
0x0 - 0xaff: reserved
@@ -1327,8 +1402,6 @@ static const struct intel_forcewake_range __gen12_fw_ranges[] = {
/*
* Graphics IP version 12.55 brings a slight change to the 0xd800 range,
* switching it from the GT domain to the render domain.
*
* *Must* be sorted by offset ranges! See intel_fw_table_check().
*/
#define XEHP_FWRANGES(FW_RANGE_D800) \
GEN_FW_RANGE(0x0, 0x1fff, 0), /* \
@@ -1490,6 +1563,103 @@ static const struct intel_forcewake_range __dg2_fw_ranges[] = {
XEHP_FWRANGES(FORCEWAKE_RENDER)
};
static const struct intel_forcewake_range __pvc_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0xaff, 0),
GEN_FW_RANGE(0xb00, 0xbff, FORCEWAKE_GT),
GEN_FW_RANGE(0xc00, 0xfff, 0),
GEN_FW_RANGE(0x1000, 0x1fff, FORCEWAKE_GT),
GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_GT),
GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x4000, 0x813f, FORCEWAKE_GT), /*
0x4000 - 0x4aff: gt
0x4b00 - 0x4fff: reserved
0x5000 - 0x51ff: gt
0x5200 - 0x52ff: reserved
0x5300 - 0x53ff: gt
0x5400 - 0x7fff: reserved
0x8000 - 0x813f: gt */
GEN_FW_RANGE(0x8140, 0x817f, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x8180, 0x81ff, 0),
GEN_FW_RANGE(0x8200, 0x94cf, FORCEWAKE_GT), /*
0x8200 - 0x82ff: gt
0x8300 - 0x84ff: reserved
0x8500 - 0x887f: gt
0x8880 - 0x8a7f: reserved
0x8a80 - 0x8aff: gt
0x8b00 - 0x8fff: reserved
0x9000 - 0x947f: gt
0x9480 - 0x94cf: reserved */
GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER),
GEN_FW_RANGE(0x9560, 0x967f, 0), /*
0x9560 - 0x95ff: always on
0x9600 - 0x967f: reserved */
GEN_FW_RANGE(0x9680, 0x97ff, FORCEWAKE_RENDER), /*
0x9680 - 0x96ff: render
0x9700 - 0x97ff: reserved */
GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /*
0x9800 - 0xb4ff: gt
0xb500 - 0xbfff: reserved
0xc000 - 0xcfff: gt */
GEN_FW_RANGE(0xd000, 0xd3ff, 0),
GEN_FW_RANGE(0xd400, 0xdbff, FORCEWAKE_GT),
GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER),
GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /*
0xdd00 - 0xddff: gt
0xde00 - 0xde7f: reserved */
GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /*
0xde80 - 0xdeff: render
0xdf00 - 0xe1ff: reserved
0xe200 - 0xe7ff: render
0xe800 - 0xe8ff: reserved */
GEN_FW_RANGE(0xe900, 0x11fff, FORCEWAKE_GT), /*
0xe900 - 0xe9ff: gt
0xea00 - 0xebff: reserved
0xec00 - 0xffff: gt
0x10000 - 0x11fff: reserved */
GEN_FW_RANGE(0x12000, 0x12fff, 0), /*
0x12000 - 0x127ff: always on
0x12800 - 0x12fff: reserved */
GEN_FW_RANGE(0x13000, 0x23fff, FORCEWAKE_GT), /*
0x13000 - 0x135ff: gt
0x13600 - 0x147ff: reserved
0x14800 - 0x153ff: gt
0x15400 - 0x19fff: reserved
0x1a000 - 0x1ffff: gt
0x20000 - 0x21fff: reserved
0x22000 - 0x23fff: gt */
GEN_FW_RANGE(0x24000, 0x2417f, 0), /*
24000 - 0x2407f: always on
24080 - 0x2417f: reserved */
GEN_FW_RANGE(0x24180, 0x3ffff, FORCEWAKE_GT), /*
0x24180 - 0x241ff: gt
0x24200 - 0x251ff: reserved
0x25200 - 0x252ff: gt
0x25300 - 0x25fff: reserved
0x26000 - 0x27fff: gt
0x28000 - 0x2ffff: reserved
0x30000 - 0x3ffff: gt */
GEN_FW_RANGE(0x40000, 0x1bffff, 0),
GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /*
0x1c0000 - 0x1c2bff: VD0
0x1c2c00 - 0x1c2cff: reserved
0x1c2d00 - 0x1c2dff: VD0
0x1c2e00 - 0x1c3eff: reserved
0x1c3f00 - 0x1c3fff: VD0 */
GEN_FW_RANGE(0x1c4000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX1), /*
0x1c4000 - 0x1c6aff: VD1
0x1c6b00 - 0x1c7eff: reserved
0x1c7f00 - 0x1c7fff: VD1
0x1c8000 - 0x1cffff: reserved */
GEN_FW_RANGE(0x1d0000, 0x23ffff, FORCEWAKE_MEDIA_VDBOX2), /*
0x1d0000 - 0x1d2aff: VD2
0x1d2b00 - 0x1d3eff: reserved
0x1d3f00 - 0x1d3fff: VD2
0x1d4000 - 0x23ffff: reserved */
GEN_FW_RANGE(0x240000, 0x3dffff, 0),
GEN_FW_RANGE(0x3e0000, 0x3effff, FORCEWAKE_GT),
};
static void
ilk_dummy_write(struct intel_uncore *uncore)
{

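For illustration only (not part of the change): the reason these forcewake tables must stay sorted and watertight is that the domain lookup can then be a simple binary search over {start, end, domains} entries, which is what find_fw_domain() relies on. A minimal standalone sketch of such a lookup over a made-up table, not the driver's implementation:

#include <stdint.h>
#include <stdio.h>

struct fw_range {
	uint32_t start;
	uint32_t end;
	uint32_t domains;	/* bitmask, 0 == no forcewake needed */
};

/* Made-up, sorted, gap-free table in the style of GEN_FW_RANGE() tables. */
static const struct fw_range ranges[] = {
	{ 0x0000, 0x0aff, 0x0 },
	{ 0x0b00, 0x1fff, 0x2 },	/* stands in for a GT domain */
	{ 0x2000, 0x3fff, 0x1 },	/* stands in for a render domain */
};

static uint32_t find_domains(uint32_t offset)
{
	size_t lo = 0, hi = sizeof(ranges) / sizeof(ranges[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (offset < ranges[mid].start)
			hi = mid;
		else if (offset > ranges[mid].end)
			lo = mid + 1;
		else
			return ranges[mid].domains;
	}
	return 0;	/* offset not covered: no forcewake domain */
}

int main(void)
{
	printf("0x2030 -> domains 0x%x\n", find_domains(0x2030));	/* 0x1 */
	printf("0x0c00 -> domains 0x%x\n", find_domains(0x0c00));	/* 0x2 */
	return 0;
}
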
@@ -2125,7 +2295,11 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 60)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __pvc_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, pvc_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, dg2_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
@@ -2470,118 +2644,6 @@ intel_uncore_forcewake_for_reg(struct intel_uncore *uncore,
return fw_domains;
}
/**
* uncore_rw_with_mcr_steering_fw - Access a register after programming
* the MCR selector register.
* @uncore: pointer to struct intel_uncore
* @reg: register being accessed
* @rw_flag: FW_REG_READ for read access or FW_REG_WRITE for write access
* @slice: slice number (ignored for multi-cast write)
* @subslice: sub-slice number (ignored for multi-cast write)
* @value: register value to be written (ignored for read)
*
* Return: 0 for write access. register value for read access.
*
* Caller needs to make sure the relevant forcewake wells are up.
*/
static u32 uncore_rw_with_mcr_steering_fw(struct intel_uncore *uncore,
i915_reg_t reg, u8 rw_flag,
int slice, int subslice, u32 value)
{
u32 mcr_mask, mcr_ss, mcr, old_mcr, val = 0;
lockdep_assert_held(&uncore->lock);
if (GRAPHICS_VER(uncore->i915) >= 11) {
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
mcr_ss = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
/*
* Wa_22013088509
*
* The setting of the multicast/unicast bit usually wouldn't
* matter for read operations (which always return the value
* from a single register instance regardless of how that bit
* is set), but some platforms have a workaround requiring us
* to remain in multicast mode for reads. There's no real
* downside to this, so we'll just go ahead and do so on all
* platforms; we'll only clear the multicast bit from the mask
* when exlicitly doing a write operation.
*/
if (rw_flag == FW_REG_WRITE)
mcr_mask |= GEN11_MCR_MULTICAST;
} else {
mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
mcr_ss = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
}
old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
mcr &= ~mcr_mask;
mcr |= mcr_ss;
intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
if (rw_flag == FW_REG_READ)
val = intel_uncore_read_fw(uncore, reg);
else
intel_uncore_write_fw(uncore, reg, value);
mcr &= ~mcr_mask;
mcr |= old_mcr & mcr_mask;
intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
return val;
}
static u32 uncore_rw_with_mcr_steering(struct intel_uncore *uncore,
i915_reg_t reg, u8 rw_flag,
int slice, int subslice,
u32 value)
{
enum forcewake_domains fw_domains;
u32 val;
fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
rw_flag);
fw_domains |= intel_uncore_forcewake_for_reg(uncore,
GEN8_MCR_SELECTOR,
FW_REG_READ | FW_REG_WRITE);
spin_lock_irq(&uncore->lock);
intel_uncore_forcewake_get__locked(uncore, fw_domains);
val = uncore_rw_with_mcr_steering_fw(uncore, reg, rw_flag,
slice, subslice, value);
intel_uncore_forcewake_put__locked(uncore, fw_domains);
spin_unlock_irq(&uncore->lock);
return val;
}
u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
i915_reg_t reg, int slice, int subslice)
{
return uncore_rw_with_mcr_steering_fw(uncore, reg, FW_REG_READ,
slice, subslice, 0);
}
u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
i915_reg_t reg, int slice, int subslice)
{
return uncore_rw_with_mcr_steering(uncore, reg, FW_REG_READ,
slice, subslice, 0);
}
void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
i915_reg_t reg, u32 value,
int slice, int subslice)
{
uncore_rw_with_mcr_steering(uncore, reg, FW_REG_WRITE,
slice, subslice, value);
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftests/mock_uncore.c"
#include "selftests/intel_uncore.c"

@@ -210,14 +210,6 @@ intel_uncore_has_fifo(const struct intel_uncore *uncore)
return uncore->flags & UNCORE_HAS_FIFO;
}
u32 intel_uncore_read_with_mcr_steering_fw(struct intel_uncore *uncore,
i915_reg_t reg,
int slice, int subslice);
u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore,
i915_reg_t reg, int slice, int subslice);
void intel_uncore_write_with_mcr_steering(struct intel_uncore *uncore,
i915_reg_t reg, u32 value,
int slice, int subslice);
void
intel_uncore_mmio_debug_init_early(struct intel_uncore_mmio_debug *mmio_debug);
void intel_uncore_init_early(struct intel_uncore *uncore,

@@ -69,6 +69,7 @@ static int intel_shadow_table_check(void)
{ gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
{ gen12_shadowed_regs, ARRAY_SIZE(gen12_shadowed_regs) },
{ dg2_shadowed_regs, ARRAY_SIZE(dg2_shadowed_regs) },
{ pvc_shadowed_regs, ARRAY_SIZE(pvc_shadowed_regs) },
};
const struct i915_range *range;
unsigned int i, j;
@@ -115,6 +116,7 @@ int intel_uncore_mock_selftests(void)
{ __gen11_fw_ranges, ARRAY_SIZE(__gen11_fw_ranges), true },
{ __gen12_fw_ranges, ARRAY_SIZE(__gen12_fw_ranges), true },
{ __xehp_fw_ranges, ARRAY_SIZE(__xehp_fw_ranges), true },
{ __pvc_fw_ranges, ARRAY_SIZE(__pvc_fw_ranges), true },
};
int err, i;

@@ -10,24 +10,24 @@ struct agp_bridge_data;
struct pci_dev;
struct sg_table;
void intel_gtt_get(u64 *gtt_total,
phys_addr_t *mappable_base,
resource_size_t *mappable_end);
void intel_gmch_gtt_get(u64 *gtt_total,
phys_addr_t *mappable_base,
resource_size_t *mappable_end);
int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
struct agp_bridge_data *bridge);
void intel_gmch_remove(void);
bool intel_enable_gtt(void);
bool intel_gmch_enable_gtt(void);
void intel_gtt_chipset_flush(void);
void intel_gtt_insert_page(dma_addr_t addr,
unsigned int pg,
unsigned int flags);
void intel_gtt_insert_sg_entries(struct sg_table *st,
unsigned int pg_start,
unsigned int flags);
void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries);
void intel_gmch_gtt_flush(void);
void intel_gmch_gtt_insert_page(dma_addr_t addr,
unsigned int pg,
unsigned int flags);
void intel_gmch_gtt_insert_sg_entries(struct sg_table *st,
unsigned int pg_start,
unsigned int flags);
void intel_gmch_gtt_clear_range(unsigned int first_entry, unsigned int num_entries);
/* Special gtt memory types */
#define AGP_DCACHE_MEMORY 1

@@ -3443,6 +3443,22 @@ struct drm_i915_gem_create_ext {
* At which point we get the object handle in &drm_i915_gem_create_ext.handle,
* along with the final object size in &drm_i915_gem_create_ext.size, which
* should account for any rounding up, if required.
*
* Note that userspace has no means of knowing the current backing region
* for objects where @num_regions is larger than one. The kernel will only
* ensure that the priority order of the @regions array is honoured, either
* when initially placing the object, or when moving memory around due to
* memory pressure
*
* On Flat-CCS capable HW, compression is supported for the objects residing
* in I915_MEMORY_CLASS_DEVICE. When such objects (compressed) have other
* memory class in @regions and migrated (by i915, due to memory
* constraints) to the non I915_MEMORY_CLASS_DEVICE region, then i915 needs to
* decompress the content. But i915 doesn't have the required information to
* decompress the userspace compressed objects.
*
* So i915 supports Flat-CCS, on the objects which can reside only on
* I915_MEMORY_CLASS_DEVICE regions.
*/
struct drm_i915_gem_create_ext_memory_regions {
/** @base: Extension link. See struct i915_user_extension. */
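For illustration only (not part of the change): the documentation above means a userspace client that wants to keep an object eligible for Flat-CCS compression should list only I915_MEMORY_CLASS_DEVICE in the @regions array, so i915 can never migrate it to system memory. A hedged userspace sketch of such a creation call; the open DRM fd and the memory_instance of 0 (single-tile device) are assumptions:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Create a GEM object that can only ever be placed in device-local memory. */
static int create_lmem_only(int fd, __u64 size, __u32 *handle)
{
	struct drm_i915_gem_memory_class_instance region = {
		.memory_class = I915_MEMORY_CLASS_DEVICE,
		.memory_instance = 0,		/* assumed: first/only device region */
	};
	struct drm_i915_gem_create_ext_memory_regions regions = {
		.base.name = I915_GEM_CREATE_EXT_MEMORY_REGIONS,
		.num_regions = 1,
		.regions = (uintptr_t)&region,
	};
	struct drm_i915_gem_create_ext create = {
		.size = size,
		.extensions = (uintptr_t)&regions,
	};
	int ret = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create);

	if (ret == 0)
		*handle = create.handle;	/* final size is in create.size */
	return ret;
}
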