Граф коммитов

296 Коммитов

Автор SHA1 Сообщение Дата
Daniel Vetter e8e6e6012d Linux 3.14-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTHSaRAAoJEHm+PkMAQRiG7G8IAJHElwFDNSQE7Y9MmbicrAMG
 kfjhBtBpTaVrJKQXegCNUwDaLLyC4oLIxDheW84oPXbrEGDLqPtBov/hrcFkHVr4
 lh/ZYk02nYtcfpN0JnL/Yj2oKHVmBWs0vFlM7StSFsJCj10DoCVQQdmAJ8XODTPo
 CXMapk+UikTX1TlIO8+B5toyl3R1OqPmW211UV1vQVLKy66hu+MKVN/V+/EyopL0
 1jO81EDpaRaeIJh1/okcyUoIq9pqLkAWNpeQ7uyXZ+Sfivt9RXwLYKmAB3lP20Hc
 ZMIIoHSCyYRFjxLlQvt02bA9nY4wTY7YN5kZ2kk65y7TFfhcGsCw1Sc69iyCoKs=
 =CJcA
 -----END PGP SIGNATURE-----

Merge tag 'v3.14-rc6' into drm-intel-next-queued

Linux 3.14-rc6

I need the hdmi/dvi-dual link fixes in 3.14 to avoid ugly conflicts
when merging Ville's new hdmi cloning support into my -next tree

Conflicts:
	drivers/gpu/drm/i915/Makefile
	drivers/gpu/drm/i915/intel_dp.c

Makefile cleanup conflicts with an acpi build fix, intel_dp.c is
trivial.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-10 21:43:46 +01:00
Daniel Vetter 93a25a9e2d drm/i915: Disable full ppgtt by default
There are too many oustanding issues:

- Fence handling in the current code is broken. There's a patch series
  from me, but it's blocked on and extended review (which includes
  writing the testcases).

- IOMMU mapping handling is broken, we need to properly refcount it -
  currently it gets destroyed when the first vma is unbound, so way
  too early.

- There's a pending reset issue on snb. Since Mika's reset work and
  full ppgtt have been pulled in in separate branches and ended up
  intermittingly breaking each another it's unclear who's the exact
  culprit here.

- We still have persistent evidince of crazy recursion bugs through
  vma_unbind and ppgtt_relase, e.g.

  https://bugs.freedesktop.org/show_bug.cgi?id=73383

  This issue (and a few others meanwhile resolved) have blocked our
  performance measuring/tuning group since 3 months.

- Secure batch dispatching is broken. This is blocking Brad Volkin's
  command checker work since 3 months.

All these issues are confirmed to only happen when full ppgtt is
enabled, falling back to aliasing ppgtt resolves them. But even
aliasing ppgtt itself still has a regression:

- We currently unconditionally bind objects into the aliasing ppgtt,
  which means all priviledged objects like ringbuffers are visible to
  unpriviledged access again. On top of that this also breaks the
  command checker for aliasing ppgtt, since it can't hide the
  validated batch any more.

Furthermore topic/full-ppgtt has never been reviewed:

- Lifetime rules around vma unbinding/release are unclear, resulting
  into this awesome hack called ppgtt_release. Which seems to take the
  blame for most of the recursion fallout.

- Context/ring init works different on gpu reset than anywhere else.
  Such differeneces have in the past always lead to really hard to
  track down bugs.

- Aliasing ppgtt is treated in a bunch of places as a real address
  space, but it isn't - the real address space is always the global
  gtt in that case. This results in a bit a mess between contexts and
  ppgtt object, further complication the context/ppgtt/vma lifetime
  rules.

- We don't have any docs describing the overall concepts introduced
  with full ppgtt. A short, concise overview describing vmas and some
  of the strange bits around them (like the unbound vmas used by
  execbuf, or the new binding rules) really is needed.

Note that a lot of the post topic/full-ppgtt merge fallout has already
been addressed, this entire list here of 10 issues really only contains
the still outstanding issues.

Finally the 3.15 merge window is approaching and I think we need to
use the remaining time to ensure that our fallback option of using
aliasing ppgtt is in solid shape. Hence I think it's time to throw the
switch. While at it demote the helper from static inline status
because really.

Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-07 22:36:47 +01:00
Ben Widawsky 5abbcca30d drm/i915/bdw: Kill ppgtt->num_pt_pages
With the original PPGTT implementation if the number of PDPs was not a
power of two, the number of pages for the page tables would end up being
rounded up. The code actually had a bug here afaict, but this is a
theoretical bug as I don't believe this can actually occur with the
current code/HW..

With the rework of the page table allocations, there is no longer a
distinction between number of page table pages, and number of page
directory entries. To avoid confusion, kill the redundant (and newer)
struct member.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:00 +01:00
Ben Widawsky b146520ff9 drm/i915: Split GEN6 PPGTT initialization up
Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:30:00 +01:00
Ben Widawsky a00d825de9 drm/i915: Split GEN6 PPGTT cleanup
This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:59 +01:00
Ben Widawsky c4ac524c15 drm/i915: Update i915_gem_gtt.c copyright
I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:58 +01:00
Ben Widawsky 7907f45bf9 Revert "drm/i915/bdw: Limit GTT to 2GB"
This reverts commit 3a2ffb65ee.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on BDW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:57 +01:00
Ben Widawsky 7ad47cf252 drm/i915/bdw: Reorganize PT allocations
The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predated commit:
6f1cc99351
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 31 15:50:31 2013 +0000

    drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)

v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)

v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-05 21:29:41 +01:00
Ben Widawsky 782f149523 drm/i915: Make clear/insert vfuncs args absolute
This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

v2: Replace size_t with uint64_t (Chris, Imre)

v3: Fix size in gen8_ppgtt_init (Ben)
Fix Size in i915_gem_suspend_gtt_mappings/restore (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:57:52 +01:00
Ben Widawsky bf2b4ed291 drm/i915/bdw: Split ppgtt initialization up
Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

v3: Rebased on top of bugfixes found in the previous patch by Imre
Moved number of pd pages assertion to the proper place (Imre)

v4:
Allocate dma address space for num_pd_pages, not num_pd_entries (Ben)
Don't use gen8_pt_dma_addr after free on error path (Imre)
With new fix from v4 of the previous patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:54:59 +01:00
Ben Widawsky f3a964b96d drm/i915/bdw: Reorganize PPGTT init
Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

v3: Fixes addressing Imre's comments from:
<1392821989.19792.13.camel@intelbox>

Don't do dynamic allocation for the page table DMA addresses. I can't
remember why I did it in the first place. This addresses one of Imre's
other issues.

Fix error path leak of page tables.

v4: Fix the fix of the error path leak. Original fix still leaked page
tables. (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:54:31 +01:00
Ben Widawsky b18b6bde30 drm/i915/bdw: Free PPGTT struct
GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload. ie. I don't think this
needs to go to stable.

v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-03-04 15:53:58 +01:00
Daniel Vetter d47c3ea2bd drm/i915: Allow blocking in the PDE alloc when running low on gtt space
There's no need not to, really.

Split out from Chris vma-bind rework.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14 14:18:15 +01:00
Daniel Vetter 1ec9e26dda drm/i915: Consolidate binding parameters into flags
Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.

Split out from Chris' vma-binding rework patch.

v2: Undo the behaviour change in object_pin that Chris spotted.

v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.

v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.

v5: Reorder the PIN_ flags for more natural patch splitup.

v6: Pull out the PIN_GLOBAL split-up again.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-14 14:16:58 +01:00
Ben Widawsky b45a67157a drm/i915/bdw: Split up PPGTT cleanup
This will make the code more readable, and extensible which is needed
for upcoming feature work. Eventually, we'll do the same for init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-13 11:51:21 +01:00
Linus Torvalds 9b0cd304f2 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "Been a bit busy, first week of kids school, and waiting on other trees
  to go in before I could send this, so its a bit later than I'd
  normally like.

  Highlights:
   - core:
      timestamp fixes, lots of misc cleanups
   - new drivers:
      bochs virtual vga
   - vmwgfx:
      major overhaul for their nextgen virt gpu.
   - i915:
      runtime D3 on HSW, watermark fixes, power well work, fbc fixes,
      bdw is no longer prelim.
   - nouveau:
      gk110/208 acceleration, more pm groundwork, old overlay support
   - radeon:
      dpm rework and clockgating for CIK, pci config reset, big endian
      fixes
   - tegra:
      panel support and DSI support, build as module, prime.
   - armada, omap, gma500, rcar, exynos, mgag200, cirrus, ast:
      fixes
   - msm:
      hdmi support for mdp5"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (595 commits)
  drm/nouveau: resume display if any later suspend bits fail
  drm/nouveau: fix lock unbalance in nouveau_crtc_page_flip
  drm/nouveau: implement hooks for needed for drm vblank timestamping support
  drm/nouveau/disp: add a method to fetch info needed by drm vblank timestamping
  drm/nv50: fill in crtc mode struct members from crtc_mode_fixup
  drm/radeon/dce8: workaround for atom BlankCrtc table
  drm/radeon/DCE4+: clear bios scratch dpms bit (v2)
  drm/radeon: set si_notify_smc_display_change properly
  drm/radeon: fix DAC interrupt handling on DCE5+
  drm/radeon: clean up active vram sizing
  drm/radeon: skip async dma init on r6xx
  drm/radeon/runpm: don't runtime suspend non-PX cards
  drm/radeon: add ring to fence trace functions
  drm/radeon: add missing trace point
  drm/radeon: fix VMID use tracking
  drm: ast,cirrus,mgag200: use drm_can_sleep
  drm/gma500: Lock struct_mutex around cursor updates
  drm/i915: Fix the offset issue for the stolen GEM objects
  DRM: armada: fix missing DRM_KMS_FB_HELPER select
  drm/i915: Decouple GPU error reporting from ring initialisation
  ...
2014-01-29 20:49:12 -08:00
Jani Nikula d330a9530c drm/i915: move module parameters into a struct, in a new file
With 20+ module parameters, I think referring to them via a struct
improves clarity over just having a bunch of globals. While at it, move
the parameter initialization and definitions into a new file
i915_params.c to reduce clutter in i915_drv.c.

Apart from the ill-named i915_enable_rc6, i915_enable_fbc and
i915_enable_ppgtt parameters, for which we lose the "i915_" prefix
internally, the module parameters now look the same both on the kernel
command line and in code. For example, "i915.modeset".

The downsides of the change are losing static on a couple of variables
and not having the initialization and module_param_named() right next to
each other. On the other hand, all module parameters are now defined in
one place at i915_params.c. Plus you can do this to find all module
parameter references:

$ git grep "i915\." -- drivers/gpu/drm/i915

v2:
- move the definitions into a new file
- s/i915_params/i915/
- make i915_try_reset i915.reset, for consistency

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-27 17:16:45 +01:00
Daniel Vetter 0e5539b923 Merge branch 'topic/ppgtt' into drm-intel-next-queued
Because whatever.*

* This should contain a fairly long list of issues and still
unresolved resgressions, but I didn't really get a vote.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-25 21:14:57 +01:00
Linus Torvalds e1ba84597c PCI changes for the v3.14 merge window:
Resource management
     - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas)
     - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu)
     - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas)
     - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas)
     - Enforce bus address limits in resource allocation (Yinghai Lu)
     - Allocate 64-bit BARs above 4G when possible (Yinghai Lu)
     - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu)
 
   PCI device hotplug
     - Major rescan/remove locking update (Rafael J. Wysocki)
     - Make ioapic builtin only (not modular) (Yinghai Lu)
     - Fix release/free issues (Yinghai Lu)
     - Clean up pciehp (Bjorn Helgaas)
     - Announce pciehp slot info during enumeration (Bjorn Helgaas)
 
   MSI
     - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev)
     - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev)
     - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev)
     - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman)
     - Drop "irq" param from *_restore_msi_irqs() (DuanZhenzhong)
 
   SR-IOV
     - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao)
 
   Virtualization
     - Add support for save/restore of extended capabilities (Alex Williamson)
     - Add Virtual Channel to save/restore support (Alex Williamson)
     - Never treat a VF as a multifunction device (Alex Williamson)
     - Add pci_try_reset_function(), et al (Alex Williamson)
 
   AER
     - Ignore non-PCIe error sources (Betty Dall)
     - Support ACPI HEST error sources for domains other than 0 (Betty Dall)
     - Consolidate HEST error source parsers (Bjorn Helgaas)
     - Add a TLP header print helper (Borislav Petkov)
 
   Freescale i.MX6
     - Remove unnecessary code (Fabio Estevam)
     - Make reset-gpio optional (Marek Vasut)
     - Report "link up" only after link training completes (Marek Vasut)
     - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut)
     - Fix PCIe startup code (Richard Zhu)
 
   Marvell MVEBU
     - Remove duplicate of_clk_get_by_name() call (Andrew Lunn)
     - Drop writes to bridge Secondary Status register (Jason Gunthorpe)
     - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe)
     - Support a bridge with no IO port window (Jason Gunthorpe)
     - Use max_t() instead of max(resource_size_t,) (Jingoo Han)
     - Remove redundant of_match_ptr (Sachin Kamat)
     - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni)
 
   NVIDIA Tegra
     - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower)
 
   Renesas R-Car
     - Add runtime PM support (Valentine Barshak)
     - Fix rcar_pci_probe() return value check (Wei Yongjun)
 
   Synopsys DesignWare
     - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen)
     - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen)
     - Fix missing MSI IRQs (Harro Haan)
     - Add dw_pcie prefix before cfg_read/write (Pratyush Anand)
     - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand)
     - Whitespace cleanup (Jingoo Han)
 
   EISA
     - Call put_device() if device_register() fails (Levente Kurusa)
     - Revert EISA initialization breakage ((Bjorn Helgaas)
 
   Miscellaneous
     - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger)
     - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas)
     - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas)
     - Use dev_is_pci() to identify PCI devices (Yijing Wang)
     - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches)
     - Update documentation 00-INDEX (Erik Ekman)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3ujEAAoJEFmIoMA60/r8A4EQAK9AZSUSVNWvlKdC1PrBfT3w
 7fVILx5A4KWsOU8eoFwCPQLrgvUtMltg16yN2tbCjqpKEdrVc36biMO9bwhnXSyZ
 KopHKMWnn0sza/z2H8mcGy+0azGdWcIjcErX/a8WeS6zyWBjm+yzckrHNVpPu4Ca
 SpCBhfgBMjKyIZyLtP6juFSH34S2DfQex4oUSyPC+gjqPy5wW/xw/kBxZfOXl+yU
 P9pQT+geMIc31pETMdG9wd/TT+47YAui4ieSggoVxfVrphCXv6S8mOMCMuQc2bAy
 MHy9uFm1jbvKZZIYrzJ+9HFiiU/6MNiOO3Ygua52xuSp1Zrcjwi2CLD9/QBXbDVs
 pTKU5JIO7q43llkQUpIXTwBvEApSZRhuqzXegsMAYIg4AWmbfm/2fXkfWlQThYMp
 J48blAJZ4t0vhMr9usgwbtdBe8F5euExOxpwH0QMCMABbuu8/B3TLm39+LTcIbsw
 Efgm3N9iUTyiV5fe9Rr62nflhyqXjTevPl4wbZZe4OOdm0MXZY+/BzuNJhg3wyY8
 QKz2J3FB6OR7BCLHCp80l50s5+Ih4F5kmOXwFKjT1D1MFRaNaPDmp9BY6TitU6hg
 zj55gP4c8x6n3alakbf972Yhgs/4oi1va8cZL+pCYWb8nPO5ldaMiT7QBBLUreQV
 BtDtC7u/AFWJ5e73+jVO
 =La1R
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v3.14 merge window:

  Resource management
    - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas)
    - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu)
    - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas)
    - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas)
    - Enforce bus address limits in resource allocation (Yinghai Lu)
    - Allocate 64-bit BARs above 4G when possible (Yinghai Lu)
    - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu)

  PCI device hotplug
    - Major rescan/remove locking update (Rafael J. Wysocki)
    - Make ioapic builtin only (not modular) (Yinghai Lu)
    - Fix release/free issues (Yinghai Lu)
    - Clean up pciehp (Bjorn Helgaas)
    - Announce pciehp slot info during enumeration (Bjorn Helgaas)

  MSI
    - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev)
    - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev)
    - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev)
    - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman)
    - Drop "irq" param from *_restore_msi_irqs() (DuanZhenzhong)

  SR-IOV
    - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao)

  Virtualization
    - Add support for save/restore of extended capabilities (Alex Williamson)
    - Add Virtual Channel to save/restore support (Alex Williamson)
    - Never treat a VF as a multifunction device (Alex Williamson)
    - Add pci_try_reset_function(), et al (Alex Williamson)

  AER
    - Ignore non-PCIe error sources (Betty Dall)
    - Support ACPI HEST error sources for domains other than 0 (Betty Dall)
    - Consolidate HEST error source parsers (Bjorn Helgaas)
    - Add a TLP header print helper (Borislav Petkov)

  Freescale i.MX6
    - Remove unnecessary code (Fabio Estevam)
    - Make reset-gpio optional (Marek Vasut)
    - Report "link up" only after link training completes (Marek Vasut)
    - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut)
    - Fix PCIe startup code (Richard Zhu)

  Marvell MVEBU
    - Remove duplicate of_clk_get_by_name() call (Andrew Lunn)
    - Drop writes to bridge Secondary Status register (Jason Gunthorpe)
    - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe)
    - Support a bridge with no IO port window (Jason Gunthorpe)
    - Use max_t() instead of max(resource_size_t,) (Jingoo Han)
    - Remove redundant of_match_ptr (Sachin Kamat)
    - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni)

  NVIDIA Tegra
    - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower)

  Renesas R-Car
    - Add runtime PM support (Valentine Barshak)
    - Fix rcar_pci_probe() return value check (Wei Yongjun)

  Synopsys DesignWare
    - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen)
    - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen)
    - Fix missing MSI IRQs (Harro Haan)
    - Add dw_pcie prefix before cfg_read/write (Pratyush Anand)
    - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand)
    - Whitespace cleanup (Jingoo Han)

  EISA
    - Call put_device() if device_register() fails (Levente Kurusa)
    - Revert EISA initialization breakage ((Bjorn Helgaas)

  Miscellaneous
    - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger)
    - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas)
    - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas)
    - Use dev_is_pci() to identify PCI devices (Yijing Wang)
    - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches)
    - Update documentation 00-INDEX (Erik Ekman)"

* tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (119 commits)
  Revert "EISA: Initialize device before its resources"
  Revert "EISA: Log device resources in dmesg"
  vfio-pci: Use pci "try" reset interface
  PCI: Check parent kobject in pci_destroy_dev()
  xen/pcifront: Use global PCI rescan-remove locking
  powerpc/eeh: Use global PCI rescan-remove locking
  PCI: Fix pci_check_and_unmask_intx() comment typos
  PCI: Add pci_try_reset_function(), pci_try_reset_slot(), pci_try_reset_bus()
  MPT / PCI: Use pci_stop_and_remove_bus_device_locked()
  platform / x86: Use global PCI rescan-remove locking
  PCI: hotplug: Use global PCI rescan-remove locking
  pcmcia: Use global PCI rescan-remove locking
  ACPI / hotplug / PCI: Use global PCI rescan-remove locking
  ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug
  PCI: Add global pci_lock_rescan_remove()
  PCI: Cleanup pci.h whitespace
  PCI: Reorder so actual code comes before stubs
  PCI/AER: Support ACPI HEST AER error sources for PCI domains other than 0
  ACPICA: Add helper macros to extract bus/segment numbers from HEST table.
  PCI: Make local functions static
  ...
2014-01-22 16:39:28 -08:00
Daniel Vetter 0d9d349d87 Merge commit origin/master into drm-intel-next
Conflicts are getting out of hand, and now we have to shuffle even
more in -next which was also shuffled in -fixes (the call for
drm_mode_config_reset needs to move yet again).

So do a proper backmerge. I wanted to wait with this for the 3.13
relaese, but alas let's just do this now.

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_pm.c

Besides the conflict around the forcewake get/put (where we chaged the
called function in -fixes and added a new parameter in -next) code all
the current conflicts are of the adjacent lines changed type.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-16 22:06:30 +01:00
Daniel Vetter 0e46ce2e7a drm/i915: fix ppgtt dump code for DEBUG_FS=n
A regression in the topic/ppgtt branch introduce in

commit 87d60b63e0
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Dec 6 14:11:29 2013 -0800

    drm/i915: Add PPGTT dumper

The issue is that we're missing the definitions for the seq_file
functions and hence compilation fails.

v2: Just include the right header instead of splattering #ifdefs all
over the place (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reported-by: Antti Koskipaa <antti.koskipaa@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-10 08:21:37 +01:00
Bjorn Helgaas 21c346075c drm/i915: Rename gtt_bus_addr to gtt_phys_addr
We're dealing with CPU physical addresses here, which may be different from
bus addresses, so rename gtt_bus_addr to gtt_phys_addr to avoid confusion.

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 11:36:43 -07:00
Chris Wilson 6f1cc99351 drm/i915: Avoid dereference past end of page array in gen8_ppgtt_insert_entries()
The bug from gen6_ppgtt_insert_entries() was replicated into
gen8_ppgtt_insert_entries(). This applies the fix for the OOPS from the
previous patch to the gen8 routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 08:24:16 +01:00
Chris Wilson cc79714fbc drm/i915: Avoid dereference past end of page array in gen6_ppgtt_insert_entries()
[   89.237347] BUG: unable to handle kernel paging request at ffff880096326000
[   89.237369] IP: [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237382] PGD 2272067 PUD 25df0e067 PMD 25de5c067 PTE 8000000096326060
[   89.237394] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[   89.237404] CPU: 1 PID: 1981 Comm: gem_concurrent_ Not tainted 3.13.0-rc4+ #639
[   89.237411] Hardware name: Intel Corporation 2012 Client Platform/Emerald Lake 2, BIOS ACRVMBY1.86C.0078.P00.1201161002 01/16/2012
[   89.237420] task: ffff88024c038030 ti: ffff88024b130000 task.ti: ffff88024b130000
[   89.237425] RIP: 0010:[<ffffffff81347227>]  [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237435] RSP: 0018:ffff88024b131ae0  EFLAGS: 00010286
[   89.237440] RAX: ffff880096325000 RBX: 0000000000000400 RCX: 0000000000001000
[   89.237445] RDX: 0000000000000200 RSI: 0000000000000001 RDI: 0000000000000010
[   89.237451] RBP: ffff88024b131b30 R08: ffff88024cc3aef0 R09: 0000000000000000
[   89.237456] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88024cc3ae00
[   89.237462] R13: ffff88024a578000 R14: 0000000000000001 R15: ffff88024a578ffc
[   89.237469] FS:  00007ff5475d8900(0000) GS:ffff88025d020000(0000) knlGS:0000000000000000
[   89.237475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.237480] CR2: ffff880096326000 CR3: 000000024d531000 CR4: 00000000001407e0
[   89.237485] Stack:
[   89.237488]  ffff880000000000 0000020000000000 ffff88024b23f2c0 0000000100000000
[   89.237499]  0000000000000001 000000000007ffff ffff8801e7bf5ac0 ffff8801e7bf5ac0
[   89.237510]  ffff88024cc3ae00 ffff880248a2ee40 ffff88024b131b58 ffffffff813455ed
[   89.237521] Call Trace:
[   89.237528]  [<ffffffff813455ed>] ppgtt_bind_vma+0x3d/0x60
[   89.237534]  [<ffffffff8133d8dc>] i915_gem_object_pin+0x55c/0x6a0
[   89.237541]  [<ffffffff8134275b>] i915_gem_execbuffer_reserve_vma.isra.14+0x5b/0x110
[   89.237548]  [<ffffffff81342a88>] i915_gem_execbuffer_reserve+0x278/0x2c0
[   89.237555]  [<ffffffff81343d29>] i915_gem_do_execbuffer.isra.22+0x699/0x1250
[   89.237562]  [<ffffffff81344d91>] ? i915_gem_execbuffer2+0x51/0x290
[   89.237569]  [<ffffffff81344de6>] i915_gem_execbuffer2+0xa6/0x290
[   89.237575]  [<ffffffff813014f2>] drm_ioctl+0x4d2/0x610
[   89.237582]  [<ffffffff81080bf1>] ? cpuacct_account_field+0xa1/0xc0
[   89.237588]  [<ffffffff81080b55>] ? cpuacct_account_field+0x5/0xc0
[   89.237597]  [<ffffffff811371c0>] do_vfs_ioctl+0x300/0x520
[   89.237603]  [<ffffffff810757a1>] ? vtime_account_user+0x91/0xa0
[   89.237610]  [<ffffffff810e40eb>] ?  context_tracking_user_exit+0x9b/0xe0
[   89.237617]  [<ffffffff81083d7d>] ? trace_hardirqs_on+0xd/0x10
[   89.237623]  [<ffffffff81137425>] SyS_ioctl+0x45/0x80
[   89.237630]  [<ffffffff815afffa>] tracesys+0xd4/0xd9
[   89.237634] Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 83 45 bc 01 49 8b 84 24 78 01 00 00 65 ff 0c 25 e0 b8 00 00 8b 55 bc <4c> 8b 2c d0 65 ff 04 25 e0 b8 00 00 49 8b 45 00 48 c1 e8 2d 48
[   89.237741] RIP  [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237749]  RSP <ffff88024b131ae0>
[   89.237753] CR2: ffff880096326000
[   89.237758] ---[ end trace 27416ba8b18d496c ]---

This bug dates back to the original introduction of the
gen6_ppgtt_insert_entries()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Dropped cc: stable since without full ppgtt there's no way
we'll access the last page directory with this function since that
range is occupied (only in the allocator) with the ppgtt pdes. Without
aliasing we can start to use that range and blow up.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 08:22:42 +01:00
Chris Wilson c0a7f81899 drm/i915: Mention when we enable the Ironlake iommu workarounds
The iommu and gfx on Ironlake do not like each other and require a
big hammer to prevent hard machine hangs. In

commit 5c0422878f
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Mon Oct 17 15:51:55 2011 -0700

    drm/i915: ILK + VT-d workaround

we added the workaround, but never emitted any debug message that it was
active. Doing so should help identify known performance regressions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-07 08:10:54 +01:00
Ben Widawsky 058840c7a0 drm/i915/bdw: Flush system agent on gen8 also
gem_gtt_cpu_tlb seems to indicate that it is needed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72869

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-01-06 11:11:04 +01:00
Ben Widawsky 87d60b63e0 drm/i915: Add PPGTT dumper
Dump the aliasing PPGTT with it. The aliasing PPGTT should actually
always be empty.

TODO: Broadwell. Since we don't yet use full PPGTT on Broadwell, not
having the dumper is okay.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:26:16 +01:00
Ben Widawsky d2ff7192f3 drm/i915: Remove extraneous mm_switch in ppgtt enable
Originally this commit message said:
Now that do_switch does the mm switch, and we always enable the aliasing
PPGTT, and contexts at the same time, there is no need to continue doing
this during PPGTT enabling.

Since originally writing the patch however, I introduced the concept of
synchronous mm switching (using MMIO). Since this is generally not
recommended in the spec (for reasons unknown), I've isolated its usage
as much as possible. As such the "extraneous" switch only ever will
occur when we have full PPGTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:24:57 +01:00
Ben Widawsky 7e0d96bc03 drm/i915: Use multiple VMs -- the point of no return
As with processes which run on the CPU, the goal of multiple VMs is to
provide process isolation. Specific to GEN, there is also the ability to
map more objects per process (2GB each instead of 2Gb-2k total).

For the most part, all the pipes have been laid, and all we need to do
is remove asserts and actually start changing address spaces with the
context switch. Since prior to this we've converted the setting of the
page tables to a streamed version, this is quite easy.

One important thing to point out (since it'd been hotly contested) is
that with this patch, every context created will have it's own address
space (provided the HW can do it).

v2: Disable BDW on rebase

NOTE: I tried to make this commit as small as possible. I needed one
place where I could "turn everything on" and that is here. It could be
split into finer commits, but I didn't really see much point.

Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:24:52 +01:00
Daniel Vetter 3d7f0f9dcc Merge commit drm-intel-fixes into topic/ppgtt
I need the tricky do_switch fix before I can merge the final piece of
the ppgtt enabling puzzle. Otherwise the conflict will be a real pain
to resolve since the do_switch hunk from -fixes must be placed at the
exact right place within a hunk in the next patch.

Conflicts:
	drivers/gpu/drm/i915/i915_gem_context.c
	drivers/gpu/drm/i915/i915_gem_execbuffer.c
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 16:23:37 +01:00
Ben Widawsky bdf4fd7ea0 drm/i915: Do aliasing PPGTT init with contexts
We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.

The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:32:14 +01:00
Ben Widawsky 80da216171 drm/i915: Restore PDEs for all VMs
In following with the old restore code, we must now restore ever PPGTT's
PDEs, since they aren't proper GEM ojbects.

v2: Rebased on BDW. Only do restore pdes for gen6 & 7

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:32 +01:00
Ben Widawsky 9f273d48aa drm/i915: Write PDEs at init instead of enable
We won't be calling enable() for all PPGTTs. We do need to write PDEs
for all PPGTTs however. By moving the writing to init (which is called
for all PPGTTs) we should accomplish this.

ADD NOTE ABOUT PDE restore

TODO: Eventually, we should allocate the page tables on demand.

v2: Rebased on BDW. Only do PDEs for pre-gen8

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:26 +01:00
Ben Widawsky c7c48dfdff drm/i915: Add VM to context
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.

To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.

Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:20 +01:00
Ben Widawsky 246cbfb5fb drm/i915: Reorganize intel_enable_ppgtt
This patch consolidates the way in which we handle the various supported
PPGTT by module parameter in addition to what the hardware supports. It
strives to make doing the right thing in the code as simple as possible,
with the USES_ macros.

I've opted to add the full PPGTT argument simply so one can see how I
intend to use this function. It will not/cannot be used until later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:31:06 +01:00
Ben Widawsky d6660add64 drm/i915: Generalize PPGTT init
Rearrange the initialization code to try to special case the aliasing
PPGTT less, and provide usable interfaces for the general case later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:29:24 +01:00
Ben Widawsky 90252e5c68 drm/i915: Flush TLBs after !RCS PP_DIR_BASE
I've found this by accident. The docs don't really come out and say you
need to do this. What the docs do tell you is you need to flush the TLBs
before you set the PP_DIR_BASE, and that the RCS will invalidate its
TLBs upon setting the new PP_DIR_BASE. It makes no such comment about
any of the other rings.

Empirically, this indeed fixes a really obvious bug whereby the batches
being sent to the blitter were not executing (we were executing the
HSWP somehow instead).

NOTE: This should make no difference with the current code. It only
applies when we start using multiple VMs.

NOTE2: HSW appears to be immune to this.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:29:13 +01:00
Ben Widawsky 48a10389c8 drm/i915: Use LRI for switching PP_DIR_BASE
The docs seem to suggest this is the appropriate method (though it
doesn't say so outright). In other words, we probably should have done
this before. We certainly must do this for switching VMs on the fly,
since synchronizing the rings to MMIO updates isn't acceptable.

v2:
Make the reset code actually work for all rings. Note that this was
fixed in subsequent commits, but was indeed broken for this commit.

Add a posting read to the reset case. It probably should have existed
before hand, but since we have no failures; there is no reason to make
it a separate commit.

Make IS_GEN6 not use the ring because I am seeing crashes when using it.
It is a bit of a hack in this patch, it will get fixed up in a couple of
patches.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:28:45 +01:00
Ben Widawsky eeb9488e75 drm/i915: Extract mm switching to function
In order to do the full context switch with address space, it's
convenient to have a way to switch the address space. We already have
this in our code - just pull it out to be called by the context switch
code later.

v2: Rebased on BDW support. Required adding BDW.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:28:33 +01:00
Ben Widawsky b4a74e3adf drm/i915: Use platform specific ppgtt enable
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:59 +01:00
Ben Widawsky e3cc19957f drm/i915: One hopeful eviction on PPGTT alloc
The patch before this changed the way in which we allocate space for the
PPGTT PDEs. It began carving out the PPGTT PDEs (which live in the
Global GTT) from the GGTT's drm_mm. Prior to that patch, the PDEs were
hidden from the drm_mm, and therefore could never fail to be allocated.

In unfortunate cases, the drm_mm may be full when we want to allocate
the space. This can technically occur whenever we try to allocate, which
happens in two places currently. Practically, it can only really ever
happen at GPU reset.

Later, when we allocate more PDEs for multiple PPGTTs this will
potentially even more useful.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:58 +01:00
Ben Widawsky c8d4c0d668 drm/i915: Use drm_mm for PPGTT PDEs
When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.

Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.

The first step here is to make the existing single PPGTT use the
allocator.

The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this current patch only the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space" The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.

The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.

v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)

v3: Use Chris' top down allocator

v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)

v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)

v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region

v7: Undo v3, so we can make the drm patch last in the series

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:57 +01:00
Ben Widawsky a3d67d2396 drm/i915: PPGTT vfuncs should take a ppgtt argument
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:56 +01:00
Ben Widawsky 6f65e29aca drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.

What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.

drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage

drm/i915: Add bind/unbind object functions to VMA

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding.  This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)

v5: Update the comment to not suck (Chris)

v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use the new vm [un]bind functions

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

At this point the original mailing list thread diverges. ie.

v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound Bos
Properly handle secure dispatch
Rebased on vma bind/unbind conversion

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: reduce vm->insert_entries() usage

FKA: drm/i915: eliminate vm->insert_entries()

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.

v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:50 +01:00
Ben Widawsky e178f7057b drm/i915: Provide PDP updates via MMIO
The initial implementation of this function used MMIO to write the PDPs.
Upon review it was determined (correctly) that the docs say to use LRI.
The issue is there are times where we want to do a synchronous write
(GPU reset).

I've tested this, and it works. I've verified with as many people as
possible that it should work.

This should fix the failing reset problems.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:43 +01:00
Dave Airlie 62a3a12667 Merge branch 'bdw-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
As promised bdw fixes come separate for now. Just a few minior things.

* 'bdw-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915/bdw: PIPE_[BC] I[ME]R moved to powerwell
  drm/i915/bdw: Limit GTT to 2GB
  drm/i915/bdw: Add comment about gen8 HWS PGA
  drm/i915/bdw: Free correct number of ppgtt pages
  drm/i915/bdw: Do gen6 style reset for gen8
  drm/i915/bdw: GEN8 backlight support
  drm/i915/bdw: Add BDW to ULT macro
2013-12-12 10:38:08 +10:00
Ben Widawsky 5ed1678206 drm/i915: Move the gtt mm takedown to cleanup
Our VM code already has a cleanup function, and this is a nice place to
put the drm_mm_takedown. This should have no functional impact, it just
leaves the unload function a bit cleaer, and is more logical IMO

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:21:27 +01:00
Ben Widawsky 686e1f6f87 drm/i915: Add a few missed bits to the mm
This should really have been added in BDW integration, as well as:

commit 93bd8649db
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Tue Jul 16 16:50:06 2013 -0700

    drm/i915: Put the mm in the parent address space

It didn't really matter before, but it will in the future.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:06:22 +01:00
Ben Widawsky d595bd4bbd drm/i915: Fix BDW PPGTT error path
When we fail for some reason on loading the PDPs, it would be wise to
disable the PPGTT in the ring registers. If we do not do this, we have
undefined results.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 09:59:05 +01:00
Chris Wilson c51e9701c4 drm/i915: Prefer setting PTE cache age to 3
We have conflicting benchmark data that suggest either age 0 or age 3 is
better. However, the earlier benchmark on which we based the switch to
age 0

(commit 0d8ff15e9a
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Jul 4 11:02:03 2013 -0700

    drm/i915/hsw: Set correct Haswell PTE encodings)

actually seems to prefer the default PTE encoding as age 3. Presumably,
this is in part due to the use of MOCS to override the PTE encodings
when appropriate.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69870
Tested-by: mengmeng.meng@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Eric Anholt <eric@anholt.net
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-25 09:50:15 +01:00
Daniel Vetter c09cd6e969 Merge branch 'backlight-rework' into drm-intel-next-queued
Pull in Jani's backlight rework branch. This was merged through a
separate branch to be able to sort out the Broadwell conflicts
properly before pulling it into the main development branch.

Conflicts:
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-15 10:02:39 +01:00
Ben Widawsky 3a2ffb65ee drm/i915/bdw: Limit GTT to 2GB
Because of the way in which we're allocating the pages for the Aliasing
PPGTT, we cannot actually successfully alloc enough space for anything
greater than 2GB.

Instead of a quick hack to fix this, we should defer until we have the
real solution in place (allocating much less contiguous space).

This wasn't found sooner because we didn't not have any systems
supporting more than a 2GB GTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:11 +01:00
Ben Widawsky 230f955f73 drm/i915/bdw: Free correct number of ppgtt pages
I am unclear how this got messed up in the shuffle, but it did.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:10 +01:00
Jesse Barnes b53c8c3577 drm/i915: drop duplicate ggtt vma list add in setup_global_gtt
Preallocated objects will already have been added to the vma_list when
creating their ggtt vma entry, and coincidentally also marked as holding
a ggtt mapping. Repeating the vma_list manipulation when setting up the
ggtt after preallocation is a recipe for an unhappy kernel.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Use the improve commit message suggest by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-13 01:05:01 +01:00
Ville Syrjälä b42218c19f drm/i915/bdw: Don't muck with gtt_size on Gen8 when PPGTT setup fails
v2: Resolve rebase conflicts and switch to gen < 8 color for GenX
checking.

v3: Rebase on top of the address space refactoring.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:49 +01:00
Ben Widawsky 28cf541543 drm/i915/bdw: unleash PPGTT
v2: Squash in fix from Ben: Set PPGTT batches as necessary

This fixes the regression in the last couple of days when we enabled
PPGTT.

v3: Squash in fixup to still use GTT for secure batches from Ville:

BDW doesn't have a separate secure vs. non-secure bit in
MI_BATCH_BUFFER_START. So for secure batches we have to simply
leave the PPGTT bit unset. Fortunately older generations (except
HSW) had similar limitations so execbuffer already creates a GTT
mapping for all secure batches.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:48 +01:00
Ben Widawsky 94e409c144 drm/i915/bdw: Implement PPGTT enable
Legacy PPGTT on GEN8 requires programming 4 PDP registers per ring.
Since all rings are using the same address space with the current code
the logic is simply to program all the tables we've setup for the PPGTT.

v2: Turn on PPGTT in GFX_MODE

v3: v2 was the wrong patch

v4: Resolve conflicts due to patch series reordering.

v5: Squash in fixup from Ben: Use LRI to write PDPs

The docs (and simulator seems to back up) suggest that we can only
program legacy PPGTT PDPs with LRI commands.

v6: Rebase around context differences conflicts.

v7: Use #defines for per ring PDPs. (Damien)

v8: Don't use typede'f private_t.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (up to v3 and v7)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky 9df15b499b drm/i915/bdw: Implement PPGTT insert
GEN8 insertion is very similar to GEN6.

v2: Rebase on top of Imre's for_each_sg_page helpers.

v3: Fixup my conversion (spotted by Ville).

v4: Rebase on top of the address space refactoring.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky 459108b8cc drm/i915/bdw: Implement PPGTT clear range
GEN8 PPGTT range clearing is very similar to GEN6 if we assume that our
PDEs are all valid, which they should be.

v2: Rebase on top of the address space refactoring.

v3: Rebase on top of the bool use_scratch addition to the clear_range interface.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky b1fe667329 drm/i915/bdw: Initialize the PDEs
The upcoming clear and insert routines will expect that PDEs all point
to valid Page Directories. Doing that lazily doesn't really buy us
anything.

The page allocation is done regardless earlier in init so it shouldn't
hurt set the PDEs.

v2: Squash in patches to implement fixed PDE write function:

- If I had done this in the first place, the bug that's going to be
  fixed in an upcoming patch would have been much easier to find.

- Use WB for PDEs.

  The PAT bit is used for page size. 2ME PDEs aren't even supported in
  BDW, so this was completely invalid. The solution is to make our
  PDEs WB+LLC instead of the pervious WB+eLLC. As far as I can guess,
  this change won't matter for performance.

  Thanks to Ville for the quick correction when discussing on IRC.

v3: Return the pde type for pde encoding (Damien)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky 37aca44ad5 drm/i915/bdw: PPGTT init & cleanup
Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.

Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.

v2: Move vtable initialization

v3: Resolve conflicts due to patch series reordering.

v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.

v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky fbe5d36e77 drm/i915/bdw: Support BDW caching
BDW caching works differently than the previous generations. Instead of
having bits in the PTE which directly control how the page is cached,
the 3 PTE bits PWT PCD and PAT provide an index into a PAT defined by
register 0x40e0. This style of caching is functionally equivalent to how
it works on HSW and before.

v2: Tiny bikeshed as discussed on internal irc.

v3: Squash in patch from Ville to mirror the x86 PAT setup more like
in arch/x86/mm/pat.c. Primarily, the 0th index will be WB, and not
uncached.

v4: Comment for reason to not use a 64b write on the PPAT.

v5: Add a FIXME comment that the caching bits in the PAT registers
might be wrong due to doc confusion.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky 94ec8f6130 drm/i915/bdw: Add GTT functions
With the PTE clarifications, the bind and clear functions can now be
added for gen8.

v2: Use for_each_sg_pages in gen8_ggtt_insert_entries.

v3: Drop dev argument to pte encode functions, upstream lost it. Also
rebase on top of the scratch page movement.

v4: Rebase on top of the new address space vfuncs.

v5: Add the bool use_scratch argument to clear_range and the bool valid argument
to the PTE encode function to follow upstream changes.

v6: Add a FIXME(BDW) about the size mismatch of the readback check
that Jon Bloomfield spotted.

v7: Squash in fixup patch from Ben for the posting read to match the
64bit ptes and so shut up the WARN.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky d31eb10e6c drm/i915/bdw: Create gen8_gtt_pte_t
With gen6 PTE type in place, pave the way for the new gen8 type.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky 63340133f3 drm/i915/bdw: Make gen8_gmch_probe
Probing gen8 is similar to gen6. To make the code cleaner and more
maintainable however we can use the probe functions to split it out.

v2: Rebased on top of update gtt probe infrastructure.

v3: Rebased on top of Kenneth' Graunke's ->pte_encode refactoring.

V4: Resolve conflicts with Ben's latest ppgtt patches, also switch to
gen < 8 testing instead of gen <= 7.

v5: Resolve conflicts with address space vfunc changes in upstream.

v6: Use 39b DMA mask. At least, for this mode, it is the correct mask.
(Imre)

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:43 +01:00
Ben Widawsky 9459d25237 drm/i915/bdw: support GMS and GGMS changes
All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:39 +01:00
Daniel Vetter 8fe6bd239a drm/i915/bdw: Disable PPGTT for now
This will be changed once the gen8 code is fully implemented.

v2: Use ENOSYS instead of ENXIO as suggested by Chris.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:35 +01:00
Daniel Vetter 7f16e5c141 Merge tag 'v3.12' into drm-intel-next
I want to merge in the new Broadwell support as a late hw enabling
pull request. But since the internal branch was based upon our
drm-intel-nightly integration branch I need to resolve all the
oustanding conflicts in drm/i915 with a backmerge to make the 60+
patches apply properly.

We'll propably have some fun because Linus will come up with a
slightly different merge solution.

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_drv.c
	drivers/gpu/drm/i915/intel_crt.c
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_dp.c
	drivers/gpu/drm/i915/intel_drv.h

All rather simple adjacent lines changed or partial backports from
-next to -fixes, with the exception of the thaw code in i915_dma.c.
That one needed a bit of shuffling to restore the intent.

Oh and the massive header file reordering in intel_drv.h is a bit
trouble. But not much.

v2: Also don't forget the fixup for the silent conflict that results
in compile fail ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-04 16:28:52 +01:00
Ben Widawsky 828c79087c drm/i915: Disable GGTT PTEs on GEN6+ suspend
Once the machine gets to a certain point in the suspend process, we
expect the GPU to be idle. If it is not, we might corrupt memory.
Empirically (with an early version of this patch) we have seen this is
not the case. We cannot currently explain why the latent GPU writes
occur.

In the technical sense, this patch is a workaround in that we have an
issue we can't explain, and the patch indirectly solves the issue.
However, it's really better than a workaround because we understand why
it works, and it really should be a safe thing to do in all cases.

The noticeable effect other than the debug messages would be an increase
in the suspend time. I have not measure how expensive it actually is.

I think it would be good to spend further time to root cause why we're
seeing these latent writes, but it shouldn't preclude preventing the
fallout.

NOTE: It should be safe (and makes some sense IMO) to also keep the
VALID bit unset on resume when we clear_range(). I've opted not to do
this as properly clearing those bits at some later point would be extra
work.

v2: Fix bugzilla link

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321
Tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-By: Todd Previte <tprevite@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:44:47 +02:00
Ben Widawsky b35b380ed4 drm/i915: Make PTE valid encoding optional
We need this to work around a corruption when the boot kernel image
loads the hibernated kernel image from swap on Haswell systems -
somehow not everything is properly shut off.

This is just the prep work, the next patch will implement the actual
workaround.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a commit message suitable for -fixes and add cc: stable]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:40:21 +02:00
Daniel Vetter a1e2265332 drm/i915: Use kcalloc more
No buffer overflows here, but better safe than sorry.

v2:
- Fixup the sizeof conversion, I've missed the pointer deref (Jani).
- Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani).
- Use kmalloc_array for the execbuf fastpath to avoid the memset
  (Chris). I've opted to leave all other conversions as-is since they
  aren't in a fastpath and dealing with cleared memory instead of
  random garbage is just generally nicer.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the contentious kmalloc_array hunk in execbuf.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-01 07:45:01 +02:00
Chris Wilson 651d794fae drm/i915: Use Write-Through cacheing for the display plane on Iris
Haswell GT3e has the unique feature of supporting Write-Through cacheing
of objects within the eLLC/LLC. The purpose of this is to enable the display
plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
that we, in theory, get the best of both worlds, perfect display and fast
access.

However, we still need to be careful as the CPU does not see the WT when
accessing the cache. In particular, this means that we need to flush the
cache lines after writing to an object through the CPU, and on
transitioning from a cached state to WT.

v2: Actually do the clflush on transition to WT, nagging by Ville.
v3: Flush the CPU cache after writes into WT objects.
v4: Rease onto LLC updates and report WT as "uncached" for
get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:38 +02:00
Chris Wilson 2c22569bba drm/i915: Update rules for writing through the LLC with the cpu
As mentioned in the previous commit, reads and writes from both the CPU
and GPU go through the LLC. This gives us coherency between the CPU and
GPU irrespective of the attribute settings either device sets. We can
use to avoid having to clflush even uncached memory.

Except for the scanout.

The scanout resides within another functional block that does not use
the LLC but reads directly from main memory. So in order to maintain
coherency with the scanout, writes to uncached memory must be flushed.
In order to optimize writes elsewhere, we start tracking whether an
framebuffer is attached to an object.

v2: Use pin_display tracking rather than fb_count (to ensure we flush
cursors as well etc) and only force the clflush along explicit writes to
the scanout paths (i.e. pin_to_display_plane and pwrite into scanout).

v3: Force the flush after hitting the slowpath in pwrite, as after
dropping the lock the object's cache domain may be invalidated. (Ville)

Based on a patch by Ville Syrjälä.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-10 11:20:49 +02:00
Chris Wilson 350ec881d9 drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge
MLC_LLC was never validated for Sandybridge and was superseded by a new
level of cacheing for the GPU in Ivybridge. Update our names to be
consistent with usage, and in the process stop setting the unwanted bit
on Sandybridge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/BUG/WARN_ON(1) bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-06 16:35:30 +02:00
Ben Widawsky 40d74980d3 drm/i915: Use ggtt_vm to save some typing
Just some small cleanups, and a rename of vm->ggtt_vm requested by
Daniel.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:10 +02:00
Ben Widawsky a70a3148b0 drm/i915: Make proper functions for VMs
Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debug and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:08 +02:00
Ben Widawsky 87a6b688cc drm/i915/hsw: Change default LLC age to 3
The default LLC age was changed:
commit 0d8ff15e9a
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Jul 4 11:02:03 2013 -0700

drm/i915/hsw: Set correct Haswell PTE encodings.

On the surface it would seem setting a default age wouldn't matter
because all GEM BOs are aged similarly, so the order in which objects
are evicted would not be subject to aging. The current working theory as
to why this caused a regression though is that LLC is a bit special in
that it is shared with the CPU. Presumably (not verified) the CPU
fetches cachelines with age 3, and therefore recently cached GPU objects
would be evicted before similar CPU object first when the LLC is full.
It stands to reason therefore that this would negatively impact CPU
bound benchmarks - but those seem to be low on the priority list.

eLLC OTOH does not have this same property as LLC. It should be used
entirely for the GPU, and so the age really shouldn't matter.
Furthermore, we have no evidence to suggest one is better than another
on eLLC. Since we've never properly supported eLLC before no, there
should be no regression. If the GPU client really wants "younger"
objects, they should use MOCS.

v2: Drop the extra #define (Chad)

v3: Actually git add

v4: Pimped commit message

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:06 +02:00
Chris Wilson 08c45263a6 drm/i915: Use the same pte_encoding for ppgtt as for gtt
The PTE layouts are the same for both ppgtt and gtt, so we can simplify
the setup for ppgtt by copying the encoding function pointer from gtt.
This prevents bugs where we update one function pointer, but forget the
other.

For instance,

commit 4d15c145a6
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jul 4 11:02:06 2013 -0700

    drm/i915: Use eLLC/LLC by default when available

only extends the gtt to use eLLC/LLC cacheing and forgets to also update
the ppgtt function pointer.

v2: Actually mention the bug being fixed (Kenneth)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-04 21:29:57 +02:00
Ben Widawsky 2f63315692 drm/i915: Create VMAs
Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
an area of part of the VM address space. A VMA is similar to the concept
in the linux mm. However, instead of representing regular memory, a VMA
is backed by a GEM BO. There may be many VMAs for a given object, one
for each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

v5: Free vma on error path (Imre)

v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
(Imre)
Fixed vma freeing in stolen preallocation (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Squash in fixup from Ben to not deref a non-existing vma in
set_cache_level, reported by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-18 08:46:13 +02:00
Ben Widawsky 93bd8649db drm/i915: Put the mm in the parent address space
Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction; but we'd hope one day this becomes
almost irrelvant.

v2: Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:23:43 +02:00
Ben Widawsky 853ba5d223 drm/i915: Move gtt and ppgtt under address space umbrella
The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
many of their characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up, it just fits in with the grand scheme of abstracting the GPU
VM operations. GGTT will usually be a special case where we either know
an object must be in the GGTT (dislay engine, workarounds, etc.).

The scratch page is left as part of the VM (even though it's currently
shared with the ppgtt code) because in the future when we have Full
PPGTT, I intend to create a separate scratch page for each.

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

v3: Properly share scratch page (Imre)
Finish commit message (Daniel, Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:21:47 +02:00
Ben Widawsky 4d15c145a6 drm/i915: Use eLLC/LLC by default when available
DRI clients really should be using MOCS to get fine grained streaming
cache controls. With that note, I *hope* that this patch doesn't improve
performance overwhelmingly, because if it does - it means there is a
problem elsewhere.

In any case, the kernel, and old userspace should get some benefit from
this, so let's do it. eLLC is always a good default, and really not
using it is the special case for MOCS.

References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 08:08:39 +02:00
Ben Widawsky 0d8ff15e9a drm/i915/hsw: Set correct Haswell PTE encodings.
The cacheability controls have changed, and the bits have been
rearranged in general.

Note that age 0 is the oldest (most likely to get evicted) and age 3
is the youngest (most likely to stick around for a bit). We've picked
0 for no reason, but atm it shouldn't matter anyway (since we don't
yet try to differentiate between different objects).

v2: Remove comments for snb/ivb cache leves, that's a separate change.

v3: Resolve conflicts due to patch series reordering.

v4: Rebased on top of Kenneth Graunke's ->pte_encode refactoring.

v5: Removed eLLC bits for separate patch.

In the internal repository this was:
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add comment about cache ages as requested by Ben provoked due
to a question from Damien.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 07:57:42 +02:00
Ben Widawsky c6cfb32567 drm/i915: Embed drm_mm_node in i915 gem obj
Embedding the node in the obj is more natural in the transition to VMAs
which will also have embedded nodes. This change also helps transition
away from put_block to remove node.

Though it's quite an uncommon occurrence, it's somewhat convenient to not
fail at bind time because we cannot allocate the node. Though in
practice there are other allocations (like the request structure) which
would probably make this point not terribly useful.

Quoting Daniel:
Note that the only difference between put_block and remove_node is
that the former fills up the preallocation cache. Which we don't need
anyway and hence is just wasted space.

v2: Clean up the stolen preallocation code.
Rebased on the reserve_node patches
renames ggtt_ stuff to gtt_ stuff
WARN_ON if the object is already bound (which doesn't mean it's in the
bound list, tricky)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:36 +02:00
Ben Widawsky edd41a870f drm/i915: Kill obj->gtt_offset
With the getters in place from the previous patch this members serves no
purpose other than saving one spare pointer chase, which will be killed
in the next patch anyway.

Moving to VMAs, this members adds unnecessary confusion since an object
may exist at different offsets in different VMs.

v2: Properly preserve the stolen offset. This code is a bit hacky but it
all goes away when we embed the drm_mm_node and removes the need for the
incorrect patch I submitted previously: "Use gtt_space->start for stolen
reservation"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:35 +02:00
Ben Widawsky f343c5f647 drm/i915: Getter/setter for object attributes
Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:34 +02:00
Ben Widawsky 338710e7af drm: Change create block to reserve node
With the previous patch we no longer actually create a node, we simply
find the correct hole and occupy it. This very well could have been
squashed with the last patch, but since I already had David's review, I
figured it's easiest to keep it distinct.

Also update the users in i915. Conveniently this is the only user of the
interface.

CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:33 +02:00
Ben Widawsky b3a070cccb drm: pre allocate node for create_block
For an upcoming patch where we introduce the i915 VMA, it's ideal to
have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
will break a bunch of code, but amongst them are 2 callers of
drm_mm_create_block(), both related to stolen memory.

It also allows us to embed the drm_mm_node into the object currently
which provides a nice transition over to the new code.

v2: Reordered to do before ripping out obj->gtt_offset.
Some minor cleanups made available because of reordering.

v3: s/continue/break on failed stolen node allocation (David)
Set obj->gtt_space on failed node allocation (David)
Only unref stolen (fix double free) on failed create_stolen (David)
Free node, and NULL it in failed create_stolen (David)
Add back accidentally removed newline (David)

CC: <dri-devel@lists.freedesktop.org>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:32 +02:00
Ben Widawsky b2f21b4dfd drm/i915: Use gtt shortform where possible
Just for compactness.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:59 +02:00
Ben Widawsky 80a74f7f9c drm/i915: Drop dev from pte_encode
The original pte_encode function needed the dev argument so we could do
platform specific handling via IS_GENX, etc. With the merging of a pte
encoding function there should never been a need to quirk away gen
specific details.

The patch doesn't do much but makes the upcoming reworks in gtt/ppgtt/mm
slightly (albeit, ever so) easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:59 +02:00
Ben Widawsky 6716724006 drm/i915: Combine scratch members into a struct
There isn't any special reason to do this other than it makes it obvious
that the two members are connected.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:58 +02:00
Ben Widawsky 84f1356058 drm/i915: Really share scratch page
A previous patch had set up the ppgtt and ggtt to use the same scratch
page, but still kept around both pointers. Kill it, it's not needed and
gets in our way for upcoming cleanups.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:57 +02:00
Ben Widawsky 6670a5a5c7 drm/i915: make PDE|PTE platform specific
Nothing outside of i915_gem_gtt.c and more specifically, the relevant
gen specific init function should need to know about number of PDEs, or
PTEs per PD. Exposing this will only lead to circumventing using the
upcoming VM abstraction.

To accomplish this, move the defines into the .c file, rename the PDE
define to be GEN6, and make the PTE count less of a magic number.

The remaining code in the global gtt setup is a bit messy, but an
upcoming patch will clean that one up.

v2: Don't hardcode number of PDEs (Daniel + Jesse)
Reworded commit message to reflect change.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:27:57 +02:00
Ben Widawsky 35c20a60c7 drm/i915: Rename the gtt_list to global_list
Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 10:51:14 +02:00
Daniel Vetter e1b73cba13 Linux 3.10-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRmpexAAoJEHm+PkMAQRiGrRIH/1uWFW38RvaCV/PXm/ia6Z+x
 BfBJfBIvPxGwb4n7aQNQlhU25xkfrPZ6szO4WiBH5/KPH3xYi2I2OZ1AzffkYqMF
 BWkPmsPK6EsTdp16zsi6JtH2aXArG4SpYA7ZamPvDkmfigHuiZg7GlL/9eHTRPNV
 P7Q8JToOrcnP8RoGgNj0uFiQeQbc62Kmoq7WuPtUhVlpQCCCknXgOJiYgz9w6Xe9
 /i79YFS8WRrzAquExT1NbIOh4ZMqB9MvuroaVWy8JDDLUyz7QUvOCe3tCDNguwgi
 FdWvU6nfkdQq5SLaWCWXDE9Rp/pL1MvfBn9vCOwFcp42aw0aQ0PgJVIXvsqufd0=
 =jgDI
 -----END PGP SIGNATURE-----

Merge tag 'v3.10-rc2' into drm-intel-next-queued

Backmerge Linux 3.10-rc2 since the various (rather trivial) conflicts
grew a bit out of hand. intel_dp.c has the only real functional
conflict since the logic changed while dev_priv->edp.bpp was moved
around.

Also squash in a whitespace fixup from Ben Widawsky for
i915_gem_gtt.c, git seems to do something pretty strange in there
(which I don't fully understand tbh).

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_dp.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-21 09:52:16 +02:00
Ben Widawsky c4ae25ecdf Revert "drm/i915: Calculate correct stolen size for GEN7+"
This reverts commit 03752f5b7b.

This revert requires a bit of explanation on how I understand things
work. Internally the architects/designers decide how the stolen encoding
works. We put it in a doc. BIOS writers take these docs and implement
it. Driver writers read the doc too, and read the value left by the BIOS
writers, and then we make magic.

The failing here is that in the docs we had[1] contained two different
definitions for this register for Gen7. (We have both a PCI register,
and an MMIO, and each of these were different). At the time [2] of
03752f5, we asked the architects what the correct value should be; but
that doesn't match the reality (BIOS) unfortunately.

So on all machines I can get my hands on, this revert is the right thing
to do. I've also worked with the product group to confirm that they
agree this revert is what we should do. People using HW made my "people"
who both write their own BIOS, and have access to our docs (Apple?).
Investigations are still ongoing about whether we need to add a list
of machines needing special handling, but this patch should be the
right thing for pretty much everyone.

[1] The docs are still wrong on this one. Now instead of two registers with
two definitions, we have one register with BOTH definitions, progress?
[2] The open source PRMs have the "wrong" definitions in chapter Volume
1 part6, section 1.1.12.

This digging was inspired by Paulo.

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Augment the patch saying that it's still a bit unclear
whether there are any machines out there with "wrong" firmware and
whether we need to add a list to handle them specially.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-07 18:59:09 +02:00
Ben Widawsky 3e30254205 drm/i915: Extract PDE writes
It also makes some sense IMO to have these two functions separate
irrespective of the number of callers.

Only the single caller for now, but that will change as we add more
PPGTTs.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:49:27 +02:00
Ben Widawsky 0a73287060 drm/i915: BUG_ON bad PPGTT offset
Because PPGTT PDEs within the GTT are calculated in cachelines
(HW guys consistency ftw) we do a divide which will wreak havoc if this
is wrong, and I know that from experience).

If/when we move to multiple PPGTTs this will have to become a WARN, and
return an error. For now however it should always be considered fatal,
and only a developer could hit it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: s/BUG/WARN]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:40:47 +02:00
Zhang, Xiong Y 43b27290dd drm/i915: correct the calculation of first_pd_entry_in_global_pt
When ppgtt is enabled, dev_priv->gtt.total has excluded the gtt space
occupied by ppgtt table in i915_gem_init_global_gtt() function. So the
calculation of first_pd_entry_in_global_pt doesn't need to subtract
I915_PPGTT_PD_ENTRIES again. Or else PPGTT directory table will be
destroyed by global gtt allocation.

This regression has been introduced in

commit a54c0c279f
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jan 24 14:45:00 2013 -0800

    drm/i915: remove intel_gtt structure

The breakage is pretty subtile since the old gtt_total_entries
included the pde range, whereas the new on did not.

Cc: stable@vger.kernel.org
Signed-off-by: Xiong Zhang<xiong.y.zhang@intel.com>
[danvet: Add regression citation and cc: stable. Thanks to Chris for
correcting my wrong guess about which commit broke things.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-27 14:07:16 +02:00
Kenneth Graunke 9119708cd4 drm/i915: Split out Haswell code from gen6_pte_encode.
Now that we have function pointers, it's cleaner to just create a new
per-platform PTE encoding function.

This should be identical in behavior to the previous code.

v2: Drop accidental inline keyword on hsw_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:44:21 +02:00
Kenneth Graunke 93c34e70eb drm/i915: Fix page table entries for Bay Trail.
On Bay Trail, bit 1 means "writeable by the GPU."  Failing to set that
means basically anything using the GPU will cause hangs.

v2: Drop accidental inline keyword on byt_pte_encode.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:44:11 +02:00
Kenneth Graunke 2d04befb94 drm/i915: Add PTE encoding function to the gtt/ppgtt vtables.
Sandybridge/Ivybridge, Bay Trail, and Haswell all have slightly
different page table entry formats.  Rather than polluting one function
with generation checks, simply use a function pointer and set up the
correct PTE encoding function at startup.

v2: Move the gen6_gtt_pte_t typedef to i915_drv.h so that the function
    pointers and implementations have identical signatures.  Also remove
    inline keyword on gen6_pte_encode.  Both suggested by Jani Nikula.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-22 11:20:15 +02:00
Ben Widawsky 6a99476180 drm/i915: Remove stale code
Looks like a some remnant from a rebase.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:20 +02:00
Ville Syrjälä a6f429a5a2 drm/i915: Configure GAM_ECOCHK appropriatly for Gen7
IVB and HSW use different encodings for the PPGTT cacheability bits in
the GAM_ECOCHK register.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:19 +02:00
Ville Syrjälä a65c2fcd00 drm/i915: Set GAC_ECO_BITS register on Gen7+
According to BSpec GAC_ECO_BITS register exists on Gen7 platforms as
well. Configure it accordingly.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:18 +02:00
Ville Syrjälä 3b9d7888df drm/i915: Add ECOBITS_SNB_BIT
GAC_ECO_BITS has a bit similar to GAM_ECOCHK's ECOCHK_SNB_BIT. Add
the define, and enable it on SNB.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:18 +02:00
Ben Widawsky b7c36d2546 drm/i915: Allow PPGTT enable to fail
I'm really not happy that we have to support this, but this will be the
simplest way to handle cases where PPGTT init can fail, which I promise
will be coming in the future.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:16 +02:00
Ben Widawsky 5963cf049a drm/i915: NULL aliasing_ppgtt on cleanup
This will allow us to carry on if we've cleaned up the PPGTT. The usage
for this is coming up - it simplifies handling a failed PPGTT init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Spill the secrets about failing ppgtt init.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:15 +02:00
Ben Widawsky 6197349bde drm/i915: Abstract PPGTT enabling
Since we've already set up a nice vtable to abstract other PPGTT
functions, also abstract the actual register programming to enable
things.

This function will probably need to change a bit as we implement real
processes.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:15 +02:00
Ben Widawsky 3ed124b21e drm/i915: Rework PPGTT init code
This rework will help if future platforms choose to be a bit different.
Should have no functional impact.

v2: Don't move around the vtable setup (Daniel)

v3: Squash in the disable-by-default patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:14 +02:00
Ben Widawsky 3eb1c005c6 drm/i915: Conditionally carve out GGTT PDE
It only works that way on GEN6 and GEN7. Let's not assume GENn will be
the same.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:14 +02:00
Ben Widawsky 1e7d12d467 drm/i915/ppgtt: Set scratch page "globally"
The PPGTT scratch page is used for all gens, and doing it in the global
part of our PPGTT setup makes the code a bit nicer.

This was in a patch submitted earlier as part of the PPGTT cleanups.
Grumpy maintainer must have missed it, and I didn't yell when
appropriate. Apologies for everyone :-)

v2: Update commit message

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:13 +02:00
Ben Widawsky c81dbe0563 drm/i915: random checkpatch fixes
There used to be other fixes in this patch but they've slowly disappeared as
other parts have been fixed.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:13 +02:00
Ben Widawsky e7c2b58b70 drm/i915: Call out GEN6 PTE specificity
We can assume that the PTE layout, and size changes for future
generations. To avoid confusion with the existing GEN6 PTE typedef, give
it a GEN6_ prefix.

v2: Fixup checkpatch warning and bikeshed commit message slightly.

v3: Rebase on top of Imre's for_each_sg_pages rework.

v4: Fixup conflicts in patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:12 +02:00
Ben Widawsky a93e41618e drm/i915: generalize pte vs. register BAR allocation
All gen6+ parts so far have 1 BAR which holds both the register space
and the GTT PTEs. Up until now, that was a 4MB BAR with half allocated
to each.

I have a strong hunch (wink, nod, wink) that future gens will also keep
a similar 50-50 split though the sizes may change. To help this along
change the code to obey the rule of half the total size instead of a
hard-coded 2MB.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:11 +02:00
Imre Deak 2db76d7c3c lib/scatterlist: sg_page_iter: support sg lists w/o backing pages
The i915 driver uses sg lists for memory without backing 'struct page'
pages, similarly to other IO memory regions, setting only the DMA
address for these. It does this, so that it can program the HW MMU
tables in a uniform way both for sg lists with and without backing pages.

Without a valid page pointer we can't call nth_page to get the current
page in __sg_page_iter_next, so add a helper that relevant users can
call separately. Also add a helper to get the DMA address of the current
page (idea from Daniel).

Convert all places in i915, to use the new API.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-27 17:13:44 +01:00
Daniel Vetter a15326a57c drm/i915: fixup pd vs pt confusion in gen6 ppgtt code
The index variable points at a page table, not a page directory or a
pde. Ben Widawsky fix this up correctly in his ppgtt cleanup, but I've
botched the job and copy&pasted the old confusion from the original
gen6 ppgtt code in

commit def886c376
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jan 24 14:44:56 2013 -0800

    drm/i915: vfuncs for ppgtt

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23 12:18:03 +01:00
Daniel Vetter 6ddc4fc70a style nit: Align function parameter continuation properly. 2013-03-23 12:18:02 +01:00
Imre Deak 6e995e231a drm/i915: use for_each_sg_page for setting up the gtt ptes
The existing gtt setup code is correct - and so doesn't need to be fixed to
handle compact dma scatter lists similarly to the previous patches. Still,
take the for_each_sg_page macro into use, to get somewhat simpler code.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23 12:17:31 +01:00
Jesse Barnes 086ddccec4 drm/i915: use gen6 stolen check on VLV
It uses the same bit definitions.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-06 20:07:03 +01:00
Ben Widawsky 41907ddc1b drm/i915: Fix gen2 mappable calculations
When I refactored the code initially, I forgot that gen2 uses a
different bar for the CPU mappable aperture. The agp-less code knows
nothing of generations less than 5, so we have to expand the gtt_probe
function to include the mappable base and end.

It was originally broken by me:
commit baa09f5fd8
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Jan 24 13:49:57 2013 -0800

    drm/i915: Add probe and remove to the gtt ops

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-02-15 10:30:38 +01:00
Changlong Xie d93c623354 drm/i915: gen6_gmch_remove can be static
Signed-off-by: Changlong Xie <changlongx.xie@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:12 +01:00
Ben Widawsky e78891ca76 drm/i915: Reclaim GTT space for failed PPGTT
When the PPGTT init fails, we may as well reuse the space that we were
reserving for the PPGTT PDEs.

This also fixes an extraneous mutex_unlock.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:08 +01:00
Ben Widawsky a54c0c279f drm/i915: remove intel_gtt structure
With the probe call in our dispatch table, we can now cut away the
last three remaining members in the intel_gtt shared struct and so
remove it completely.

v2: Rebased on top of Daniel's series

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: bikeshed commit message a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:07 +01:00
Ben Widawsky baa09f5fd8 drm/i915: Add probe and remove to the gtt ops
The idea, and much of the code came originally from:

commit 0712f0249c3148d8cf42a3703403c278590d4de5
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Jan 18 17:23:16 2013 -0800

    drm/i915: Create a vtable for i915 gtt

Daniel didn't like the color of that patch series, and so I asked him to
start something which appealed to his sense of color. The preceding
patches are those, and now this is going on top of that.

[extracted from the original commit message]

One immediately obvious thing to implement is our gmch probing. The init
function was getting massively bloated. Fundamentally, all that's needed
from GMCH probing is the GTT size, and the stolen size. It makes design
sense to put the mappable calculation in there as well, but the code
turns out a bit nicer without it (IMO)

The intel_gtt bridge thing is still here, but the subsequent patches
will finish ripping that out.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Bikeshedded one comment (GMADR is just the PCI aperture, we
use it for other things than just accessing tiled surfaces through a
linear view) and cut the newly added long lines a bit. Also one
checkpatch error.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:07 +01:00
Daniel Vetter 3440d26585 drm/i915: extract hw ppgtt setup/cleanup code
At the moment only cosmetics, but being able to initialize/cleanup
arbitrary ppgtt address spaces paves the way to have more than one of
them ... Just in case we ever get around to implementing real
per-process address spaces. Note that in that case another vfunc for
ppgtt would be beneficial though. But that can wait until the code
grows a second place which initializes ppgtts.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:06 +01:00
Daniel Vetter 960e3e429f drm/i915: pte_encode is gen6+
All the other gen6+ hw code has the gen6_ prefix, so be consistent
about it.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:06 +01:00
Daniel Vetter def886c376 drm/i915: vfuncs for ppgtt
Like for the global gtt we want a notch more flexibility here. Only
big change (besides a few tiny function parameter adjustments) was to
move gen6_ppgtt_insert_entries up (and remove _sg_ from its name, we
only have one kind of insert_entries since the last gtt cleanup).

We could also extract the platform ppgtt setup/teardown code a bit
better, but I don't care that much.

With this we have the hw details of pte writing nicely hidden away
behind a bit of abstraction. Which should pave the way for
different/multiple ppgtts (e.g. what we need for real ppgtt support).

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:05 +01:00
Daniel Vetter 7faf1ab2ff drm/i915: vfuncs for gtt_clear_range/insert_entries
We have a few too many differences here, so finally take the prepared
abstraction and run with it. A few smaller changes are required to get
things into shape:

- move i915_cache_level up since we need it in the gt funcs
- split up i915_ggtt_clear_range and move the two functions down to
  where the relevant insert_entries functions are
- adjustments to a few function parameter lists

Now we have 2 functions which deal with the gen6+ global gtt
(gen6_ggtt_ prefix) and 2 functions which deal with the legacy gtt
code in the intel-gtt.c fake agp driver (i915_ggtt_ prefix).

Init is still a bit a mess, but honestly I don't care about that.

One thing I've thought about while deciding on the exact interfaces is
a flag parameter for ->clear_range: We could use that to decide
between writing invalid pte entries or scratch pte entries. In case we
ever get around to fixing all our bugs which currently prevent us from
filling the gtt with empty ptes for the truly unused ranges ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwidawsk: Moved functions to the gtt struct]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-31 11:50:05 +01:00
Ben Widawsky 8d2e630899 drm/i915: Needs_dmar, not
The reasoning behind our code taking two paths depending upon whether or
not we may have been configured for IOMMU isn't clear to me. It should
always be safe to use the pci mapping functions as they are designed to
abstract the decision we were handling in i915.

Aside from simpler code, removing another member for the intel_gtt
struct is a nice motivation.

I ran this by Chris, and he wasn't concerned about the extra kzalloc,
and memory references vs. page_to_phys calculation in the case without
IOMMU.

v2: Update commit message

v3: Remove needs_dmar addition from Zhenyu upstream

This reverts (and then other stuff)
commit 20652097da
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date:   Thu Dec 13 23:47:47 2012 +0800

    drm/i915: Fix missed needs_dmar setting

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in follow-up fix to remove the bogus hunk which
deleted the dma_mask configuration for gen6+.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:11:12 +01:00
Ben Widawsky 9c61a32d31 drm/i915: Remove scratch page from shared
We already had a mapping in both (minus the phys_addr in AGP).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:11:11 +01:00
Ben Widawsky a81cc00c11 drm/i915: Cut out the infamous ILK w/a from AGP layer
And, move it to where the rest of the logic is.

There is some slight functionality changes. There was extra paranoid
checks in AGP code making sure we never do idle maps on gen2 parts. That
was not duplicated as the simple PCI id check should do the right thing.

v2: use IS_GEN5 && IS_MOBILE check instead. For now, this is the same as
IS_IRONLAKE_M but is more future proof. The workaround docs hint that
more than one platform may be effected, but we've never seen such a
platform in the wild. (Rodrigo, Daniel)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v1)
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:11:11 +01:00
Ben Widawsky 93d187993b drm/i915: Remove use of gtt_mappable_entries
Mappable_end, ie. size is almost always what you want as opposed to the
number of entries. Since we already have that information, we can scrap
the number of entries and only calculate it when needed.

If gtt_start is !0, this will have slightly different behavior. This
difference can only occur in DRI1, and exists when we try to kick out
the firmware fb. The new code seems like a bugfix to me.

The other case where we've changed the behavior is during init we check
the mappable region against our current known upper and lower limits
(64MB, and 512MB). This now matches the comment, and makes things more
convenient after removing gtt_mappable_entries.

Also worth noting is the setting of mappable_end is taken out of setup
because we do it earlier now in the DRI2 case and therefore need to add
that tiny hunk to support the DRI1 IOCTL.

v2: Move up mappable end to before legacy AGP init

v3: Add the dev_priv inclusion here from previous rebase error in patch
5

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: squash in fix for a printk format flag mismatch warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:09:20 +01:00
Ben Widawsky dabb7a91ae drm/i915: Remove use on gma_bus_addr on gen6+
We have enough info to not use the intel_gtt bridge stuff.

v2: Move setup of mappable_base above the legacy init stuff because we
still need that on older platforms. (Daniel)

v3: Remove the dev_priv hunk which was rebased in by accident

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:47:03 +01:00
Ben Widawsky 5d4545aef5 drm/i915: Create a gtt structure
The purpose of the gtt structure is to help isolate our gtt specific
properties from the rest of the code (in doing so it help us finish the
isolation from the AGP connection).

The following members are pulled out (and renamed):
gtt_start
gtt_total
gtt_mappable_end
gtt_mappable
gtt_base_addr
gsm

The gtt structure will serve as a nice place to put gen specific gtt
routines in upcoming patches. As far as what else I feel belongs in this
structure: it is meant to encapsulate the GTT's physical properties.
This is why I've not added fields which track various drm_mm properties,
or things like gtt_mtrr (which is itself a pretty transient field).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit messages]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:56 +01:00
Ben Widawsky 00fc2c3c53 drm/i915: Remove gtt_mappable_total
With the assertion from the previous patch in place, it should be safe
to get rid gtt_mappable_total. Keeps things saner to not have to track
the same info in two places.

In order to keep the diff as simple as possible and keep with the
existing gtt_setup semantics we opt to keep gtt_mappable_end. It's not
as consistent with the 'total' used in the previous patch, but that can
be fixed later.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit message]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:41 +01:00
Ben Widawsky 35451cb6fb drm/i915: Mappable_end can't ever be > end
Both DRI1 and DRI2 can never specify a mappable size which goes past the
GTT size.  Don't pretend otherwise.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:02 +01:00
Ben Widawsky c1fc6521ef drm/i915: Kill gtt_end
It's duplicated in the more useful gtt_total.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:27:31 +01:00
Dave Airlie b5cc6c0387 Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
- seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris
  Wilson
- some leftover kill-agp on gen6+ patches from Ben
- hotplug improvements from Damien
- clear fb when allocated from stolen, avoids dirt on the fbcon (Chris)
- Stolen mem support from Chris Wilson, one of the many steps to get to
  real fastboot support.
- Some DDI code cleanups from Paulo.
- Some refactorings around lvds and dp code.
- some random little bits&pieces

* tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits)
  drm/i915: Return the real error code from intel_set_mode()
  drm/i915: Make GSM void
  drm/i915: Move GSM mapping into dev_priv
  drm/i915: Move even more gtt code to i915_gem_gtt
  drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno
  drm/i915: Introduce i915_gem_set_seqno()
  drm/i915: Always clear semaphore mboxes on seqno wrap
  drm/i915: Initialize hardware semaphore state on ring init
  drm/i915: Introduce ring set_seqno
  drm/i915: Missed conversion to gtt_pte_t
  drm/i915: Bug on unsupported swizzled platforms
  drm/i915: BUG() if fences are used on unsupported platform
  drm/i915: fixup overlay stolen memory leak
  drm/i915: clean up PIPECONF bpc #defines
  drm/i915: add intel_dp_set_signal_levels
  drm/i915: remove leftover display.update_wm assignment
  drm/i915: check for the PCH when setting pch_transcoder
  drm/i915: Clear the stolen fb before enabling
  drm/i915: Access to snooped system memory through the GTT is incoherent
  drm/i915: Remove stale comment about intel_dp_detect()
  ...

Conflicts:
	drivers/gpu/drm/i915/intel_display.c
2013-01-17 20:34:08 +10:00
Ben Widawsky 1c45140d3d drm/i915: Make GSM void
The iomapping of the register region has historically been a uint32_t
for the obvious reason that our PTE size was always 4b. In the future
however, we cannot make this assumption.

By making the type void, it makes the upcoming pointer math we will do
much easier, and hopefully gives the compiler opportunities to warn us
when we do stupid things.

v2: Cast to __iomem, caught by Ville

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Fixup __iomem issue for real.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-20 16:32:04 +01:00
Ben Widawsky 06e5598fce drm/i915: Move GSM mapping into dev_priv
This removes an unused field from the AGP structure and moves it into
the dev_priv structure (with a slightly better name). This builds upon
the kill-agp series already merged.

GSM is a well defined term in the bspec:
GSM: Graphics Stolen Memory

GTT stolen space is defined for storage of the GFX GTT entries in
physical memory. IA can not access GSM directly , it can only access via
GTTMMADR. GT can access GSM directly or through GTTMMADR.

This is not the entire stolen space.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-20 16:28:42 +01:00
Ben Widawsky d7e5008f7c drm/i915: Move even more gtt code to i915_gem_gtt
This really should have been part of the kill agp series.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-20 16:27:35 +01:00
Ben Widawsky 079a43f67f drm/i915: Missed conversion to gtt_pte_t
commit f61c060907
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Mon Oct 22 11:44:43 2012 -0700

    drm/i915: introduce gtt_pte_t

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-18 22:32:03 +01:00
Zhenyu Wang 20652097da drm/i915: Fix missed needs_dmar setting
From Ben's AGP dependence removal change, "needs_dmar" flag has not
been properly setup for new chips using new GTT init function. This
one adds missed setting of that flag to make sure we do pci mappings
with IOMMU enabled.

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-13 21:40:24 +01:00
Chris Wilson ed2f345267 drm/i915: Avoid clearing preallocated regions from the GTT
As yet we do not do any preallocation (chicken-and-egg problem), but we
may like to preserve anything already allocated by the BIOS or grub and
reuse for own purposes after initialising the driver.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-30 23:24:49 +01:00
Ben Widawsky 2ff4aeac39 drm/i915: Fix pte updates in ggtt clear range
This bug was introduced by me:
commit e76e9aebcd
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Sun Nov 4 09:21:27 2012 -0800

    drm/i915: Stop using AGP layer for GEN6+

The existing code uses memset_io which follows memset semantics in only
guaranteeing a write of individual bytes. Since a PTE entry is 4 bytes,
this can only be correct if the scratch page address is 0.

This caused unsightly errors when we clear the range at load time,
though I'm not really sure what the heck is referencing that memory
anyway. I caught this is because I believe we have some other bug where
the display is doing reads of memory we feel should be cleared (or we
are relying on scratch pages to be a specific value).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-29 11:14:44 +01:00
Ben Widawsky b5c621584b drm/i915: Use pci_resource functions for BARs.
This was leftover crap from kill-agp. The current code is theoretically
broken for 64b bars. (I resist removing theoretically because I am too
lazy to test).

We still need to ioremap things ourselves because we want to ioremap_wc
the PTEs.

v2: Forgot to kill the tmp variable in v1

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-21 17:47:14 +01:00
Chris Wilson d640c4b09a drm/i915: Report amount of usable graphics memory in MiB
...rather than kilo-PTE.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Apply s/Usabel/usable/ bikeshed suggested by Ben Widawsky.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-13 16:10:56 +01:00
Ben Widawsky ccdf56cdb2 drm/i915: Fix sparse warnings in from AGP kill code
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:45 +01:00
Ben Widawsky 26b1ff35c8 drm/i915: Move the remaining gtt code
It's pretty much all consolidated now that we've killed AGP. We can move
the one outlier, and defines too.

(Kill some unused defines in the process)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:44 +01:00