Combining horizontal and vertical reflections gives us 180 degrees of
rotation. Both reflection modes are already supported, and thus, we just
need to mark the 180 rotation mode as supported. The 180 rotation mode is
needed for devices like Nexus 7 tablet, which have display panel mounted
upside-down.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Support horizontal reflection mode which will allow to support 180°
rotation mode when combined with the vertical reflection.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This makes the naming consistent with the DRM core.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In the function tegra_dc_probe(), when get irq failed, the function
platform_get_irq() logs an error message, so remove redundant message
here.
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
There are two PATBASE address registers, one for linear layout and other
for tiled. The driver's address registers list misses the tiled PATBASE
register.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tegra TRM documentation states that hardware should be in a default state
when power partition is turned off, i.e. reset should be asserted. This
patch adds the missing reset assertions.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The hardware documentation uses AVDD_IO_HDMI_DP and VDD_HDMI_DP_PLL to
denote the two power supplies that drive the HDMI/DP outputs of the SOR.
Use these names instead of the arbitrary AVDD_IO and VDD_PLL names that
were used previously to avoid confusion.
Signed-off-by: Thierry Reding <treding@nvidia.com>
When job hangs and there is a memory error pointing at channel's push
buffer, it is very handy to know the push buffer's state. This patch
makes the push buffer's state to be dumped into KMSG in addition to the
job's gathers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Once channel's job is hung, it dumps the channel's state into KMSG before
tearing down the offending job. If multiple channels hang at once, then
they dump messages simultaneously, making the debug info unreadable, and
thus, useless. This patch adds mutex which allows only one channel to emit
debug messages at a time.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This patch fixes gather's BO refcounting on a pinning error. Gather's BO
won't be leaked now if something goes wrong.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
We don't need to hold and pin original BOs of the gathers in a case of
enabled firewall because in this case gather's content is copied and the
copy is used by the executed job.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
SW can trigger MIPI pads calibration any time after power on
but calibration results will be latched and applied to the pads
by MIPI CAL unit only when the link is in LP-11 state and then
status register will be updated.
For CSI, trigger of pads calibration happen during CSI stream
enable where CSI receiver is kept ready prior to sensor or CSI
transmitter stream start.
So, pads may not be in LP-11 at this time and waiting for the
calibration to be done immediate after calibration start will
result in timeout.
This patch splits tegra_mipi_calibrate() and tegra_mipi_wait()
so triggering for calibration and waiting for it to complete can
happen at different stages.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Use readl_relaxed_poll_timeout() in tegra_mipi_wait() to simplify
the code.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tegra CSI driver need a separate MIPI device for each channel as
calibration of corresponding MIPI pads for each channel should
happen independently.
So, this patch updates tegra_mipi_request() API to add a device_node
pointer argument to allow creating mipi device for specific device
node rather than a device.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
To pick up the changes from:
83d31e5271 ("KVM: nVMX: fixes for preemption timer migration")
That don't entail changes in tooling.
This silences these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
b2f9f1535b ("libbpf: Fix libbpf hashmap on (I)LP32 architectures")
Silencing this warning:
Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
I'll eventually update the warning to remove the "Kernel ABI" part
and instead state libbpf when noticing that the original is at
"tools/lib/something".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Jakub Bogusz <qboosh@pld-linux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Any option macro with _SET suffix should set opt->set variable which is
not happening for OPT_CALLBACK_SET(). This is causing issues with perf
record --switch-output-event. Fix that.
Before:
# ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
--switch-output-event syscalls:sys_enter_mmap
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.297 MB perf.data (657 samples) ]
After:
$ ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
--switch-output-event syscalls:sys_enter_mmap
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144542 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144608 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144660 ]
^C[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144784 ]
[ perf record: Woken up 0 times to write data ]
[ perf record: Dump perf.data.2020061918144803 ]
[ perf record: Captured and wrote 0.419 MB perf.data.<timestamp> ]
Fixes: 636eb4d001 ("libsubcmd: Introduce OPT_CALLBACK_SET()")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200619133412.50705-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Forcefully unbinding PMU drivers during perf sampling will lead to
a kernel panic, because the perf upper-layer framework call a NULL
pointer in this situation.
To solve this issue, "suppress_bind_attrs" should be set to true, so
that bind/unbind can be disabled via sysfs and prevent unbinding PMU
drivers during perf sampling.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Although mmiowb() is concerned only with serialising MMIO writes occuring
in contexts where a spinlock is held, the call to mmiowb_set_pending()
from the MMIO write accessors can occur in preemptible contexts, such
as during driver probe() functions where ordering between CPUs is not
usually a concern, assuming that the task migration path provides the
necessary ordering guarantees.
Unfortunately, the default implementation of mmiowb_set_pending() is not
preempt-safe, as it makes use of a a per-cpu variable to track its
internal state. This has been reported to generate the following splat
on riscv:
| BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
| caller is regmap_mmio_write32le+0x1c/0x46
| CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
| Call Trace:
| walk_stackframe+0x0/0x7a
| dump_stack+0x6e/0x88
| regmap_mmio_write32le+0x18/0x46
| check_preemption_disabled+0xa4/0xaa
| regmap_mmio_write32le+0x18/0x46
| regmap_mmio_write+0x26/0x44
| regmap_write+0x28/0x48
| sifive_gpio_probe+0xc0/0x1da
Although it's possible to fix the driver in this case, other splats have
been seen from other drivers, including the infamous 8250 UART, and so
it's better to address this problem in the mmiowb core itself.
Fix mmiowb_set_pending() by using the raw_cpu_ptr() to get at the mmiowb
state and then only updating the 'mmiowb_pending' field if we are not
preemptible (i.e. we have a non-zero nesting count).
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Palmer Dabbelt <palmer@dabbelt.com>
Reported-by: Emil Renner Berthing <kernel@esmil.dk>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Link: https://lore.kernel.org/r/20200716112816.7356-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The ZynqMP DisplayPort subsystem includes a DMA engine called DPDMA with
6 DMa channels (4 for display and 2 for audio). This driver exposes the
DPDMA through the dmaengine API, to be used by audio (ALSA) and display
(DRM) drivers for the DisplayPort subsystem.
Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Tejas Upadhyay <tejasu@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://lore.kernel.org/r/20200717013337.24122-4-laurent.pinchart@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DMA engines used with displays perform 2D interleaved transfers to read
framebuffers from memory and feed the data to the display engine. As the
same framebuffer can be displayed for multiple frames, the DMA
transactions need to be repeated until a new framebuffer replaces the
current one. This feature is implemented natively by some DMA engines
that have the ability to repeat transactions and switch to a new
transaction at the end of a transfer without any race condition or frame
loss.
This patch implements support for this feature in the DMA engine API. A
new DMA_PREP_REPEAT transaction flag allows DMA clients to instruct the
DMA channel to repeat the transaction automatically until one or more
new transactions are issued on the channel (or until all active DMA
transfers are explicitly terminated with the dmaengine_terminate_*()
functions). A new DMA_REPEAT transaction type is also added for DMA
engine drivers to report their support of the DMA_PREP_REPEAT flag.
A new DMA_PREP_LOAD_EOT transaction flag is also introduced (with a
corresponding DMA_LOAD_EOT capability bit), as requested during the
review of v4. The flag instructs the DMA channel that the transaction
being queued should replace the active repeated transaction when the
latter terminates (at End Of Transaction). Not setting the flag will
result in the active repeated transaction to continue being repeated,
and the new transaction being silently ignored.
The DMA_PREP_REPEAT flag is currently supported for interleaved
transactions only. Its usage can easily be extended to cover more
transaction types simply by adding an appropriate check in the
corresponding dmaengine_prep_*() function.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20200717013337.24122-3-laurent.pinchart@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The ZynqMP includes the DisplayPort subsystem with its own DMA engine
called DPDMA. The DPDMA IP comes with 6 individual channels
(4 for display, 2 for audio). This documentation describes DT bindings
of DPDMA.
Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20200717013337.24122-2-laurent.pinchart@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
dma-buf:
- sleeping atomic fix
amdgpu:
- Fix a race condition with KIQ
- Preemption fix
- Fix handling of fake MST encoders
- OLED panel fix
- Handle allocation failure in stream construction
- Renoir SMC fix
- SDMA 5.x fix
i915:
- FBC w/a stride fix
- Fix use-after-free fix on module reload
- Ignore irq enabling on the virtual engines to fix device sleep
- Use GTT when saving/restoring engine GPR
- Fix selftest sort function
vmwgfx:
- black screen fix
aspeed:
- fbcon init warn fix
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJfERzqAAoJEAx081l5xIa+W98QAKmGH78757nmvFDDx6/d7+Cs
RqMvR6oNa0DRksRDxSMIERCbdXcZsNlF5HpETeY8wlnOKD1bi8ET4i6PuX2/m1qA
77Z+Of/FrkmwaM+5RTEX6eAoSPsrgr/XJYFF19N/0ICSfS6Njv3md5ceXkrNTuHa
dR9LQHAJReRKwkVijKmVKvX5CvVkRtWPdAWZ6t5PrhIA27tz5KyzvuvkANNAyUwb
KLxkfDzVQpbISTU8sCh/rPLTMCM02ddCzwYZGaYA3s4Qm6HGV3GOWpNukuCh+rIE
Ru4HL1mxagqvqZCt8L8srAoLpM/V8yWd0z5/WRCYu7Km7Uqzi6sUeJJtBH0zGiQF
iTdoxUD+qIm24Pvw0hXdpFy55m9xlNh9/uP5mBkAga1endT4euouhPqvI09X1RG3
dTEOypWHtA+9lEAqtJgpLdDyRPnvXRnZmmghX0ovHfL9sVJ6r6IIMrXVUFVebNec
XCBbHsngal6AcWJa+lHmDRoC0tpOdTqZj71SzizVUG3AjTf9BG/6EfE2cQn71osx
Cn2P7KfJDo+M5tqQQdvClRuVaf5KzmRKOow/KIZ6uRZDpZmaOK9dojyMVqv4UKfL
BtkcE/4PeK31P5Ycp+JxKwvHk/aGUbyK/o5LrLs/YaCGOr6+UK2E2hepZsM9Nj86
hAfg3wSeYU/WYvbJEqns
=S5ph
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2020-07-17-1' of git://anongit.freedesktop.org/drm/drm into master
Pull drm fixes from Dave Airlie:
"Weekly fixes pull, big bigger than I'd normally like, but they are
fairly scattered and small individually.
The vmwgfx one is a black screen regression, otherwise the largest is
an MST encoder fix for amdgpu which results in a WARN in some cases,
and a scattering of i915 fixes.
I'm tracking two regressions at the moment that hopefully we get
nailed down this week for rc7.
dma-buf:
- sleeping atomic fix
amdgpu:
- Fix a race condition with KIQ
- Preemption fix
- Fix handling of fake MST encoders
- OLED panel fix
- Handle allocation failure in stream construction
- Renoir SMC fix
- SDMA 5.x fix
i915:
- FBC w/a stride fix
- Fix use-after-free fix on module reload
- Ignore irq enabling on the virtual engines to fix device sleep
- Use GTT when saving/restoring engine GPR
- Fix selftest sort function
vmwgfx:
- black screen fix
aspeed:
- fbcon init warn fix"
* tag 'drm-fixes-2020-07-17-1' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu/sdma5: fix wptr overwritten in ->get_wptr()
drm/amdgpu/powerplay: Modify SMC message name for setting power profile mode
drm/amd/display: handle failed allocation during stream construction
drm/amd/display: OLED panel backlight adjust not work with external display connected
drm/amdgpu/display: create fake mst encoders ahead of time (v4)
drm/amdgpu: fix preemption unit test
drm/amdgpu/gfx10: fix race condition for kiq
drm/i915: Recalculate FBC w/a stride when needed
drm/i915: Move cec_notifier to intel_hdmi_connector_unregister, v2.
drm/i915/gt: Only swap to a random sibling once upon creation
drm/i915/gt: Ignore irq enabling on the virtual engines
drm/i915/perf: Use GTT when saving/restoring engine GPR
drm/i915/selftests: Fix compare functions provided for sorting
drm/vmwgfx: fix update of display surface when resolution changes
dmabuf: use spinlock to access dmabuf->name
drm/aspeed: Call drm_fbdev_generic_setup after drm_dev_register
Support multiple panels or bridges connected to the same DPI output of
the SoC. This setup can be found for instance on the GCW Zero, where the
same DPI output interfaces the internal 320x240 TFT panel, and the ITE
IT6610 HDMI chip.
v2: No change
v3: Allow > 80-char lines where it makes sense
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-11-paul@crapouillou.net
Add support for the Image Processing Unit (IPU) found in all Ingenic
SoCs.
The IPU can upscale and downscale a source frame of arbitrary size
ranging from 4x4 to 4096x4096 on newer SoCs, with bicubic filtering
on newer SoCs, bilinear filtering on older SoCs. Nearest-neighbour can
also be obtained with proper coefficients.
Starting from the JZ4725B, the IPU supports a mode where its output is
sent directly to the LCDC, without having to be written to RAM first.
This makes it possible to use the IPU as a DRM plane on the compatible
SoCs, and have it convert and scale anything the userspace asks for to
what's available for the display.
Regarding pixel formats, older SoCs support packed YUV 4:2:2 and various
planar YUV formats. Newer SoCs introduced support for RGB.
Since the IPU is a separate hardware block, to make it work properly the
Ingenic DRM driver will now register itself as a component master in
case the IPU driver has been enabled in the config.
When enabled in the config, the CRTC will see the IPU as a second primary
plane. It cannot be enabled at the same time as the regular primary
plane. It has the same priority, which means that it will also display
below the overlay plane.
v2: - ingenic-ipu is no longer its own module. It will be built
into the ingenic-drm module.
- If enabled in the config, both the core driver and the IPU
driver will register as components; otherwise the core
driver will bypass that and call the ingenic_drm_bind()
function directly.
- Since both files now build into the same module, the
symbols previously exported as GPL are not exported anymore,
since they are only used internally.
- Fix SPDX license header in ingenic-ipu.h
- Avoid using 'for(;;);' loops without trailing statement(s)
v3: - Pass priv structure to IRQ handler; that way we don't hardcode
the expectation that the IPU plane is at index #0.
- Rework osd_changed() to account for src_* changes
- Add multiplanar YUV 4:4:4 support
- Commit fb addresses to HW at vblank, since addr registers are
not shadow registers
- Probe IPU component later so that IPU plane is last
- Fix driver not working on IPU-less hardware
- Use IPU driver's name as the IRQ name to avoid having two
'ingenic-drm' in /proc/interrupts
- Fix IPU only working for still images on JZ4725B
- Add a bit more code comments
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-10-paul@crapouillou.net
All Ingenic SoCs starting from the JZ4725B support OSD mode.
In this mode, two separate planes can be used. They can have different
positions and sizes, and one can be overlayed on top of the other.
v2: Use fallthrough; instead of /* fall-through */
v3: - Add custom atomic_tail function to handle case where HW gives no
VBLANK
- Use regmap_set_bits() / regmap_clear_bits() when possible
- Use dma_hwdesc_f{0,1} fields in priv structure instead of array
- Use dmam_alloc_coherent() instead of dma_alloc_coherent()
- Use more meaningful 0xf0 / 0xf1 values as DMA descriptors IDs
- Add a bit more code comments
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-9-paul@crapouillou.net
If you pass a string that is not terminated with a carriage return to
dev_err(), it will eventually be printed with a carriage return, but
not right away, since the kernel will wait for a pr_cont().
v2: New patch
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-5-paul@crapouillou.net
Full rename without any modification, except to the Makefile.
Renaming ingenic-drm.c to ingenic-drm-drv.c allow to decouple the module
name from the source file name in the Makefile. This will be useful
later when more source files are added.
v2: New patch
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-4-paul@crapouillou.net
Add documentation of the Device Tree bindings for the Image Processing
Unit (IPU) found in most Ingenic SoCs.
v2: Add missing 'const' in items list
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-3-paul@crapouillou.net
Convert the ingenic,lcd.txt to a new ingenic,lcd.yaml file.
In the process, the new ingenic,jz4780-lcd compatible string has been
added.
v2: Add info about IPU at port@8
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-2-paul@crapouillou.net
While I had thought I'd tested this before, it looks like this one issue
slipped by my original CRC patches. Basically, there seem to be a few
rules we need to follow when sending CRC commands to the display
controller:
* CRCs cannot be both disabled and enabled for a single head in the same
flush
* If a head with CRC reporting enabled switches from one OR to another,
there must be a flush before the OR is re-enabled regardless of the
final state of CRC reporting.
So, split nv50_crc_atomic_prepare_notifier_contexts() into two
functions:
* nv_crc_atomic_release_notifier_contexts() - checks whether the CRC
notifier contexts were released successfully after the first flush
* nv_crc_atomic_init_notifier_contexts() - prepares any CRC notifier
contexts for use before enabling reporting
Additionally, in order to force a flush when we re-assign ORs with heads
that have CRCs enabled we split our atomic check function into two:
* nv50_crc_atomic_check_head() - called from our heads' atomic checks,
determines whether a state needs to set or clear CRC reporting
* nv50_crc_atomic_check_outp() - called at the end of the atomic check
after all ORs have been added to the atomic state, and sets
nv50_atom->flush_disable if needed
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <skeggsb@gmail.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200629223635.103804-1-lyude@redhat.com
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
While most of the functionality on Nvidia GPUs doesn't require using an
explicit handle instead of the main VRAM handle + offset, there are a
couple of places that do require explicit handles, such as CRC
functionality. Since this means we're about to add another
nouveau-chosen handle, let's just go ahead and move any hard-coded
handles into a single header. This is just to keep things slightly
organized, and to make it a little bit easier if we need to add more
handles in the future.
This patch should contain no functional changes.
Changes since v3:
* Correct SPDX license identifier (checkpatch)
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-9-lyude@redhat.com
In order to make sure that we flush disable updates at the right time
when disabling CRCs, we'll need to be able to look at the outp state to
see if we're changing it at the same time that we're disabling CRCs.
So, expose the struct in disp.h.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-8-lyude@redhat.com
While we're not quite ready yet to add support for flexible wndw
mappings, we are going to need to at least keep track of the static wndw
mappings we're currently using in each head's atomic state. We'll likely
use this in the future to implement real flexible window mapping, but
the primary reason we'll need this is for CRC support.
See: on nvidia hardware, each CRC entry in the CRC notifier dma context
has a "tag". This tag corresponds to the nth update on a specific
EVO/NvDisplay channel, which itself is referred to as the "controlling
channel". For gf119+ this can be the core channel, ovly channel, or base
channel. Since we don't expose CRC entry tags to userspace, we simply
ignore this feature and always use the core channel as the controlling
channel. Simple.
Things get a little bit more complicated on gv100+ though. GV100+ only
lets us set the controlling channel to a specific wndw channel, and that
wndw must be owned by the head that we're grabbing CRCs when we enable
CRC generation. Thus, we always need to make sure that each atomic head
state has at least one wndw that is mapped to the head, which will be
used as the controlling channel.
Note that since we don't have flexible wndw mappings yet, we don't
expect to run into any scenarios yet where we'd have a head with no
mapped wndws. When we do add support for flexible wndw mappings however,
we'll need to make sure that we handle reprogramming CRC capture if our
controlling wndw is moved to another head (and potentially reject the
new head state entirely if we can't find another available wndw to
replace it).
With that being said, nouveau currently tracks wndw visibility on heads.
It does not keep track of the actual ownership mappings, which are
(currently) statically programmed. To fix this, we introduce another
bitmask into nv50_head_atom.wndw to keep track of ownership separately
from visibility. We then introduce a nv50_head callback to handle
populating the wndw ownership map, and call it during the atomic check
phase when core->assign_windows is set to true.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-7-lyude@redhat.com
While we expose the ability to turn off hardware dithering for nouveau,
we actually make the mistake of turning it on anyway, due to
dithering_depth containing a non-zero value if our dithering depth isn't
also set to 6 bpc.
So, fix it by never enabling dithering when it's disabled.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-6-lyude@redhat.com
Currently, we modify the depth value stored in the atomic state when
performing a commit in order to workaround the fact we haven't
implemented support for depths higher then 10 yet. This isn't idempotent
though, as it will happen every atomic commit where we modify the OR
state even if the head's depth in the atomic state hasn't been modified.
Normally this wouldn't matter, since we don't modify OR state outside of
modesets, but since the CRC capture region is implemented as part of the
OR state in hardware we'll want to make sure all commits modifying OR
state are idempotent so as to avoid changing the depth unexpectedly.
So, fix this by simply not writing the reduced depth value we come up
with to the atomic state.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-5-lyude@redhat.com
Add some kind of vblank workers. The interface is similar to regular
delayed works, and is mostly based off kthread_work. It allows for
scheduling delayed works that execute once a particular vblank sequence
has passed. It also allows for accurate flushing of scheduled vblank
works - in that flushing waits for both the vblank sequence and job
execution to complete, or for the work to get cancelled - whichever
comes first.
Whatever hardware programming we do in the work must be fast (must at
least complete during the vblank or scanout period, sometimes during the
first few scanlines of the vblank). As such we use a high-priority
per-CRTC thread to accomplish this.
Changes since v7:
* Stuff drm_vblank_internal.h and drm_vblank_work_internal.h contents
into drm_internal.h
* Get rid of unnecessary spinlock in drm_crtc_vblank_on()
* Remove !vblank->worker check
* Grab vbl_lock in drm_vblank_work_schedule()
* Mention self-rearming work items in drm_vblank_work_schedule() kdocs
* Return 1 from drm_vblank_work_schedule() if the work was scheduled
successfully, 0 or error code otherwise
* Use drm_dbg_core() instead of DRM_DEV_ERROR() in
drm_vblank_work_schedule()
* Remove vblank->worker checks in drm_vblank_destroy_worker() and
drm_vblank_flush_worker()
Changes since v6:
* Get rid of ->pending and seqcounts, and implement flushing through
simpler means - danvet
* Get rid of work_lock, just use drm_device->event_lock
* Move drm_vblank_work item cleanup into drm_crtc_vblank_off() so that
we ensure that all vblank work has finished before disabling vblanks
* Add checks into drm_crtc_vblank_reset() so we yell if it gets called
while there's vblank workers active
* Grab event_lock in both drm_crtc_vblank_on()/drm_crtc_vblank_off(),
the main reason for this is so that other threads calling
drm_vblank_work_schedule() are blocked from attempting to schedule
while we're in the middle of enabling/disabling vblanks.
* Move drm_handle_vblank_works() call below drm_handle_vblank_events()
* Simplify drm_vblank_work_cancel_sync()
* Fix drm_vblank_work_cancel_sync() documentation
* Move wake_up_all() calls out of spinlock where we can. The only one I
left was the call to wake_up_all() in drm_vblank_handle_works() as
this seemed like it made more sense just living in that function
(which is all technically under lock)
* Move drm_vblank_work related functions into their own source files
* Add drm_vblank_internal.h so we can export some functions we don't
want drivers using, but that we do need to use in drm_vblank_work.c
* Add a bunch of documentation
Changes since v4:
* Get rid of kthread interfaces we tried adding and move all of the
locking into drm_vblank.c. For implementing drm_vblank_work_flush(),
we now use a wait_queue and sequence counters in order to
differentiate between multiple work item executions.
* Get rid of drm_vblank_work_cancel() - this would have been pretty
difficult to actually reimplement and it occurred to me that neither
nouveau or i915 are even planning to use this function. Since there's
also no async cancel function for most of the work interfaces in the
kernel, it seems a bit unnecessary anyway.
* Get rid of to_drm_vblank_work() since we now are also able to just
pass the struct drm_vblank_work to work item callbacks anyway
Changes since v3:
* Use our own spinlocks, don't integrate so tightly with kthread_works
Changes since v2:
* Use kthread_workers instead of reinventing the wheel.
Cc: Tejun Heo <tj@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-4-lyude@redhat.com
This got me confused for a bit while looking over this code: I had been
planning on adding some blocking function calls into this function, but
seeing the irqsave/irqrestore variants of spin_(un)lock() didn't make it
very clear whether or not that would actually be safe.
So I went ahead and reviewed every single driver in the kernel that uses
this function, and they all fall into three categories:
* Driver probe code
* ->atomic_disable() callbacks
* Legacy modesetting callbacks
All of these will be guaranteed to have IRQs enabled, which means it's
perfectly safe to block here. Just to make things a little less
confusing to others in the future, let's switch over to
spin_lock_irq()/spin_unlock_irq() to make that fact a little more
obvious.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-3-lyude@redhat.com
Since we'll be allocating resources for kthread_create_worker() in the
next commit (which could fail and require us to clean up the mess),
let's simplify the cleanup process a bit by registering a
drm_vblank_init_release() action for each drm_vblank_crtc so they're
still cleaned up if we fail to initialize one of them.
Changes since v3:
* Use drmm_add_action_or_reset() - Daniel Vetter
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-2-lyude@redhat.com
task_h_load() can return 0 in some situations like running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a 224 cores
system. The load balance doesn't handle this correctly because
env->imbalance never decreases and it will stop pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.
misfit task is the other feature that doesn't handle correctly such
situation although it's probably more difficult to face the problem
because of the smaller number of CPUs and running tasks on heterogenous
system.
We can't simply ensure that task_h_load() returns at least one because it
would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: <stable@vger.kernel.org> # v4.4+
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713123913.34205-1-grandmaster@al2klimov.de
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713124923.34282-1-grandmaster@al2klimov.de
Few fixes for issues noticed during testing:
- Two DEBUG_ATOMIC_SLEEP fixes for ti-sysc interconnect target module
driver
- A regression fix for ti-sysc no-idle handling that caused issues
compared to earlier platform data based booting
- A fix for memory leak for omap_hwmod_allocate_module
- Fix d_can driver probe for am437x
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAl8PUn4RHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXN7XRAA0PRDhYAQF9OoeWL337gf0fXRchYA42Pk
C5R5KXdkSZZOlexWCsnmXGBIfAgup2B9sBTr6kFWQW1oPAIzO2oF/ALkw0U+aMsd
A5UMKOmHJCaHwMyiHu2kpwnTFCRUFdGpTYsCxOxAjFgk8R8YdSD2xnWnDwdgJSDX
Lk2SQ+ycfRf1aDT06UaSiwn1vQAOmMOEi0SV4KpyO4eH93mRjsNB7N2aUcKeq2m/
y4JLBF486QCJjJg91YG4wHoM/4O/fIeDWH2nGW2C3BU9f6F7uDip31mNBMB8ra2R
7ijIxLnQRypB5fgCEjimQ+2wWE2Mt1NRfxaIu2Kc9dZxmjqknzEV7oFQg48bQbtQ
X7K8idTFyvdlhMZS9ajKpbbrxSGLqo53sgLWJHCNC0QBFX0clHmcWXrwZn6BNtbr
lQ/+u6lFRthN6jQzudxf3nI/W68yskmfKWnVl5TEkA8tctxyZYjl5Td6DRLCI988
f23NKqTAssgHH5MWR5jDpJrZSThlty9ZqRWLjTON4rYeqVM5vXKtieSIDpV5ZmFo
VYVdv43tg4wnKBFLWZUurVmXhVmlawzsVyAAG83XkjNmPcQCGpI4sByGtujltLGQ
Bo6EwgMo3aOi9sKMOiEviAHhZbzxjLGwqt13L3b/9NX0e9T14gKhil9WswQpSkZ0
UnB5N2rwOUU=
=ekbH
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v5.8/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes
Fixes for omaps for v5.8-rc cycle
Few fixes for issues noticed during testing:
- Two DEBUG_ATOMIC_SLEEP fixes for ti-sysc interconnect target module
driver
- A regression fix for ti-sysc no-idle handling that caused issues
compared to earlier platform data based booting
- A fix for memory leak for omap_hwmod_allocate_module
- Fix d_can driver probe for am437x
* tag 'omap-for-v5.8/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Fix dcan driver probe failed on am437x platform
ARM: OMAP2+: Fix possible memory leak in omap_hwmod_allocate_module
bus: ti-sysc: Do not disable on suspend for no-idle
bus: ti-sysc: Fix sleeping function called from invalid context for RTC quirk
bus: ti-sysc: Fix wakeirq sleeping function called from invalid context
Link: https://lore.kernel.org/r/pull-1594840100-132735@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>