Currently the main difference between legacy XOR engine and newer one, is
the way the engine modes are setup (either in the descriptor or through
the controller registers). In order to be able to take into account new
generation of the XOR engine for the ARM64 SoC, we need to identify them
by type, and then depending to the type the engine setup will be
selected.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This commit adds suspend/resume support to the mv_xor driver. The
config and interrupt mask registers must be saved and restored, and
upon resume, the MBus windows configuration must also be done again.
Tested on Armada 388 GP, with a RAID 5 array, accessed before and
after a suspend to RAM cycle.
Based on work from Ofer Heifetz and Lior Amsalem.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Since commit 3e4f52e2da ("dma: mv_xor: Simplify the DMA_MEMCPY
operation"), this field is no longer used, so get rid of it.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch change the way free descriptors are marked.
Instead of having a field for descriptor in use, all the descriptors in the
all_slots list are free for use.
This simplify the allocation method and reduce the locking needed.
Signed-off-by: Lior Amsalem <alior@marvell.com>
Reviewed-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Now that we have 2 channels assigned to 2 CPUs and all requests are chained
on same channels, we need much more descriptors available to satisfy
async_tx workload.
3072 descriptors was found in our lab as the number of descriptors which
allow the async_tx stack to work without waiting for free descriptors on
submission of new requests.
Signed-off-by: Lior Amsalem <alior@marvell.com>
Reviewed-by: Nadav Haklai <nadavh@marvell.com>
Tested-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The Marvell Armada 38x SoC introduce new features to the XOR engine,
especially the fact that the engine mode (MEMCPY/XOR/PQ/etc) can be part of
the descriptor and not set through the controller registers.
This new feature allows mixing of different commands (even PQ) on the same
channel/chain without the need to stop the engine to reconfigure the engine
mode.
Refactor the driver to be able to use that new feature on the Armada 38x,
while keeping the old behaviour on the older SoCs.
Signed-off-by: Lior Amsalem <alior@marvell.com>
Reviewed-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes a bug in the XOR driver where the cleanup function can be
called and free descriptors that never been processed by the engine (which
result in data errors).
The cleanup function will free descriptors based on the ownership bit in
the descriptors.
Fixes: ff7b04796d ("dmaengine: DMA engine driver for Marvell XOR engine")
Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Ofer Heifetz <oferh@marvell.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Free Software Foundation mailing address has been moved in the past and some
of the addresses here are outdated. Remove them from file headers since the
COPYING file in the kernel sources includes it.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The driver is capable of supporting DMA_INTERRUPT by issuing a dummy 128-byte
transfer. This helps removing a poll in the async_tx stack, replacing it with
a completion interrupt.
Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The driver currently defines the USE_TIMER macro, but the timer-feature
is never used in the code. The XOR and CRC32 results are never used.
The 'unmap_xxx' fields are no longer needed, they were made obsolete
in commit: 54f8d501e8 dmaengine: remove DMA unmap from drivers.
Let's remove all this dead code.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This commit unmasks the end-of-chain interrupt and removes the
end-of-descriptor command setting on all transactions, except those
explicitly flagged with DMA_PREP_INTERRUPT.
This allows to raise an interrupt only on chain completion, instead of
on each descriptor completion, which reduces interrupt count.
Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This commit replaces the current magic numbers in the interrupt handling
with proper macros, which makes more readable and self-documenting.
While here replace the BUG() with a noisy WARN_ON(). There's no reason
to tear down the entire system for an DMA IRQ error.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Although the driver supported multiple-slot allocation, only one slot was
ever allocated for each transaction. So, given we have no users of the
multi-slot support, we can remove it and greatly simplify the code.
Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Despite requesting two memory resources, called 'base' and 'high_base', the
driver uses explicitly only the former. The latter is being used implicitly
by addressing at offset +0x200, which in practice accesses high_base.
In other words, the current driver breaks if the second memory resource
is ever place at an offset different from +0x200.
This patch fixes the above by defining the registers with the offset from
high_base, and use high_base explicitly where appropriate.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The mv_xor driver had never been used in a big-endian context, and
therefore was not using the hardware features to support such an
execution environment. The hardware provides a "descriptor swap" bit
that automatically swaps the bytes of the DMA descriptors, within
blocks of 8 bytes. This requires a different DMA descriptor layout on
big-endian systems, as well as enabling this "descriptor swap" bit.
This mechanism is exactly identical to the one already used in the
mv643xx_eth network driver and the mvneta network driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Dan Williams <djbw@fb.com>
There have never been any real users of MEMSET operations since they
have been introduced in January 2007 by commit 7405f74bad ("dmaengine:
refactor dmaengine around dma_async_tx_descriptor"). Therefore remove
support for them for now, it can be always brought back when needed.
[sebastian.hesselbarth@gmail.com: fix drivers/dma/mv_xor]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Acked-by: Dan Williams <djbw@fb.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The XOR channels on Marvell SoCs have a Window Override Control
register that allow to do some fancy things with addresses. Those
features are not used by the driver, but some U-Boot versions anyway
modify those registers.
For some reason, the U-Boot on OpenBlocks AX3-4 was setting an invalid
value in those registers when the addition 2 GB DRAM chip was plugged
into the board, causing the XOR driver to fail in using the XOR
engines.
By setting those registers to 0 during the driver initialization, we
ensure that the registers are configured according with the driver
operation model.
Thanks to Lior Amsalem <alior@marvell.com> for his help in debugging
this problem.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Even though the driver cannot be unloaded at the moment, it is still
good to properly free the IRQ handlers in the channel removal function.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The pool_size is always PAGE_SIZE, and since it is a software
configuration paramter (and not a hardware description parameter), we
cannot make it part of the Device Tree binding, so we'd better remove
it from the platform_data as well.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Now that mv_xor_device is no longer used to designate the per-channel
DMA devices, use it know to designate the XOR engine themselves
(currently composed of two XOR channels).
So, now we have the nice organization where:
- mv_xor_device represents each XOR engine in the system
- mv_xor_chan represents each XOR channel of a given XOR engine
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Even though the DMA engine infrastructure has support for multiple
channels per device, the mv_xor driver registers one DMA engine device
for each channel, because the mv_xor channels inside the same XOR
engine have different capabilities, and the DMA engine infrastructure
only allows to express capabilities at the DMA engine device level.
The mv_xor driver has therefore been registering one DMA engine device
and one DMA engine channel for each XOR channel since its introduction
in the kernel. However, it kept two separate internal structures,
mv_xor_device and mv_xor_channel, which didn't make a lot of sense
since there was a 1:1 mapping between those structures.
This patch gets rid of this duplication, and merges everything into
the mv_xor_chan structure.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The mv_xor_device structure embeds a 'struct dma_device', which is
named 'common', a not very meaningful name. Rename it to 'dmadev',
which will help avoid confusions later as we merge the mv_xor_device
and mv_xor_chan structures together.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The mv_xor_chan structure embeds a 'struct dma_chan', which is named
'common', a not very meaningful name. Rename it to 'dmachan', which
will help avoid confusions later as we merge the mv_xor_device and
mv_xor_chan structures together.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
It was only used in places where we could get the 'struct device *'
pointer through a different way.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The 'shared' word no longer makes sense in a number of places as we
renamed the 'mv_xor_shared' driver to 'mv_xor'.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Extend the XOR engine driver (currently called "mv_xor_shared") so
that XOR channels can be passed in the platform_data structure, and be
registered from there.
This will allow the users of the driver to be converted to the single
platform_driver variant of the mv_xor driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The driver currently pokes into the platform_data structure during its
normal operation to get the pool_size value. Poking into the
platform_data structure is not nice when moving to the Device Tree, so
this commit adds a new pool_size field in the mv_xor_device structure,
which gets initialized at ->probe() time. The driver then uses this
field instead of the platform_data.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Some orion platforms can gate the XOR driver clock. If the clock
exisits, unable/disable it as appropriate.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Jamie Lentin <jm@lentin.co.uk>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Every DMA engine implementation declares a last completed dma cookie
in their private dma channel structures. This is pointless, and
forces driver specific code. Move this out into the common dma_chan
structure.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
mv_xor's is_complete_cookie is only ever written to, but never read.
This is silly, remove the write-only structure member.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Drop mv_xor's use of tx_list from struct dma_async_tx_descriptor in
preparation for removal of this field.
Cc: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The XOR engine found in Marvell's SoCs and system controllers
provides XOR and DMA operation, iSCSI CRC32C calculation, memory
initialization, and memory ECC error cleanup operation support.
This driver implements the DMA engine API and supports the following
capabilities:
- memcpy
- xor
- memset
The XOR engine can be used by DMA engine clients implemented in the
kernel, one of those clients is the RAID module. In that case, I
observed 20% improvement in the raid5 write throughput, and 40%
decrease in the CPU utilization when doing array construction, those
results obtained on an 5182 running at 500Mhz.
When enabling the NET DMA client, the performance decreased, so
meanwhile it is recommended to keep this client off.
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>