The sa1111 platform is one of the two remaining users of the old Arm
specific "dmabounce" code, which is an earlier implementation of the
generic swiotlb.
Linus Walleij submitted a patch that removes dmabounce support from
the ixp4xx, and I had a look at the other user, which is the sa1111
companion chip.
Looking at how dmabounce is used, I could narrow it down to one driver
one three machines:
- dmabounce is only initialized on assabet/neponset, jornada720 and
badge4, which are the platforms that have an sa1111 and support
DMA on it.
- All three of these suffer from "erratum #7" that requires only
doing DMA to half the memory sections based on one of the address
lines, in addition, the neponset also can't DMA to the RAM that
is connected to sa1111 itself.
- the pxa lubbock machine also has sa1111, but does not support DMA
on it and does not set dmabounce.
- only the OHCI and audio devices on sa1111 support DMA, but as
there is no audio driver for this hardware, only OHCI remains.
In the OHCI code, I noticed that two other platforms already have
a local bounce buffer support in the form of the "local_mem"
allocator. Specifically, TMIO and SM501 use this on a few other ARM
boards with 16KB or 128KB of local SRAM that can be accessed from the
OHCI and from the CPU.
While this is not the same problem as on sa1111, I could not find a
reason why we can't re-use the existing implementation but replace the
physical SRAM address mapping with a locally allocated DMA buffer.
There are two main downsides:
- rather than using a dynamically sized pool, this buffer needs
to be allocated at probe time using a fixed size. Without
having any idea of what it should be, I picked a size of
64KB, which is between what the other two OHCI front-ends use
in their SRAM. If anyone has a better idea what that size
is reasonable, this can be trivially changed.
- Previously, only USB transfers to unaddressable memory needed
to go through the bounce buffer, now all of them do, which may
impact runtime performance for USB endpoints that do a lot of
transfers.
On the upside, the local_mem support uses write-combining buffers,
which should be a bit faster for transfers to the device compared to
normal uncached coherent memory as used in dmabounce.
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Christoph Hellwig <hch@lst.de>