Pull aio iopriority support from Al Viro:
"The rest of aio stuff for this cycle - Adam's aio ioprio series"
* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: aio ioprio use ioprio_check_cap ret val
fs: aio ioprio add explicit block layer dependence
fs: iomap dio set bio prio from kiocb prio
fs: blkdev set bio prio from kiocb prio
fs: Add aio iopriority support
fs: Convert kiocb rw_hint from enum to u16
block: add ioprio_check_cap function
zRam as swap is useful for small memory device. However, swap means
those pages on zram are mostly cold pages due to VM's LRU algorithm.
Especially, once init data for application are touched for launching,
they tend to be not accessed any more and finally swapped out. zRAM can
store such cold pages as compressed form but it's pointless to keep in
memory. Better idea is app developers free them directly rather than
remaining them on heap.
This patch tell us last access time of each block of zram via "cat
/sys/kernel/debug/zram/zram0/block_state".
The output is as follows,
300 75.033841 .wh
301 63.806904 s..
302 63.806919 ..h
First column is zram's block index and 3rh one represents symbol (s:
same page w: written page to backing store h: huge page) of the block
state. Second column represents usec time unit of the block was last
accessed. So above example means the 300th block is accessed at
75.033851 second and it was huge so it was written to the backing store.
Admin can leverage this information to catch cold|incompressible pages
of process with *pagemap* once part of heaps are swapped out.
I used the feature a few years ago to find memory hoggers in userspace
to notify them what memory they have wasted without touch for a long
time. With it, they could reduce unnecessary memory space. However, at
that time, I hacked up zram for the feature but now I need the feature
again so I decided it would be better to upstream rather than keeping it
alone. I hope I submit the userspace tool to use the feature soon.
[akpm@linux-foundation.org: fix i386 printk warning]
[minchan@kernel.org: use ktime_get_boottime() instead of sched_clock()]
Link: http://lkml.kernel.org/r/20180420063525.GA253739@rodete-desktop-imager.corp.google.com
[akpm@linux-foundation.org: documentation tweak]
[akpm@linux-foundation.org: fix i386 printk warning]
[minchan@kernel.org: fix compile warning]
Link: http://lkml.kernel.org/r/20180508104849.GA8209@rodete-desktop-imager.corp.google.com
[rdunlap@infradead.org: fix printk formats]
Link: http://lkml.kernel.org/r/3652ccb1-96ef-0b0b-05d1-f661d7733dcc@infradead.org
Link: http://lkml.kernel.org/r/20180416090946.63057-5-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
zRam as swap is useful for small memory device. However, swap means
those pages on zram are mostly cold pages due to VM's LRU algorithm.
Especially, once init data for application are touched for launching,
they tend to be not accessed any more and finally swapped out. zRAM can
store such cold pages as compressed form but it's pointless to keep in
memory. Better idea is app developers free them directly rather than
remaining them on heap.
This patch records last access time of each block of zram so that With
upcoming zram memory tracking, it could help userspace developers to
reduce memory footprint.
Link: http://lkml.kernel.org/r/20180416090946.63057-4-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark incompressible pages so that we could investigate who is the owner
of the incompressible pages once the page is swapped out via using
upcoming zram memory tracker feature.
With it, we could prevent such pages to be swapped out by using mlock.
Otherwise we might remove them.
This patch exposes new stat for huge pages via mm_stat.
Link: http://lkml.kernel.org/r/20180416090946.63057-3-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "zram memory tracking", v5.
zRam as swap is useful for small memory device. However, swap means
those pages on zram are mostly cold pages due to VM's LRU algorithm.
Especially, once init data for application are touched for launching,
they tend to be not accessed any more and finally swapped out. zRAM can
store such cold pages as compressed form but it's pointless to keep in
memory. As well, it's pointless to store incompressible pages to zram
so better idea is app developers manages them directly like free or
mlock rather than remaining them on heap.
This patch provides a debugfs /sys/kernel/debug/zram/zram0/block_state
to represent each block's state so admin can investigate what memory is
cold|incompressible|same page with using pagemap once the pages are
swapped out.
The output is as follows:
300 75.033841 .wh
301 63.806904 s..
302 63.806919 ..h
First column is zram's block index and 3rh one represents symbol (s:
same page w: written page to backing store h: huge page) of the block
state. Second column represents usec time unit of the block was last
accessed. So above example means the 300th block is accessed at
75.033851 second and it was huge so it was written to the backing store.
This patch (of 4):
ZRAM_ACCESS is used for locking a slot of zram so correct the name. It
is also not a common flag to indicate status of the block so move the
declare position on top of the flag. Lastly, let's move the function to
the top of source code to be able to use it easily without forward
declaration.
Link: http://lkml.kernel.org/r/20180416090946.63057-2-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJbFIrHAAoJEPfTWPspceCm2+kQAKo7o7HL30aRxJYu+gYafkuW
PV47zr3e4vhMDEzDaMsh1+V7I7bm3uS+NZu6cFbcV+N9KXFpeb4V4Hvvm5cs+OC3
WCOBi4eC1h4qnDQ3ZyySrCMN+KHYJ16pZqddEjqw+fhVudx8i+F+jz3Y4ZMDDc3q
pArKZvjKh2wEuYXUMFTjaXY46IgPt+er94OwvrhyHk+4AcA+Q/oqSfSdDahUC8jb
BVR3FV4I3NOHUaru0RbrUko13sVZSboWPCIFrlTDz8xXcJOnVHzdVS1WLFDXLHnB
O8q9cADCfa4K08kz68RxykcJiNxNvz5ChDaG0KloCFO+q1tzYRoXLsfaxyuUDg57
Zd93OFZC6hAzXdhclDFIuPET9OQIjDzwphodfKKmDsm3wtyOtydpA0o7JUEongp0
O1gQsEfYOXmQsXlo8Ot+Z7Ne/HvtGZ91JahUa/59edxQbcKaMrktoyQsQ/d1nOEL
4kXID18wPcFHWRQHYXyVuw6kbpRtQnh/U2m1eenSZ7tVQHwoe6mF3cfSf5MMseak
k8nAnmsfEvOL4Ar9ftg61GOrImaQlidxOC2A8fmY5r0Sq/ZldvIFIZizsdTTCcni
8SOTxcQowyqPf5NvMNQ8cKqqCJap3ppj4m7anZNhbypDIF2TmOWsEcXcMDn4y9on
fax14DPLo59gBRiPCn5f
=nga/
-----END PGP SIGNATURE-----
Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- clean up how we pass around gfp_t and
blk_mq_req_flags_t (Christoph)
- prepare us to defer scheduler attach (Christoph)
- clean up drivers handling of bounce buffers (Christoph)
- fix timeout handling corner cases (Christoph/Bart/Keith)
- bcache fixes (Coly)
- prep work for bcachefs and some block layer optimizations (Kent).
- convert users of bio_sets to using embedded structs (Kent).
- fixes for the BFQ io scheduler (Paolo/Davide/Filippo)
- lightnvm fixes and improvements (Matias, with contributions from Hans
and Javier)
- adding discard throttling to blk-wbt (me)
- sbitmap blk-mq-tag handling (me/Omar/Ming).
- remove the sparc jsflash block driver, acked by DaveM.
- Kyber scheduler improvement from Jianchao, making it more friendly
wrt merging.
- conversion of symbolic proc permissions to octal, from Joe Perches.
Previously the block parts were a mix of both.
- nbd fixes (Josef and Kevin Vigor)
- unify how we handle the various kinds of timestamps that the block
core and utility code uses (Omar)
- three NVMe pull requests from Keith and Christoph, bringing AEN to
feature completeness, file backed namespaces, cq/sq lock split, and
various fixes
- various little fixes and improvements all over the map
* tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits)
blk-mq: update nr_requests when switching to 'none' scheduler
block: don't use blocking queue entered for recursive bio submits
dm-crypt: fix warning in shutdown path
lightnvm: pblk: take bitmap alloc. out of critical section
lightnvm: pblk: kick writer on new flush points
lightnvm: pblk: only try to recover lines with written smeta
lightnvm: pblk: remove unnecessary bio_get/put
lightnvm: pblk: add possibility to set write buffer size manually
lightnvm: fix partial read error path
lightnvm: proper error handling for pblk_bio_add_pages
lightnvm: pblk: fix smeta write error path
lightnvm: pblk: garbage collect lines with failed writes
lightnvm: pblk: rework write error recovery path
lightnvm: pblk: remove dead function
lightnvm: pass flag on graceful teardown to targets
lightnvm: pblk: check for chunk size before allocating it
lightnvm: pblk: remove unnecessary argument
lightnvm: pblk: remove unnecessary indirection
lightnvm: pblk: return NVM_ error on failed submission
lightnvm: pblk: warn in case of corrupted write buffer
...
This is the per-I/O equivalent of the ioprio_set system call.
When IOCB_FLAG_IOPRIO is set on the iocb aio_flags field, then we set the
newly added kiocb ki_ioprio field to the value in the iocb aio_reqprio field.
This patch depends on block: add ioprio_check_cap function.
Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We already check for started commands in all callbacks, but we should
also protect against already completed commands. Do this by taking
the checks to common code.
Acked-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When a userspace client requests a NBD device be disconnected, the
DISCONNECT_REQUESTED flag is set. While this flag is set, the driver
will not inform userspace when a connection is closed.
Unfortunately the flag was never cleared, so once a disconnect was
requested the driver would thereafter never tell userspace about a
closed connection. Thus when connections failed due to timeout, no
attempt to reconnect was made and eventually the device would fail.
Fix by clearing the DISCONNECT_REQUESTED flag (and setting the
DISCONNECTED flag) once all connections are closed.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command. Fix the symbolic name to reflect that a little better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Convert the S_<FOO> symbolic permissions to their octal equivalents as
using octal and not symbolic permissions is preferred by many as more
readable.
see: https://lkml.org/lkml/2016/8/2/1945
Done with automated conversion via:
$ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace <files...>
Miscellanea:
o Wrapped modified multi-line calls to a single line where appropriate
o Realign modified multi-line calls to open parenthesis
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJbBramAAoJEPfTWPspceCm2gsP/R5p/Vo8Ml303cdgCOrw+Q+T
W/Wp6hGmNUZ7HjMlfF85rF1bqpgGoj226qIibHvKi7eYwUScDNgJwcrrWnNazbGx
Rnl/+NQ/H/38pBvvDJKjth9n0LY/O1geemhnWzYZmqmZJe3jFNTDpw5pGy4RkTpJ
4DCuXg2n81x+kDt7Nslb0dYONkrc1yUjelHS/sJKlUoPs1MvsICGsWNS+Lw0WMIj
Ls282QNs1Z6Y1gcM18UXsTNJSXQFTBmsbH3CIBukHckP2wjZrMEvgM+uxIORqwID
0DJWBSVNJMxrsyZtkMMkkz2wMjPjSvx3HjD/ULpglJD6nGSwRAiQr6XIxARSEEDL
3prQyhEeVGSioOjt5IR2Llt9QDYVhb1GJqqHYmDZux0SMCVg1Pv7mclrudpXGpzq
v2mSJqnwLofOuTFrSh6afgxClD/yh1UNKByf4Ni78PagAgNS4SjEIz8VDG4m1iNu
UgWMbM+vqpZhgIlSDsmnqFEVyGwCWwUOHGCeshQXIy9xaWbCtmaGQpNh4Tho273O
wH6F/jwCnw1BcUeiwOQi+qcwEf1z6nRTGGb3hZt7u2Tj6r8HZzNHtkRaFXsIrnvA
Vc3Y9+YzRWBVV6wzqPC5K+alvGs5ZXLj7BIxQorEkoMkNCYWiPsiBS/rtc0OYiST
b2rC+5eYzquEZ42aqIIn
=QBaF
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20180524' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Two fixes that should go into this release:
- a loop writeback error clearing fix from Jeff
- the sr sense fix from myself"
* tag 'for-linus-20180524' of git://git.kernel.dk/linux-block:
loop: clear wb_err in bd_inode when detaching backing file
sr: pass down correctly sized SCSI sense buffer
For some reason we had discard granularity set to 512 always even when
discards were disabled. Fix this by having the default be 0, and then
if we turn it on set the discard granularity to the blocksize.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add WQ_UNBOUND to the knbd-recv workqueue so we're not bound
to a single CPU that is selected at device creation time.
Signed-off-by: Dan Melnic <dmm@fb.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When a loop block device encounters a writeback error, that error will
get propagated to the bd_inode's wb_err field. If we then detach the
backing file from it, attach another and fsync it, we'll get back the
writeback error that we had from the previous backing file.
This is a bit of a grey area as POSIX doesn't cover loop devices, but it
is somewhat counterintuitive.
If we detach a backing file from the loopdev while there are still
unreported errors, take it as a sign that we're no longer interested in
the previous file, and clear out the wb_err in the loop blockdev.
Reported-and-Tested-by: Theodore Y. Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We need to make sure we don't just set the size of the bdev to 0 while
it's being used by a file system. We have the appropriate check in
nbd_bdev_reset, simply use that helper instead.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bd_invalidated is kind of a pain wrt partitions as it really only
triggers the partition rescan if it is set after bd_ops->open() runs, so
setting it when we reset the device isn't useful. We also sporadically
would still have partitions left over in some disconnect cases, so fix
this by always setting bd_invalidated on open if there's no
configuration or if we've had a disconnect action happen, that way the
partition table gets invalidated and rescanned properly.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is what the ioctl based nbd disconnect does as well. Without this
the device will just sit there and wait for the connection to go away
(or IO to occur) before the device gets torn down. Instead clear
everything up on our end so the configuration goes away as quickly as
possible.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on. We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.
Fixes: 29eaadc ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent. Fix this by
updating everything if we're connected and we change the size.
cc: stable@vger.kernel.org
Fixes: 639812a ("nbd: don't set the device size until we're connected")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This fixes a use after free bug, we shouldn't be doing disk->queue right
after we do del_gendisk(disk). Save the queue and do the cleanup after
the del_gendisk.
Fixes: c6a4759ea0 ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Variants of proc_create{,_data} that directly take a seq_file show
callback and drastically reduces the boilerplate code in the callers.
All trivial callers converted over.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Found a bug (with ASAN) where we were passing a bio to bio_copy_data()
with bi_next not NULL, when it should have been - a driver had left
bi_next set to something after calling bio_endio().
Since the normal case is only copying single bios, split out
bio_list_copy_data() to avoid more bugs like this in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Same numerical value (for now at least), but a much better documentation
of intent.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch everyone to blk_get_request_flags, and then rename
blk_get_request_flags to blk_get_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The ps3disk driver already kmaps all pages when copying from/to the
internal bounce buffer, so it can accept highmem pages just fine.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
DAC960 just sets the block bounce limit to the dma mask, which means
that the iommu or swiotlb already take care of the bounce buffering,
and the block bouncing can be removed.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
mtip32xx just sets the block bounce limit to the dma mask, which means
that the iommu or swiotlb already take care of the bounce buffering,
and the block bouncing can be removed.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Branch to the right label in the error handling path in order to keep it
logical.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
... and store num_bvecs for client code's convenience.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
This commit sets QUEUE_FLAG_NONROT and clears up QUEUE_FLAG_ADD_RANDOM
to mark the ramdisks as non-rotational device.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJa4LCXAAoJEPfTWPspceCm4DkQAMT9oc4gVTooyRdjFkUwwYQp
oPQr8cNwp1ieB9n85sCVvo8AqP+hcRrp99/sNtMUfdTRcSAumHoILfJF8NcBHWW1
6t609oi73runWbKnMvv+cTNFjwkUT30HHlIZRG5Qtn4wsy6vZAQEiv7J0z7PWays
Lz7Hc1dfu2EbUoE4Phbe/wAqkQmw5HMGCCBltA9YhaApHTudR0i0bLyEN39zNyeF
GifGkqETJ847HD00x25Uko008K0Mg6C+bkBwXBXw/E0mzO+QBG14U1WcLCw7USP2
0KwaO1/F5SiaAUFBbW6TGCAYr/0M8JH9PRqAnGQ0sFlrnsh4gYrYEhnXtzMAbJb5
AfFZxGc12XzTHcrNF+Genx81NpVCCblgEPxkZky8FXUZS6p91X0kNZmtfsaFCl7e
3rc3reORz+9FaM481kY52Acw1J3gZZwdryXS040911yWRvtdS1dk80Q0FnL1qq44
WvoXawk78x59+tcGwyC57dPmTAaFMGHiFQvx4zM5EkoBqTSTtfHkHQMeyFk3Er6U
eSl+2Cp1FWiWZo4sJsmQtmhOshvBeIENyU4H3HaovsWFcOacFyLQSks2FSPquM4G
gUsg6yADbamotdiFpbALM97cEcN4se38WuMNbXsqk3gTBHsCML52m7f8IWpgU3ZW
hWuEU++093bAHADmhG81
=335E
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20180425' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"I ended up sitting on this about a week longer than I wanted to, since
we were hashing out details with a timeout change. I've now killed
that patch, so we can flush the existing queue in due time.
This contains:
- Fix for an old regression, where entering the queue can be
disturbed by a signal to the process. This can cause spurious EIO.
Fix from Alan Jenkins.
- cdrom information leak fix from Dan.
- Trivial helper for testing queue FUA from Dave Chinner, part of his
O_DIRECT FUA series.
- Series of swim fixes from Finn that actually makes it work again.
- Loop O_DIRECT corruption fix, which caused data corruption in
production for us. From me.
- BFQ crash fix from me.
- bcache maintainer update. Michael no longer has the time to do it,
Coly has stepped up to serve as the new maintainer.
- blkcg locking fixes from Jiang Biao.
- Revert of a change from this merge window from Ming, that causes an
issue on some hardware.
- Minor clarification doc addition from Linus Walleij"
* tag 'for-linus-20180425' of git://git.kernel.dk/linux-block: (22 commits)
Revert "blk-mq: remove code for dealing with remapping queue"
block: mq: Add some minor doc for core structs
bcache: mark Coly Li as bcache maintainer
MAINTAINERS: Remove me as maintainer of bcache
blkcg: init root blkcg_gq under lock
blkcg: small fix on comment in blkcg_init_queue
blkcg: don't hold blkcg lock when deactivating policy
block: add blk_queue_fua() helper function
cdrom: information leak in cdrom_ioctl_media_changed()
bfq-iosched: ensure to clear bic/bfqq pointers when preparing request
blk-mq: start request gstate with gen 1
block/swim: Select appropriate drive on device open
block/swim: Fix IO error at end of medium
block/swim: Check drive type
block/swim: Rename macros to avoid inconsistent inverted logic
block/swim: Don't log an error message for an invalid ioctl
block/swim: Remove extra put_disk() call from error path
block/swim: Fix array bounds check
m68k/mac: Don't remap SWIM MMIO region
loop: handle short DIO reads
...
The driver supports internal and external FDD units so the floppy_open
function must not hard-code the drive location.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reading to the end of a 720K disk results in an IO error instead of EOF
because the block layer thinks the disk has 2880 sectors. (Partly this
is a result of inverted logic of the ONEMEG_MEDIA bit that's now fixed.)
Initialize the density and head count in swim_add_floppy() to agree
with the device size passed to set_capacity() during drive probe.
Call set_capacity() again upon device open, after refreshing the density
and head count values.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The SWIM chip is compatible with GCR-mode Sony 400K/800K drives but
this driver only supports MFM mode. Therefore only Sony FDHD drives
are supported. Skip incompatible drives.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The Sony drive status bits use active-low logic. The swim_readbit()
function converts that to 'C' logic for readability. Hence, the
sense of the names of the status bit macros should not be inverted.
Mostly they are correct. However, the TWOMEG_DRIVE, MFM_MODE and
TWOMEG_MEDIA macros have inverted sense (like MkLinux). Fix this
inconsistency and make the following patches less confusing.
The same problem affects swim3.c so fix that too.
No functional change.
The FDHD drive status bits are documented in sonydriv.cpp from MAME
and in swimiii.h from MkLinux.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The 'eject' shell command may send various different ioctl commands.
This leads to error messages on the console even though the FDEJECT
ioctl succeeds.
~# eject floppy
SWIM floppy_ioctl: unknown cmd 21257
SWIM floppy_ioctl: unknown cmd 1
Don't log an error message for an invalid ioctl, just do as the
swim3 driver does and return -ENOTTY.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add an option to turn off discard and write zeroes offload support to
avoid deprovisioning a fully provisioned image. When enabled, discard
requests will fail with -EOPNOTSUPP, write zeroes requests will fall
back to manually zeroing.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Hitoshi Kamei <hitoshi.kamei.xm@hitachi.com>