This saves a little code and allows the error handling to be simplified.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Make use of the new interface provided by iov_iter, backed by a
scatter-gather list of iovecs, instead of the old interface based on
sg_iovec. Also use iov_iter_advance() instead of manual iteration.
This commit should contain only literal replacements, without
functional changes.
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
[dpark: add more description in commit message]
Signed-off-by: Dongsu Park <dongsu.park@profitbricks.com>
[hch: fixed to do a deep clone of the iov_iter, and to properly use
the iov_iter direction]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The code snippet to walk all bio_vecs and free their pages is open-coded in
way too many places, so factor it into a helper. Also convert the slightly
more complex cases in bio_kern_endio and __bio_copy_iov, where we break
the freeing out of an existing loop into a separate one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Just open code the trivial mapping from a kernel virtual address to
a bio instead of going through the complex user address mapping
machinery.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Commit 4ee5eaf4 ("block: add a queue flag for request stacking support")
introduced the concept of "STACKABLE" and blk-mq devices fit the
definition in that they establish q->request_fn. So establish
QUEUE_FLAG_STACKABLE in QUEUE_FLAG_MQ_DEFAULT.
While not strictly needed (DM _could_ just check for q->mq_ops to assume
the device is request-based), request-based DM support for blk-mq devices
benefits from the ability to consistently check for QUEUE_FLAG_STACKABLE
before allowing a device to be stacked into a request-based DM table.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
blk_mq_alloc_request() may establish REQ_MQ_INFLIGHT in addition to
incrementing the hctx->nr_active count. Any cmd_flags that are
established in the newly allocated clone request must be preserved in
addition to the cmd_flags that are later copied over from the original
request as part of blk_rq_prep_clone().
Otherwise, if REQ_MQ_INFLIGHT isn't set in the clone request the
hctx->nr_active count won't get decremented via blk_mq_free_request().
The only consumer of blk_rq_prep_clone() is request-based DM, which uses
blk_rq_init() prior to calling blk_rq_prep_clone() for the non-blk-mq
case. Given the cloned request's cmd_flags will be 0 it is safe to OR
them with the original request's cmd_flags for both the non-blk-mq and
blk-mq cases.
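A minimal user-space sketch of the flag handling described above. The
struct, the flag bit values and the helper name prep_clone_flags() are
illustrative stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the kernel's real values differ. */
#define REQ_WRITE       (1ULL << 0)
#define REQ_MQ_INFLIGHT (1ULL << 1)

/* Stand-in for struct request, reduced to the one field of interest. */
struct request_model {
	uint64_t cmd_flags;
};

/*
 * OR the original's flags into the clone instead of assigning them, so
 * that any flag the allocator already set on the clone survives.
 */
static void prep_clone_flags(struct request_model *clone,
			     const struct request_model *orig)
{
	assert(clone && orig);
	clone->cmd_flags |= orig->cmd_flags;
}
```

A clone that starts with cmd_flags == 0 ends up with exactly the
original's flags, which is why the OR is harmless for the non-blk-mq case.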
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
If the request passed to blk_insert_cloned_request() was allocated by
a blk-mq device it must be submitted using blk_mq_insert_request().
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Prepare to allow blk_rq_prep_clone() to accept clone requests that were
allocated from blk-mq request queues. As such the blk_rq_prep_clone()
caller must first initialize the clone request.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This is the blk-mq part to support tag allocation policy. The default
allocation policy isn't changed (though it's not a strict FIFO). The new
policy is round-robin for libata, but it's a best-effort implementation: if
multiple tasks are competing, the tags returned will be mixed (which is
unavoidable even with !mq, as requests from different tasks can be
mixed in the queue).
Cc: Jens Axboe <axboe@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The libata tag allocation uses a round-robin policy. The next patch will
make libata use the generic block tag allocation, so let's add a policy to
tag allocation.
Currently two policies: FIFO (default) and round-robin.
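The difference between the two policies can be sketched in user space as
follows. This is only a model: the pool layout and the names tag_pool,
tag_alloc() and tag_free() are hypothetical, and the default policy is
modelled as "search from tag 0", which is an assumption about what the
message calls FIFO.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TAGS 8

enum alloc_policy { POLICY_FIFO, POLICY_RR };

struct tag_pool {
	bool used[NR_TAGS];
	int next;                 /* search start, used by round-robin only */
	enum alloc_policy policy;
};

/* Returns a free tag, or -1 if every tag is in use. */
static int tag_alloc(struct tag_pool *p)
{
	int start = (p->policy == POLICY_RR) ? p->next : 0;

	for (int i = 0; i < NR_TAGS; i++) {
		int tag = (start + i) % NR_TAGS;

		if (!p->used[tag]) {
			p->used[tag] = true;
			/* Round-robin resumes after the tag just handed out. */
			if (p->policy == POLICY_RR)
				p->next = (tag + 1) % NR_TAGS;
			return tag;
		}
	}
	return -1;
}

static void tag_free(struct tag_pool *p, int tag)
{
	assert(tag >= 0 && tag < NR_TAGS);
	p->used[tag] = false;
}
```

With the default policy a freed tag is reused immediately; with
round-robin the allocator moves on to the next tag first.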
Cc: Jens Axboe <axboe@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
As Christoph put it:
Can we just get rid of the warnings? It's fairly annoying as devices
without partitions are perfectly fine and very useful.
Me too: I have seen this message on every VM boot for ages, on all my
devices, and would love to just remove it. For me, a partition table
is only needed for a booting BIOS, grub, and the like.
CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
blkdev_issue_zeroout() will zero a given block range. This is done by
way of explicit writing, thus provisioning or allocating the blocks on
disk.
There are use cases where the desired behavior is to zero the blocks but
unprovision them if possible. The blocks must deterministically contain
zeroes when they are subsequently read back.
This patch adds a flag to blkdev_issue_zeroout() that provides this
variant. If the discard flag is set and a block device guarantees
discard_zeroes_data we will use REQ_DISCARD to clear the block range. If
the device does not support discard_zeroes_data or if the discard
request fails we will fall back to first REQ_WRITE_SAME and then a
regular REQ_WRITE.
Also update the callers of blkdev_issue_zeroout() to reflect the new flag
and make sb_issue_zeroout() prefer the discard approach.
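The fallback order described above can be modelled roughly as below. The
device model and the names dev_model, zero_method and zeroout() are
hypothetical; the sketch only picks the method and issues no I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: which zeroing methods work on it. */
struct dev_model {
	bool discard_zeroes_data; /* discarded blocks read back as zeroes */
	bool discard_ok;          /* a discard request succeeds */
	bool write_same_ok;       /* a WRITE SAME request succeeds */
};

enum zero_method { ZERO_BY_DISCARD, ZERO_BY_WRITE_SAME, ZERO_BY_WRITE };

/* Pick the method that would zero the range, in order of preference. */
static enum zero_method zeroout(const struct dev_model *dev, bool discard)
{
	assert(dev);
	/* Discard is used only when asked for and guaranteed to zero. */
	if (discard && dev->discard_zeroes_data && dev->discard_ok)
		return ZERO_BY_DISCARD;
	if (dev->write_same_ok)
		return ZERO_BY_WRITE_SAME;
	return ZERO_BY_WRITE;     /* regular write of zeroes as last resort */
}
```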
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Hi,
If you can manage to submit an async write as the first async I/O from
the context of a process with realtime scheduling priority, then a
cfq_queue is allocated, but filed into the wrong async_cfqq bucket. It
ends up in the best effort array, but actually has realtime I/O
scheduling priority set in cfqq->ioprio.
The reason is that cfq_get_queue assumes the default scheduling class and
priority when there is no information present (i.e. when the async cfqq
is created):
    static struct cfq_queue *
    cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
                  struct bio *bio, gfp_t gfp_mask)
    {
            const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
            const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
cic->ioprio starts out as 0, which is "invalid". So, class of 0
(IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:
    async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
    static struct cfq_queue **
    cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
    {
            switch (ioprio_class) {
            case IOPRIO_CLASS_RT:
                    return &cfqd->async_cfqq[0][ioprio];
            case IOPRIO_CLASS_NONE:
                    ioprio = IOPRIO_NORM;
                    /* fall through */
            case IOPRIO_CLASS_BE:
                    return &cfqd->async_cfqq[1][ioprio];
            case IOPRIO_CLASS_IDLE:
                    return &cfqd->async_idle_cfqq;
            default:
                    BUG();
            }
    }
Here, instead of returning a class mapped from the process' scheduling
priority, we get back the bucket associated with IOPRIO_CLASS_BE.
Now, there is no queue allocated there yet, so we create it:
    cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
That function ends up doing this:
    cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
    cfq_init_prio_data(cfqq, cic);
cfq_init_cfqq marks the priority as having changed. Then,
cfq_init_prio_data does this:
    ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
    switch (ioprio_class) {
    default:
            printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
    case IOPRIO_CLASS_NONE:
            /*
             * no prio set, inherit CPU scheduling settings
             */
            cfqq->ioprio = task_nice_ioprio(tsk);
            cfqq->ioprio_class = task_nice_ioclass(tsk);
            break;
So we basically have two code paths that treat IOPRIO_CLASS_NONE
differently, which results in an RT async cfqq filed into a best effort
bucket.
Attached is a patch which fixes the problem. I'm not sure how to make
it cleaner. Suggestions would be welcome.
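The mismatch can be reproduced in a small user-space model. This is a
sketch only, not the attached patch: the enum values and the functions
bucket_for_class() and async_bucket() are hypothetical, and the fix shown
(resolving the class from the task before picking the bucket) is one
possible way to make the two paths agree.

```c
#include <assert.h>

enum { CLASS_NONE, CLASS_RT, CLASS_BE, CLASS_IDLE };
enum bucket { BUCKET_RT, BUCKET_BE, BUCKET_IDLE };

/* Mirrors the bucket switch quoted above: NONE lands in best effort. */
static enum bucket bucket_for_class(int ioprio_class)
{
	assert(ioprio_class >= CLASS_NONE && ioprio_class <= CLASS_IDLE);
	switch (ioprio_class) {
	case CLASS_RT:
		return BUCKET_RT;
	case CLASS_IDLE:
		return BUCKET_IDLE;
	case CLASS_NONE:
	case CLASS_BE:
	default:
		return BUCKET_BE;
	}
}

/*
 * Resolve "no class set" from the task's scheduling class first, the way
 * the priority-init path does, and only then choose the bucket.
 */
static enum bucket async_bucket(int cic_class, int task_class)
{
	int cls = (cic_class == CLASS_NONE) ? task_class : cic_class;

	return bucket_for_class(cls);
}
```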
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
The blk-mq tagging tries to maintain some locality between CPUs and
the tags issued. The tags are split into groups of words, and the
words may not be fully populated. When searching for a new free tag,
blk-mq may look at partial words, hence it passes in an offset/size
to find_next_zero_bit(). However, it does that wrong: the size must
always be the full length of the number of tags in that word,
otherwise we'll potentially miss some near the end.
Another issue is when __bt_get() goes from one word set to the next.
It bumps the index, but not the last_tag associated with the
previous index. Bump that to be in the range of the new word.
Finally, clean up __bt_get() and __bt_get_word() a bit and get
rid of the goto in there, and the unnecessary 'wrap' variable.
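Why the size argument matters can be shown with a simplified stand-in for
the bit search. next_zero_bit() below is a hypothetical single-word
re-creation, not the kernel helper, and the shrunken size in the usage
note is an assumed example of a wrong call rather than the exact blk-mq
code.

```c
#include <assert.h>

/*
 * Scan bits [offset, size) of one word for a clear bit.
 * Returns the bit index, or size if every bit in the range is set.
 */
static unsigned int next_zero_bit(unsigned long word, unsigned int size,
				  unsigned int offset)
{
	assert(size <= 8 * sizeof(word));
	for (unsigned int bit = offset; bit < size; bit++)
		if (!(word & (1UL << bit)))
			return bit;
	return size;
}
```

With a word of 8 tags where bits 0-5 are taken (0x3f) and a search that
starts at bit 4, passing the full size 8 finds tag 6, while passing a size
of 4 scans an empty range and reports no free tag even though tags 6 and
7 are available.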
Signed-off-by: Jens Axboe <axboe@fb.com>
In order to support accesses to larger chunks of memory, pass in a
'size' parameter (counted in bytes), and return the amount available at
that address.
Add a new helper function, bdev_direct_access(), to handle common
functionality including partition handling, checking the length requested
is positive, checking for the sector being page-aligned, and checking
the length of the request does not pass the end of the partition.
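The checks listed above could look roughly like this in a user-space
model. The names part_model and direct_access_check(), the -1 error
return and the fixed 4096-byte page size are assumptions made for
illustration; the sketch models only the validation and the partition
offset, not the amount-available return value.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BYTES 512u
#define PAGE_BYTES   4096u

/* Hypothetical partition model; both fields are in 512-byte sectors. */
struct part_model {
	uint64_t start_sect;
	uint64_t nr_sects;
};

/*
 * Validate a request of 'size' bytes at partition-relative 'sector'.
 * On success returns 0 and stores the whole-device sector in *abs_sector.
 */
static int direct_access_check(const struct part_model *part, uint64_t sector,
			       long size, uint64_t *abs_sector)
{
	uint64_t part_bytes, offset;

	assert(part && abs_sector);
	if (size <= 0)
		return -1;        /* length must be positive */
	if (sector % (PAGE_BYTES / SECTOR_BYTES))
		return -1;        /* sector must be page-aligned */
	part_bytes = part->nr_sects * SECTOR_BYTES;
	offset = sector * SECTOR_BYTES;
	if (offset >= part_bytes || (uint64_t)size > part_bytes - offset)
		return -1;        /* must not pass the end of the partition */
	*abs_sector = part->start_sect + sector;
	return 0;
}
```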
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Boaz Harrosh <boaz@plexistor.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Commit b4c6a02877 exported the start and unfreeze, but we need
the regular blk_mq_freeze_queue() for the loop conversion.
Signed-off-by: Jens Axboe <axboe@fb.com>
Check IS_ERR_OR_NULL(return value) instead of just return value.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reduced to IS_ERR() by me, we never return NULL.
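For readers unfamiliar with the convention, here is a user-space
re-creation of the error-pointer helpers. It is a sketch that follows the
kernel's well-known scheme (errno values encoded in the top 4095
addresses); the definitions are written for illustration and are not
copied from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for encoded errors; NULL is NOT an error pointer. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

Because IS_ERR(NULL) is false, IS_ERR() alone is the right check for a
function that reports failure only through error pointers.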
Signed-off-by: Jens Axboe <axboe@fb.com>
If the queue is dying, we can't expect new requests to complete and come
in and wake up other tasks waiting for requests. So after we
have marked it as dying, wake up everybody currently waiting
for a request. Once they wake, they will retry their allocation
and fail appropriately due to the state of the queue.
Tested-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"The important fixes are for two bugs introduced by the merge window.
On top of this, add a couple of WARN_ONs and stop spamming dmesg on
pretty much every boot of a virtual machine"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: warn on more invariant breakage
kvm: fix sorting of memslots with base_gfn == 0
kvm: x86: drop severity of "generation wraparound" message
kvm: x86: vmx: reorder some msr writing
Pull vfs fix from Al Viro:
"An embarrassing bug in lustre patches from this cycle ;-/"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[regression] braino in "lustre: use is_root_inode()"
Modifying a non-existent slot is not allowed. Also check that the
first loop doesn't move a deleted slot beyond the used part of
the mslots array.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before commit 0e60b0799f (kvm: change memslot sorting rule from size
to GFN, 2014-12-01), the memslots' sorting key was npages, meaning
that a valid memslot couldn't have its sorting key equal to zero.
On the other hand, a valid memslot can have base_gfn == 0, and invalid
memslots are identified by base_gfn == npages == 0.
Because of this, commit 0e60b0799f broke the invariant that invalid
memslots are at the end of the mslots array. When a memslot with
base_gfn == 0 was created, any invalid memslots before it were left
in place.
This can be fixed by changing the insertion to use a ">=" comparison
instead of "<=", but some care is needed to avoid breaking the case
of deleting a memslot; see the comment in update_memslots.
Thanks to Tiejun Chen for posting an initial patch for this bug.
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge tag 'sound-3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a couple of fixes for the new Intel Skylake HD-audio support"
* tag 'sound-3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda_intel: apply the Seperate stream_tag for Skylake
ALSA: hda_controller: Separate stream_tag for input and output streams.
Since most virtual machines raise this message once, it is a bit annoying.
Make it KERN_DEBUG severity.
Cc: stable@vger.kernel.org
Fixes: 7a2e8aaf0f
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 34a1cd60d1, "x86: vmx: move some vmx setting from
vmx_init() to hardware_setup()", tried to refactor some code
specific to vmx hardware setup into hardware_setup(), but some
MSR writes must depend on settings established earlier, such as
enable_apicv, enable_ept and so on.
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In one of the places (ll_md_blocking_ast()) we had open-coded
!is_root_inode(inode) and replaced it with is_root_inode(inode).
See the last chunk of f76c23:
- inode != inode->i_sb->s_root->d_inode)
+ is_root_inode(inode))
should've been
+ !is_root_inode(inode))
obviously...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull parisc build fix from Helge Deller:
"This unbreaks the kernel compilation on parisc with gcc-4.9"
* 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix out-of-register compiler error in ldcw inline assembler function
The __ldcw macro has a problem when its argument needs to be reloaded from
memory. The output memory operand and the input register operand both need to
be reloaded using a register in class R1_REGS when generating 64-bit code.
This fails because there's only a single register in the class. Instead, use a
memory clobber. This also makes the __ldcw macro a compiler memory barrier.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> [3.13+]
Signed-off-by: Helge Deller <deller@gmx.de>
The total number of input and output streams on Skylake
exceeds 15, which causes some streams not to work because
of an overflow of the SDxCTL.STRM field when the legacy
stream tag allocation method is used.
This patch uses the new stream tag allocation method by adding
the flag AZX_DCAPS_SEPARATE_STREAM_TAG for the Skylake platform.
Signed-off-by: Libin Yang <libin.yang@intel.com>
Reviewed-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Implement separate stream_tag assignment for input and output streams.
According to the HDA specification, a stream tag must be unique within the
input stream group; however, an output stream may use a stream tag
which is already in use by an input stream. This change is necessary
to support hardware which provides more than 15 stream DMA engines in
total, which with the legacy implementation causes an overflow of the
SDxCTL.STRM field (and the whole SDxCTL register) and, as a result, use of
the reserved value 0 in the SDxCTL.STRM field, which confuses the HDA
controller.
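Per-direction tag assignment can be sketched as below. The names
tag_alloc and stream_tag_get() and the bitmask bookkeeping are
hypothetical and not the driver's code; the sketch only relies on the
facts stated above, namely that valid tags are 1..15 and that 0 is
reserved.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_STREAM_TAG 15 /* highest tag that fits SDxCTL.STRM; 0 is reserved */

enum stream_dir { DIR_INPUT, DIR_OUTPUT, DIR_COUNT };

struct tag_alloc {
	uint16_t in_use[DIR_COUNT]; /* bit n set: tag n taken in that direction */
};

/* Returns a tag in 1..15 unique within 'dir', or 0 if none is left. */
static unsigned int stream_tag_get(struct tag_alloc *a, enum stream_dir dir)
{
	assert(a && dir < DIR_COUNT);
	for (unsigned int tag = 1; tag <= MAX_STREAM_TAG; tag++) {
		if (!(a->in_use[dir] & (1u << tag))) {
			a->in_use[dir] |= (uint16_t)(1u << tag);
			return tag;
		}
	}
	return 0;
}
```

Because each direction has its own mask, an output stream can receive
tag 1 even when all 15 input tags are already taken.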
Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com>
Signed-off-by: Jayachandran B <jayachandran.b@intel.com>
Signed-off-by: Libin Yang <libin.yang@intel.com>
Reviewed-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull drm fixes from Dave Airlie:
"Xmas fixes pull:
core:
one atomic fix, revert the WARN_ON dumb buffers patch.
agp:
fixup Dave J.
nouveau:
fix 3.18 regression for old userspace
tegra fixes:
vblank and iommu fixes
amdkfd:
fix bugs shown by testing with userspace, init apertures once
msm:
hdmi fixes and cleanup
i915:
misc fixes
There is also a link ordering fix that I've asked to be cc'ed to you,
putting iommu before gpu, it fixes an issue with amdkfd when things
are all in the kernel, but I didn't like sending it via my tree
without discussion.
I'll probably be a bit on/off for a few weeks with pulls now, due to
holidays and LCA, so don't be surprised if stuff gets a bit backed up,
and things end up a bit large due to lag"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (28 commits)
Revert "drm/gem: Warn on illegal use of the dumb buffer interface v2"
agp: Fix up email address & attributions in AGP MODULE_AUTHOR tags
nouveau: bring back legacy mmap handler
drm/msm/hdmi: rework HDMI IRQ handler
drm/msm/hdmi: enable regulators before clocks to avoid warnings
drm/msm/mdp5: update irqs on crtc<->encoder link change
drm/msm: block incoming update on pending updates
drm/atomic: fix potential null ptr on plane enable
drm/msm: Deletion of unnecessary checks before the function call "release_firmware"
drm/msm: Deletion of unnecessary checks before two function calls
drm/tegra: dc: Select root window for event dispatch
drm/tegra: gem: Use the proper size for GEM objects
drm/tegra: gem: Flush buffer objects upon allocation
drm/tegra: dc: Fix a potential race on page-flip completion
drm/tegra: dc: Consistently use the same pipe
drm/irq: Add drm_crtc_vblank_count()
drm/irq: Add drm_crtc_handle_vblank()
drm/irq: Add drm_crtc_send_vblank_event()
drm/i915: Disable PSMI sleep messages on all rings around context switches
drm/i915: Force the CS stall for invalidate flushes
...
Merge tag 'for-linus-2' of git://git.code.sf.net/p/openipmi/linux-ipmi
Pull ipmi driver bugfixes from Corey Minyard:
"Fix two bugs:
One that lockdep turned up, I didn't go far enough with cleanup of
attributes for IPMI. This has been there a long time; my previous fix
of this didn't fix all the attributes.
One fix for some arches that need an explicit linux/ctype.h for
isspace()"
* tag 'for-linus-2' of git://git.code.sf.net/p/openipmi/linux-ipmi:
ipmi: Fix compile issue with isspace()
ipmi: Finish cleanup of BMC attributes
This reverts commit 355a701838.
This had some bad side effects under normal operation, and should
have been dropped earlier.
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Display MEC fw version in topology. Without this, the HSA userspace
stack is broken.
- Init apertures information only once per process
* tag 'amdkfd-fixes-2014-12-23' of git://people.freedesktop.org/~gabbayo/linux:
amdkfd: init aperture once per process
amdkfd: Display MEC fw version in topology node
drm/radeon: Add implementation of get_fw_version
drm/amd: Add get_fw_version to kfd-->kgd interface
Pull audit fixes from Paul Moore:
"Four patches to fix various problems with the audit subsystem, all are
fairly small and straightforward.
One patch fixes a problem where we weren't using the correct gfp
allocation flags (GFP_KERNEL regardless of context, oops), one patch
fixes a problem with old userspace tools (this was broken for a
while), one patch fixes a problem where we weren't recording pathnames
correctly, and one fixes a problem with PID based filters.
In general I don't think there is anything controversial with this
patchset, and it fixes some rather unfortunate bugs; the allocation
flag one can be particularly scary looking for users"
* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
audit: restore AUDIT_LOGINUID unset ABI
audit: correctly record file names with different path name types
audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb
audit: don't attempt to lookup PIDs when changing PID filtering audit rules
A regression was caused by commit 780a7654cee8:
audit: Make testing for a valid loginuid explicit.
(which in turn attempted to fix a regression caused by e1760bd)
When audit_krule_to_data() fills in the rules to get a listing, there was a
missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.
This broke userspace by not returning the same information that was sent and
expected.
The rule:
auditctl -a exit,never -F auid=-1
gives:
auditctl -l
LIST_RULES: exit,never f24=0 syscall=all
when it should give:
LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all
Tag it so that it is reported the same way it was set. Create a new
private flags audit_krule field (pflags) to store it that won't interact with
the public one from the API.
Cc: stable@vger.kernel.org # v3.10-rc1+
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
This patch adds a pgd_page definition in order to keep supporting the
HAVE_GENERIC_RCU_GUP configuration. In addition, it changes the pud_page
expression to align with pmd_page for readability.
Introducing pgd_page resolves the following build breakage under the
4KB page + 4-level memory management combination.
mm/gup.c: In function 'gup_huge_pgd':
mm/gup.c:889:2: error: implicit declaration of function 'pgd_page' [-Werror=implicit-function-declaration]
head = pgd_page(orig);
^
mm/gup.c:889:7: warning: assignment makes pointer from integer without a cast
head = pgd_page(orig);
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
[catalin.marinas@arm.com: remove duplicate pmd_page definition]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The usual defconfig tweaks, this time:
- FHANDLE and AUTOFS4_FS to keep systemd happy
- PID_NS, QUOTA and KEYS to keep LTP happy
- Disable DEBUG_PREEMPT, as this *really* hurts performance
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On arm64 the TTBR0_EL1 register is set either to the reserved TTBR0
page tables on boot or to the active_mm mappings belonging to user space
processes; it must never be set to the swapper_pg_dir page table mappings.
When a CPU is booted its active_mm is set to init_mm even though its
TTBR0_EL1 points at the reserved TTBR0 page mappings. This implies
that when __cpu_suspend is triggered the active_mm can point at
init_mm even if the current TTBR0_EL1 register contains the reserved
TTBR0_EL1 mappings.
Therefore, the mm save and restore executed in __cpu_suspend might
turn out to be erroneous in that, if the current->active_mm corresponds
to init_mm, on resume from low power it ends up restoring in the
TTBR0_EL1 the init_mm mappings that are global and can cause speculation
of TLB entries which end up being propagated to user space.
This patch fixes the issue by checking the active_mm pointer before
restoring the TTBR0 mappings. If the current active_mm == &init_mm,
the code sets the TTBR0_EL1 to the reserved TTBR0 mapping instead of
switching back to the active_mm, which is the expected behaviour
corresponding to the TTBR0_EL1 settings when __cpu_suspend was entered.
Fixes: 95322526ef ("arm64: kernel: cpu_{suspend/resume} implementation")
Cc: <stable@vger.kernel.org> # 3.14+: 18ab7db
Cc: <stable@vger.kernel.org> # 3.14+: 714f599
Cc: <stable@vger.kernel.org> # 3.14+: c3684fb
Cc: <stable@vger.kernel.org> # 3.14+
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- Remove soon-to-be-dead @redhat address.
- Jeff Hartmann wrote the bulk of the original backend code, and should
at least get a mention in the MODULE_AUTHOR for backend.o
- Various people at Intel have done a lot more work than myself on the
intel-* drivers, so again, mention that.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Merge tag 'dm-3.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"Three stable fixes and one fix for a regression introduced during 3.19
merge:
- Fix inability to discard used space when the thin-pool target is in
out-of-data-space mode and also transition the thin-pool back to
write mode once free space is made available.
- Fix DM core bio-based end_io bug that prevented proper
post-processing of the error code returned from the block layer.
- Fix crash in DM thin-pool due to thin device being added to the
pool's active_thins list before properly initializing the thin
device's refcount"
* tag 'dm-3.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix missed error code if .end_io isn't implemented by target_type
dm thin: fix crash by initializing thin device's refcount and completion earlier
dm thin: fix missing out-of-data-space to write mode transition if blocks are released
dm thin: fix inability to discard blocks when in out-of-data-space mode
This reverts commit c8475d144a.
There are several bug reports[1][2] that point to this commit as a
potential cause[3].
Let's revert it until we figure out what's going on.
[1] https://lkml.org/lkml/2014/11/14/342
[2] https://lkml.org/lkml/2014/12/22/213
[3] https://lkml.org/lkml/2014/12/9/741
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'drm/tegra/for-3.19-rc1-fixes' of git://people.freedesktop.org/~tagr/linux into drm-fixes
drm/tegra: Fixes for v3.19-rc1
This is a set of fixes for two regressions and one bug in the IOMMU
mapping code. It turns out that all of these issues turn up primarily
on Tegra30 hardware. The IOMMU mapping bug only manifests on buffers
that aren't multiples of the page size. I happened to be testing HDMI
with 1080p while writing the code and framebuffers for that happen to
fit exactly within 2025 pages of 4 KiB each.
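The page count can be checked with a little arithmetic. The sketch below
assumes 4 bytes per pixel, which the message does not state; the helper
names pages_needed() and tail_slack() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES ((size_t)4096)

/* Round a byte count up to whole 4 KiB pages. */
static size_t pages_needed(size_t bytes)
{
	assert(bytes <= SIZE_MAX - (PAGE_BYTES - 1)); /* avoid wrap-around */
	return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Bytes left unused in the last page; 0 means a page-multiple buffer. */
static size_t tail_slack(size_t bytes)
{
	return pages_needed(bytes) * PAGE_BYTES - bytes;
}
```

Under that assumption a 1920x1080 framebuffer is 8294400 bytes, exactly
2025 pages with no slack, whereas a 1366x768 one would need 1025 pages
with 2048 bytes of the last page unused, which is the kind of buffer on
which the mapping bug would show up.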
One of the regressions is caused by the IOMMU code allocating pages from
shmem which can have associated cache lines. If the pages aren't flushed
then these cache lines may be flushed later on and cause framebuffer
corruption. I'm not sure why I didn't see this before. Perhaps the board
that I was using had enough RAM so that the pages shmem would hand out
had a better chance of being unused. Or maybe I didn't look too closely.
The fix for this is to fake up an SG table so that it can be passed to
the DMA API. Ideally this would use drm_clflush_*(), but implementing
that for ARM causes DRM to fail to build as a module since some of the
low-level cache maintenance functions aren't exported. Hopefully we can
get a suitable API exported on ARM for the next release.
The second regression is caused by a mismatch between the hardware pipe
number and the CRTC's DRM index. These were used inconsistently, which
could cause one code location to call drm_vblank_get() with a different
pipe than the corresponding drm_vblank_put(), thereby causing the
reference count to become unbalanced. Alexandre also reported a possible
race condition related to this, which this series also fixes.
* tag 'drm/tegra/for-3.19-rc1-fixes' of git://people.freedesktop.org/~tagr/linux:
drm/tegra: dc: Select root window for event dispatch
drm/tegra: gem: Use the proper size for GEM objects
drm/tegra: gem: Flush buffer objects upon allocation
drm/tegra: dc: Fix a potential race on page-flip completion
drm/tegra: dc: Consistently use the same pipe
drm/irq: Add drm_crtc_vblank_count()
drm/irq: Add drm_crtc_handle_vblank()
drm/irq: Add drm_crtc_send_vblank_event()
misc i915 fixes.
* tag 'drm-intel-next-fixes-2014-12-17' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Disable PSMI sleep messages on all rings around context switches
drm/i915: Force the CS stall for invalidate flushes
drm/i915: Invalidate media caches on gen7
drm/i915: sanitize RPS resetting during GPU reset
drm/i915: move RPS PM_IER enabling to gen6_enable_rps_interrupts
drm/i915: vlv: fix IRQ masking when uninstalling interrupts
Yeah, a pull for one patch is a bit overkill, but I started to assemble the
various patches for 3.20 in a branch for atomic props/ioctl and didn't
realize that this bugfix here at the beginning of the branch should be in
3.19 (because msm is using the helpers already). So if you'd merge we'd
have it twice, or I need to shuffle branches again. Can do if you want.
* tag 'topic/atomic-fixes-2014-12-17' of git://anongit.freedesktop.org/drm-intel:
drm/atomic: fix potential null ptr on plane enable
A few msm fixes for 3.19:
* hdmi regulators fix
* hdmi fix for spurious HPD interrupts
* fix for sync atomic update after async update (which could show
up with a setcrtc following a pageflip)
* couple little Coccinelle cleanups
* 'msm-fixes-3.19' of git://people.freedesktop.org/~robclark/linux:
drm/msm/hdmi: rework HDMI IRQ handler
drm/msm/hdmi: enable regulators before clocks to avoid warnings
drm/msm/mdp5: update irqs on crtc<->encoder link change
drm/msm: block incoming update on pending updates
drm/msm: Deletion of unnecessary checks before the function call "release_firmware"
drm/msm: Deletion of unnecessary checks before two function calls