Граф коммитов

574736 Коммитов

Автор SHA1 Сообщение Дата
Mike Snitzer eca7ee6dc0 dm: distinquish old .request_fn (dm-old) vs dm-mq request-based DM
Rename various methods to have either a "dm_old" or "dm_mq" prefix.
Improve code comments to assist with understanding the duality of code
that handles both "dm_old" and "dm_mq" cases.

It is no much easier to quickly look at the code and _know_ that a given
method is either 1) "dm_old" only 2) "dm_mq" only 3) common to both.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 22:34:33 -05:00
Mike Snitzer c5248f79f3 dm: remove support for stacking dm-mq on .request_fn device(s)
Remove all fiddley code that propped up this support for a blk-mq
request-queue ontop of all .request_fn devices.

Testing has proven this niche request-based dm-mq mode to be buggy, when
testing fault tolerance with DM multipath, and there is no point trying
to preserve it.

Should help improve efficiency of pure dm-mq code and make code
maintenance less delicate.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 22:33:46 -05:00
Mike Snitzer 818c5f3bef dm: fix a couple locking issues with use of block interfaces
old_stop_queue() was checking blk_queue_stopped() without holding the
q->queue_lock.

dm_requeue_original_request() needed to check blk_queue_stopped(), with
q->queue_lock held, before calling blk_mq_kick_requeue_list().  And a
side-effect of that change is start_queue() must also call
blk_mq_kick_requeue_list().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 22:33:09 -05:00
Mike Snitzer 1c357a1e86 dm: allocate blk_mq_tag_set rather than embed in mapped_device
The blk_mq_tag_set is only needed for dm-mq support.  There is point
wasting space in 'struct mapped_device' for non-dm-mq devices.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> # check kzalloc return
2016-02-22 12:07:14 -05:00
Mike Snitzer faad87df4b dm: add 'dm_mq_nr_hw_queues' and 'dm_mq_queue_depth' module params
Allow user to change these values via module params or sysfs.

'dm_mq_nr_hw_queues' defaults to 1 (max 32).

'dm_mq_queue_depth' defaults to 2048 (up from 64, which proved far too
small under moderate sized workloads -- the dm-multipath device would
continuously block waiting for tags (requests) to become available).
The maximum is BLK_MQ_MAX_DEPTH (currently 10240).

Keep in mind the total number of pre-allocated requests per
request-based dm-mq device is 'dm_mq_nr_hw_queues' * 'dm_mq_queue_depth'
(currently 2048).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 12:07:10 -05:00
Mike Snitzer c91852ff08 dm: optimize dm_request_fn()
DM multipath is the only request-based DM target -- which only supports
tables with a single target that is immutable.  Leverage this fact in
dm_request_fn().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 11:06:22 -05:00
Mike Snitzer 16f122661d dm: optimize dm_mq_queue_rq()
DM multipath is the only dm-mq target.  But that aside, request-based DM
only supports tables with a single target that is immutable.  Leverage
this fact in dm_mq_queue_rq() by using the 'immutable_target' stored in
the mapped_device when the table was made active.  This saves the need
to even take the read-side of the SRCU via dm_{get,put}_live_table.

If the active DM table does not have an immutable target (e.g. "error"
target was swapped in) then fallback to the slow-path where the target
is looked up from the live table.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 11:06:22 -05:00
Mike Snitzer f083b09b78 dm: set DM_TARGET_WILDCARD feature on "error" target
The DM_TARGET_WILDCARD feature indicates that the "error" target may
replace any target; even immutable targets.  This feature will be useful
to preserve the ability to replace the "multipath" target even once it
is formally converted over to having the DM_TARGET_IMMUTABLE feature.

Also, implicit in the DM_TARGET_WILDCARD feature flag being set is that
.map, .map_rq, .clone_and_map_rq and .release_clone_rq are all defined
in the target_type.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 11:06:21 -05:00
Mike Snitzer e522c03905 dm: cleanup dm_any_congested()
The request-based DM support for checking queue congestion doesn't
require access to the live DM table.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 11:06:20 -05:00
Mike Snitzer ae6ad75e5c dm: remove unused dm_get_rq_mapinfo()
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-22 11:06:20 -05:00
Mike Snitzer 6acfe68bac dm: fix excessive dm-mq context switching
Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
than if an underlying null_blk device were used directly.  One of the
reasons for this drop in performance is that blk_insert_clone_request()
was calling blk_mq_insert_request() with @async=true.  This forced the
use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues
which ushered in ping-ponging between process context (fio in this case)
and kblockd's kworker to submit the cloned request.  The ftrace
function_graph tracer showed:

  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...
  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...

Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to
_not_ use kblockd to submit the cloned requests isn't enough to
eliminate the observed context switches.

In addition to this dm-mq specific blk-core fix, there are 2 DM core
fixes to dm-mq that (when paired with the blk-core fix) completely
eliminate the observed context switching:

1)  don't blk_mq_run_hw_queues in blk-mq request completion

    Motivated by desire to reduce overhead of dm-mq, punting to kblockd
    just increases context switches.

    In my testing against a really fast null_blk device there was no benefit
    to running blk_mq_run_hw_queues() on completion (and no other blk-mq
    driver does this).  So hopefully this change doesn't induce the need for
    yet another revert like commit 621739b00e !

2)  use blk_mq_complete_request() in dm_complete_request()

    blk_complete_request() doesn't offer the traditional q->mq_ops vs
    .request_fn branching pattern that other historic block interfaces
    do (e.g. blk_get_request).  Using blk_mq_complete_request() for
    blk-mq requests is important for performance.  It should be noted
    that, like blk_complete_request(), blk_mq_complete_request() doesn't
    natively handle partial completions -- but the request-based
    DM-multipath target does provide the required partial completion
    support by dm.c:end_clone_bio() triggering requeueing of the request
    via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE.

dm-mq fix #2 is _much_ more important than #1 for eliminating the
context switches.
Before: cpu          : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475
After:  cpu          : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472

With these changes multithreaded async read IOPs improved from ~950K
to ~1350K for this dm-mq stacked on null_blk test-case.  The raw read
IOPs of the underlying null_blk device for the same workload is ~1950K.

Fixes: 7fb4898e0 ("block: add blk-mq support to blk_insert_cloned_request()")
Fixes: bfebd1cdb ("dm: add full blk-mq support to request-based DM")
Cc: stable@vger.kernel.org # 4.1+
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
2016-02-22 11:04:40 -05:00
Mike Snitzer 956a402580 dm: fix sparse "unexpected unlock" warnings in ioctl code
Rename dm_get_live_table_for_ioctl to dm_grab_bdev_for_ioctl and have it
do the dm_{get,put}_live_table() rather than split those operations.

The dm_grab_bdev_for_ioctl() callers only care about the block_device
associated with a singleton DM device so there isn't any need to retain
a reference to the live DM table.  It is sufficient to:
1) dm_get_live_table()
2) bdgrab() the bdev associated with the singleton table's target
3) dm_put_live_table()
4) bdput() the bdev

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-21 20:27:51 -05:00
Mike Snitzer 664820265d dm: do not return target from dm_get_live_table_for_ioctl()
None of the callers actually used the returned target.
Also, just reuse bdev pointer passed to dm_blk_ioctl().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-21 20:27:51 -05:00
Mike Snitzer 4328daa2e7 dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths
Using request-based DM mpath configured with the following stacking
(.request_fn DM mpath ontop of scsi-mq paths):

echo Y > /sys/module/scsi_mod/parameters/use_blk_mq
echo N > /sys/module/dm_mod/parameters/use_blk_mq

'struct dm_rq_target_io' would leak if a request is requeued before a
blk-mq clone is allocated (or fails to allocate).  free_rq_tio()
wasn't being called.

kmemleak reported:

unreferenced object 0xffff8800b90b98c0 (size 112):
  comm "kworker/7:1H", pid 5692, jiffies 4295056109 (age 78.589s)
  hex dump (first 32 bytes):
    00 d0 5c 2c 03 88 ff ff 40 00 bf 01 00 c9 ff ff  ..\,....@.......
    e0 d9 b1 34 00 88 ff ff 00 00 00 00 00 00 00 00  ...4............
  backtrace:
    [<ffffffff81672b6e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811dbb63>] kmem_cache_alloc+0xc3/0x1e0
    [<ffffffff8117eae5>] mempool_alloc_slab+0x15/0x20
    [<ffffffff8117ec1e>] mempool_alloc+0x6e/0x170
    [<ffffffffa00029ac>] dm_old_prep_fn+0x3c/0x180 [dm_mod]
    [<ffffffff812fbd78>] blk_peek_request+0x168/0x290
    [<ffffffffa0003e62>] dm_request_fn+0xb2/0x1b0 [dm_mod]
    [<ffffffff812f66e3>] __blk_run_queue+0x33/0x40
    [<ffffffff812f9585>] blk_delay_work+0x25/0x40
    [<ffffffff81096fff>] process_one_work+0x14f/0x3d0
    [<ffffffff81097715>] worker_thread+0x125/0x4b0
    [<ffffffff8109ce88>] kthread+0xd8/0xf0
    [<ffffffff8167cb8f>] ret_from_fork+0x3f/0x70
    [<ffffffffffffffff>] 0xffffffffffffffff

crash> struct -o dm_rq_target_io
struct dm_rq_target_io {
    ...
}
SIZE: 112

Fixes: e5863d9ad7 ("dm: allocate requests in target when stacking on blk-mq devices")
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-21 20:27:50 -05:00
Linus Torvalds 81f70ba233 Linux 4.5-rc5 2016-02-20 13:39:35 -08:00
Linus Torvalds 0389075ecf Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "This is unusually large, partly due to the EFI fixes that prevent
  accidental deletion of EFI variables through efivarfs that may brick
  machines.  These fixes are somewhat involved to maintain compatibility
  with existing install methods and other usage modes, while trying to
  turn off the 'rm -rf' bricking vector.

  Other fixes are for large page ioremap()s and for non-temporal
  user-memcpy()s"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix vmalloc_fault() to handle large pages properly
  hpet: Drop stale URLs
  x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()
  x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable
  lib/ucs2_string: Correct ucs2 -> utf8 conversion
  efi: Add pstore variables to the deletion whitelist
  efi: Make efivarfs entries immutable by default
  efi: Make our variable validation list include the guid
  efi: Do variable name validation tests in utf8
  efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
  lib/ucs2_string: Add ucs2 -> utf8 helper functions
2016-02-20 09:32:40 -08:00
Linus Torvalds 06b74c658c Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "A handful of CPU hotplug related fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Plug potential memory leak in CPU_UP_PREPARE
  perf/core: Remove the bogus and dangerous CPU_DOWN_FAILED hotplug state
  perf/core: Remove bogus UP_CANCELED hotplug state
  perf/x86/amd/uncore: Plug reference leak
2016-02-20 09:30:42 -08:00
Linus Torvalds e6a1c1e9dd powerpc fixes for 4.5 #2
- Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
  - Fix dedotify for binutils >= 2.26 from Andreas Schwab
  - Don't trace hcalls on offline CPUs from Denis Kirjanov
  - eeh: Fix stale cached primary bus from Gavin Shan
  - eeh: Fix stale PE primary bus from Gavin Shan
  - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
  - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWx8l5AAoJEFHr6jzI4aWAeY0P/AomeQCRieoBMKJi36WX4+gU
 Cm1iBgM593VEM/KFsYtedm5+4QaCmPE+1tVm4/u0wbLeEQ8TqNLSZLniB9USE0hb
 9655gGQyFE95BZa8WfbqHOI7+BK+TkUOWGY0CfyqPVrknzSN2MCDHjUaNo1wge6l
 zmIYIkKhaQAinFSFovOdjQ63rYdk6CxsfgbP1Gl2aX0cmzWW1n07AvZLqNmLFJ+4
 L3uBXPcrEKY/nfkRi+FutoTb86ggt9J9dqCfJHHfWKn60qwhpKwiva84k3jI/BOu
 yBTFeNzlobXt0ceDSWx1ITXzKmJQokWGC5+Lo+0mDb4veAbhLgHlXdx7NUcZIB6+
 YGYGSOkeKCnbnInIOGLz45LlevJFviI94y0YY4tt++Csq/IjnBhDeTkGx7zcqRLG
 v5hl7AhykHd3Me5iRuyRRVoVyk6+318OZW450Oxxj7EtkzpSeLfHCMKxk5w1Nxuk
 tenWQeApdkTVr91m5VJNFOsrmtFDLJv51C8duiUFWzc195ejSMYDR86K+qBeaebs
 39obXHVYRnCrn9TzODIR9SnEd7dHImekQ4a3G3F54mLJ4qqUN089TDqBGY2GNuT8
 j8QVBttp3SWuZSvtvDJLxoFt2QTKxcuiMQ4FX/OAS4qWRjSR8v2WTCyBZt68l7er
 kpUnIelJSuIDVLdNuFlf
 =7Yzi
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
 - Fix dedotify for binutils >= 2.26 from Andreas Schwab
 - Don't trace hcalls on offline CPUs from Denis Kirjanov
 - eeh: Fix stale cached primary bus from Gavin Shan
 - eeh: Fix stale PE primary bus from Gavin Shan
 - mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
 - ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy

* tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/ioda: Set "read" permission when "write" is set
  powerpc/mm: Fix Multi hit ERAT cause by recent THP update
  powerpc/powernv: Fix stale PE primary bus
  powerpc/eeh: Fix stale cached primary bus
  powerpc/pseries: Don't trace hcalls on offline CPUs
  powerpc: Fix dedotify for binutils >= 2.26
  powerpc/book3s_32: Fix build error with checkpoint restart
2016-02-20 09:22:11 -08:00
Linus Torvalds da6b7366db dmaengine fixes for 4.5-rc5
Few fixes on drivers nothing major here.
 Fixes are: iotdma fix to restart channels, new ID for wildcat
 PCH, residue fix for edma, disable irq for non-cyclic in dw.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWyBF/AAoJEHwUBw8lI4NH1ckP/2Ht39D0EFm/PnEkUZCzCixv
 zYKvsL8gniKEUWSlPjUGQQ0yY/z6WRatihHEMl/6bmXIvR1l6NJCWOMI/HRZQAeR
 izpkNpaSo8uyl3NrQ8IMAuJzqgibn2QuyqjAHsgfyCHrDLVS+/LK4H5PMOvyboJA
 BhP22GkbOPCDoW7R8ddqQYXQidwLjezBuNrDncox2wXZ7WoQa7xaxSeFt2RiCMKr
 nTCHKaX27Tsn3CMyiFODLcIP/soNvWrA6DQkPqluqAmBt5b7GS1eiqF8JvBLQa4/
 RUkZuMN5gKnbaNzlYvY2+cOe2BRpWihvfEwm5Fu7ZBgNrzOwrBvUjmG76SCVlZ4O
 Azp2SF8kjwwDJOzjlKAJFHyihL9nNZySRyZ1N89zInPBegal13P7Ai9e5aAOLs/e
 mH5WfbiCAhXcaX8vQ5Cdu1sHj0T749Q4h+TIGL9T0CJZf9kVcy/XvD+I4SplzmBC
 zOgAyAda3EEiZGHiifxPim9f95O9OcHQanKHCc5gflcN8tPKot8qJBhMOP2lwl99
 i4IOHK+2KDq1maF/dwAMH/7vX0VfHYyXs75VQtSA6MLguaqflBbB0oDhPnHt+amY
 VSdZoYtt/mgzptcEbGIQG8m2w5AhhJMOJFmyXJsUXqLCY682nifeEj0/rXpuqm60
 9gfhkM6gIALKcYiOUfbA
 =Qs4V
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.5-rc5' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "A few fixes for drivers, nothing major here.

  Fixes are: iotdma fix to restart channels, new ID for wildcat PCH,
  residue fix for edma, disable irq for non-cyclic in dw"

* tag 'dmaengine-fix-4.5-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer
  dmaengine: edma: fix residue race for cyclic
  dmaengine: dw: pci: add ID for WildcatPoint PCH
  dmaengine: IOATDMA: fix timer code that continues to restart channels during idle
2016-02-20 09:19:56 -08:00
Linus Torvalds 37aa4dac99 An assortment of vendor specific clk drivers fixes, most notably
fallout from adding Tegra210 and rockchip rk3036/rk3368 drivers
 this cycle. There's also the random smattering of sparse/checker
 fixes, a build "fix" to get the Tango clk driver to compile
 because the Kconfig symbol was renamed after the fact, and a clk
 gpio fix for a patch mismerge.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJWxohFAAoJENidgRMleOc9BdMP/A+OSpgI9wx0nQTFya9DOMLn
 cDJvsD3qagMqyy1KkEgd5Zx1H2vBdABn9Hdtm+nGccisV1yR3zxxy5t2TbxB/Dcl
 /bZKAmhGViArTV83VUjkzQuAJ0cvH4He85ODG8UkQlAl3PiSLjO2NkZVeZsgeszx
 mG0VHLVy+AzzPLVFPrT6Xa6iuk1IQFgy5QfkowjaWynawxAf4In6XBYmN6nPABL3
 17WNzpfsl9jLw3PfybX0qKvM7tKmNT9+IQ7SDV2nObkL/dwHci5jiWuhE2dxkYrO
 S6g73WklMKd9ah4qvq3d7Jcxk6ee82uvdlBiSJmpcQQnWPKDjUnJhPJxNo+yWR0U
 2XtQcbreNlyygg5NKDwqKaTZEypx7QnzDy7tEj30GvMVb0fRFMpnmEN4BjdJiyWT
 MwAG5e3WsoqkNFNAvMhpbOMG/grUp8XoyN1IILhmaerMAhhSc1+as2oUTI5ezPWF
 CiY/dyvVpzda+Z4/LglYbC550qWK7sUNFcVEMMJUHrNPAHeFifZlJzglMAD/5Tz4
 hdwMEMeRXy3VgspwrRrmxuETSWuhsdxuzWYx1ZAyB0vYzrAWK+a5msVodyRttNIY
 K/RyaMqUacWQLZ01dItW/RVbXzOynptqmzpPsWF4ASZG7GAGsp/5gwYVN+lDr0G4
 GrfRRcvcrKAwiF69AM8H
 =I6OK
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk driver fixes from Stephen Boyd:
 "An assortment of vendor specific clk drivers fixes, most notably
  fallout from adding Tegra210 and rockchip rk3036/rk3368 drivers this
  cycle.

  There's also the random smattering of sparse/checker fixes, a build
  "fix" to get the Tango clk driver to compile because the Kconfig
  symbol was renamed after the fact, and a clk gpio fix for a patch
  mismerge"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (28 commits)
  clk: gpio: Really allow an optional clock= DT property
  Revert "clk: qcom: Specify LE device endianness"
  clk: versatile: mask VCO bits before writing
  clk: tegra: super: Fix sparse warnings for functions not declared as static
  clk: tegra: Fix sparse warnings for functions not declared as static
  clk: tegra: Fix sparse warning for pll_m
  clk: tegra: Use definition for pll_u override bit
  clk: tegra: Fix warning caused by pll_u failing to lock
  clk: tegra: Fix clock sources for Tegra210 EMC
  clk: tegra: Add the APB2APE audio clock on Tegra210
  clk: tegra: Add missing of_node_put()
  clk: tegra: Fix PLLE SS coefficients
  clk: tegra: Fix typos around clearing PLLE bits during enable
  clk: tegra: Do not disable PLLE when under hardware control
  clk: tegra: Fix pllx dyn step calculation
  clk: tegra: pll: Fix potential sleeping-while-atomic
  clk: tegra: Fix the misnaming of nvenc from msenc
  clk: tegra: Fix naming of MISC registers
  clk: tango4: rename ARCH_TANGOX to ARCH_TANGO
  clk: scpi: Fix checking return value of platform_device_register_simple()
  ...
2016-02-20 09:16:51 -08:00
Linus Torvalds a703f42d6c Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull more drm fixes from Dave Airlie:
 "Some more fixes trickled in:

  A bunch of VC4 ones since it's a pretty new driver not much chance of
  regressions, and it fixes GPU resets.

  Also one atomic fix, one set of fixes for a common bug in TTM cleanup,
  and one i915 hotplug fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: use post-decrement in error handling
  drm/atomic: Allow for holes in connector state, v2.
  drm/i915: Fix hpd live status bits for g4x
  drm/vc4: Use runtime PM to power cycle the device when the GPU hangs.
  drm/vc4: Enable runtime PM.
  drm/vc4: Fix spurious GPU resets due to BO reuse.
  drm/vc4: Drop error message on seqno wait timeouts.
  drm/vc4: Fix -ERESTARTSYS error return from BO waits.
  drm/vc4: Return an ERR_PTR from BO creation instead of NULL.
  drm/vc4: Fix the clear color for the first tile rendered.
  drm/vc4: Validate that WAIT_BO padding is cleared.
  drm/radeon: use post-decrement in error handling
  drm/amdgpu: use post-decrement in error handling
2016-02-20 09:13:18 -08:00
Simon Guinot 59ceeaaf35 kernel/resource.c: fix muxed resource handling in __request_region()
In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released.  A pointer on the conflicting resource is kept.  At wake-up
this pointer is used as a parent to retry to request the region.

A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed).  Another
problem is that the next call to __request_region() fails to detect a
remaining conflict.  The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself.  It is likely
to succeed anyway, even if there is still a conflict.

Instead, the parent of the conflicting resource should be passed to
__request_region().

As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.

Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-20 08:57:52 -08:00
Linus Torvalds 020ecbba05 Miscellaneous ext4 bug fixes for v4.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJWx2ZiAAoJEPL5WVaVDYGjrbcH/2EqCUDmW+FqVqR7PkpQsNiV
 WxBTNkxnVXf1Jin5beIUN/Ehq0GSuqcSujMwdbFUa0i7YJNVEe++hTw28JmFILYV
 5nZtTYmYIq7dZb/tnc3tj0SsDpgEE1h31VyWAu4W2q4wSQMDc8AqGM90VktgrerJ
 H9k/WDDL6KC8uXagBsQC0d5xaQglJNZC+S6pSBbMegBAFNJqAL5N78oWAoEFN3OH
 LN3B3eccxBx98rGWx8DBiugY8ZDRHB4Cre+fXu8wmAuMb/+Y7Mwj4RzI+fz5Vpiw
 vMS5RqZ7PvCaMhdyUWt9bI8j10bBXcaxOHL2UQND5A1zundJ1ZNOY/ZPvHVUS4s=
 =AXFu
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bugfixes from Ted Ts'o:
 "Miscellaneous ext4 bug fixes for v4.5"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix crashes in dioread_nolock mode
  ext4: fix bh->b_state corruption
  ext4: fix memleak in ext4_readdir()
  ext4: remove unused parameter "newblock" in convert_initialized_extent()
  ext4: don't read blocks from disk after extents being swapped
  ext4: fix potential integer overflow
  ext4: add a line break for proc mb_groups display
  ext4: ioctl: fix erroneous return value
  ext4: fix scheduling in atomic on group checksum failure
  ext4 crypto: move context consistency check to ext4_file_open()
  ext4 crypto: revalidate dentry after adding or removing the key
2016-02-19 13:44:12 -08:00
Linus Torvalds ce6b71432d Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fix from Chris Mason:
 "My for-linus-4.5 branch has a btrfs DIO error passing fix.

  I know how much you love DIO, so I'm going to suggest against reading
  it.  We'll follow up with a patch to drop the error arg from
  dio_end_io in the next merge window."

* 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix direct IO requests not reporting IO error to user space
2016-02-19 13:40:42 -08:00
Linus Torvalds 87d9ac712b Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "10 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: slab: free kmem_cache_node after destroy sysfs file
  ipc/shm: handle removed segments gracefully in shm_mmap()
  MAINTAINERS: update Kselftest Framework mailing list
  devm_memremap_release(): fix memremap'd addr handling
  mm/hugetlb.c: fix incorrect proc nr_hugepages value
  mm, x86: fix pte_page() crash in gup_pte_range()
  fsnotify: turn fsnotify reaper thread into a workqueue job
  Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
  mm: fix regression in remap_file_pages() emulation
  thp, dax: do not try to withdraw pgtable from non-anon VMA
2016-02-19 13:36:00 -08:00
Linus Torvalds 23300f6575 arm64 fixes:
- Allow EFI stub to use strnlen(), which is required by recent libfdt
 
 - Avoid smp_processor_id() in preempt context during unwinding
 
 - Avoid false Kasan warnings during unwinding
 
 - Ensure early devices are picked up by the IOMMU DMA ops
 
 - Avoid rebuilding the kernel for the 'install' target
 
 - Run fixup handlers for alignment faults on userspace access
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJWxwt3AAoJELescNyEwWM09rIH/3ygrixUcnk/22vI+y32ALDL
 TpBih0pgNmFmls3QxTQaIYqsdjfHVCuzoLRcHGYsPgb42fIeLTgcx6Bp4xacUVGh
 +xjBdEjacUR92TiB/QeP3lNEYIuBhHEPE+H5hHccbdRa+xNB5rUx0Z6nTRokOM4u
 j25KiNf5wO2bOMwo6TNYT0N1Lggp+TZrIP2bIUkWm+RSorF3NGqLS0Rw3ZKwBXxm
 jtUA4ohKR3uyeRHki8Nw/M/AV+gMq+nELX1RGK4HMW00cqakKwIEFvANbdbxGMmg
 q7OIgluSK3BCTQPVQTiss+W6rEjg1z0dTyHGCPVwP16SGXH2i0ys0xQ0BZR5SMw=
 =/uso
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Here are some more arm64 fixes for 4.5.  This has mostly come from
  Yang Shi, who saw some issues under -rt that also affect mainline.
  The rest of it is pretty small, but still worth having.

  We've got an old issue outstanding with valid_user_regs which will
  likely wait until 4.6 (since it would really benefit from some time in
  -next) and another issue with kasan and idle which should be fixed
  next week.

  Apart from that, pretty quiet here (and still no sign of the THP issue
  reported on s390...)

  Summary:

   - Allow EFI stub to use strnlen(), which is required by recent libfdt

   - Avoid smp_processor_id() in preempt context during unwinding

   - Avoid false Kasan warnings during unwinding

   - Ensure early devices are picked up by the IOMMU DMA ops

   - Avoid rebuilding the kernel for the 'install' target

   - Run fixup handlers for alignment faults on userspace access"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: allow the kernel to handle alignment faults on user accesses
  arm64: kbuild: make "make install" not depend on vmlinux
  arm64: dma-mapping: fix handling of devices registered before arch_initcall
  arm64/efi: Make strnlen() available to the EFI namespace
  arm/arm64: crypto: assure that ECB modes don't require an IV
  arm64: make irq_stack_ptr more robust
  arm64: debug: re-enable irqs before sending breakpoint SIGTRAP
  arm64: disable kasan when accessing frame->fp in unwind_frame
2016-02-19 08:40:05 -08:00
Linus Torvalds ff5f16820f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Several bug fixes:

   - There are four different stack tracers, and three of them have
     bugs.  For 4.5 the bugs are fixed and we prepare a cleanup patch
     for the next merge window.

   - Three bug fixes for the dasd driver in regard to parallel access
     volumes and the new max_dev_sectors block device queue limit

   - The irq restore optimization needs a fixup for memcpy_real

   - The diagnose trace code has a conflict with lockdep"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dasd: fix performance drop
  s390/maccess: reduce stnsm instructions
  s390/diag: avoid lockdep recursion
  s390/dasd: fix refcount for PAV reassignment
  s390/dasd: prevent incorrect length error under z/VM after PAV changes
  s390: fix DAT off memory access, e.g. on kdump
  s390/oprofile: fix address range for asynchronous stack
  s390/perf_event: fix address range for asynchronous stack
  s390/stacktrace: add save_stack_trace_regs()
  s390/stacktrace: save full stack traces
  s390/stacktrace: add missing end marker
  s390/stacktrace: fix address ranges for asynchronous and panic stack
  s390/stacktrace: fix save_stack_trace_tsk() for current task
2016-02-19 08:33:12 -08:00
Linus Torvalds 409ee136f2 Pin control fixes for the v4.5 series, all are individual
driver fixes:
 - Fix the PXA2xx driver to export its init function so we
   do not break modular compiles.
 - Hide unused functions in the Nomadik driver.
 - Fix up direction control in the Mediatek driver.
 - Toggle the sunxi GPIO lines to input when you read them
   on the H3 GPIO controller, lest you only get garbage.
 - Fix up the number of settings in the MVEBU driver.
 - Fix a serious SMP race condition in the Samsung driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWxvyrAAoJEEEQszewGV1zgL4QAMGRaOXb2JIxz39gc37h92Cv
 Cop92w8SU54OCAsA0UhfNv/y2oUZ5Hui++N5M3pTFlVZbtOD/POzcObehmi61DYJ
 IkJEcA4df6prDLTtmIALXwSYoUapgc+kjl3RyNAg0pRPj5k/QxrqMmEhbHt3ZOwo
 ZehWdl/4Kap4gzHYcflYICf71u0HGE0QbjEUkr8oB+lhCE3zFlfYSVelu5Ysg/CF
 tRugNsLhn4rke3e1QTK2leOWREatgqngJUEMhSrxyulA7E+0tQPRt9dhHaQD47Zw
 NxDDcpXevgRGRo9//VLM33tjMe4vZlWdjGTJ2Bro3rGYyJBDwdiiDumKCR9b8Ba6
 qEddYoSjFV2IJWy3ngLuGXy7+t14LGHN6+kWU9XMD2V0idPcKipCcaUKl92x4v1s
 at8+uStzeLPyb9NZ2v3PxE7IvAwSG85xtSZ53yJgoudxJHMBU0xFO42G4D3mCzIm
 xd2ERHhIsuUIS9+hOC++lcxXfdMVogcGpA3NyW9TCiX5NMs0IG957iWADL2z73Yu
 uoX4GUEBjQwsSyWIiic90mOw/OByKBIQmx/8Kj2kIotINzRHrwg+bBfplc+gQumX
 CKREs3QHAfBrvfSTd115rJX5UuKFiCX6eG2z7Ardori3qkRVOYpWh1dkGSZg0ZKO
 ft/f8ZD6qZ8+0F5xb3/m
 =Qs3i
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull Pin control fixes from Linus Walleij:
 "Pin control fixes for the v4.5 series, all are individual driver
  fixes:

   - Fix the PXA2xx driver to export its init function so we do not
     break modular compiles.
   - Hide unused functions in the Nomadik driver.
   - Fix up direction control in the Mediatek driver.
   - Toggle the sunxi GPIO lines to input when you read them on the H3
     GPIO controller, lest you only get garbage.
   - Fix up the number of settings in the MVEBU driver.
   - Fix a serious SMP race condition in the Samsung driver"

* tag 'pinctrl-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: samsung: fix SMP race condition
  pinctrl: mvebu: fix num_settings in mpp group assignment
  pinctrl: sunxi: H3 requires irq_read_needs_mux
  pinctrl: mediatek: fix direction control issue
  pinctrl: nomadik: hide unused functions
  pinctrl: pxa: export pxa2xx_pinctrl_init()
2016-02-19 08:25:40 -08:00
Linus Torvalds 9001b8e4f0 sound fixes for 4.5-rc5
this update contains again a few more fixes for ALSA core stuff
 although it's no longer high flux: two race fixes in sequencer and one
 PCM race fix for non-atomic PCM ops.  In addition, HD-audio gained a
 similar fix for race at reloading the driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWxvBBAAoJEGwxgFQ9KSmk2RoQALqPcE1WomF9KoCTRDQ0+V2V
 CDQFBDZsrlBPNH0p5tiVOWdzk8tsKh/asF5JO83g5RmIBd3MP06MJYdMnbEa7xVE
 77WykgGT3onmiY7fka4ufNysm0LGdPNjaPvdOmzrrU0z3+JdrffuWPT2nnd6znDF
 aZH3sx/Hy+Hl7nrfhG+xh08OAQJNK9ro6ZrEzQFAdhZMUoWTAlI0cV6fTxdnMLVf
 DV9lYT4TMfCijHAu+ujPoP5TihgQ2215RvQa4GCERX2vmoXK7yPfDB6Kbkh7uQhl
 HW8blEKHKfrEOElmg/VKUKaa1W7Of+n3m1eFrBgPiDfUEzpBM4U0Bp3dr/097rnk
 v2arIBQQcu/V3yInID8BxQAH9Rq4P+wS8UREBXws5AEe0OSyMsYheVmkPOJoBSST
 uudqKSf876qqC2+ze7R90rUF91UNiWJNyraynUPPgXO5IXMpUbt6H8oucICU+vys
 eaIl/sl0n0fB3XUcMLXKGySgXxeKUGKqWPNga5v2YLdyFLhtd3uHJFuoq7vy8as2
 vzjq/sd87R2VUCxN907UKdhQkvXuhIXs5I6ugc7Wwv+RHv9dhLNs+zB8yOnA+iFT
 T6tMX0M/5lgJ+s8DH9mLUxZWb8cUjLWU0llVedBuicFAilnujIkw4Gc+3RKcf3h0
 Hj/sHiss0F9q0XhPI5gI
 =u/fj
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This update contains again a few more fixes for ALSA core stuff
  although it's no longer high flux: two race fixes in sequencer and one
  PCM race fix for non-atomic PCM ops.

  In addition, HD-audio gained a similar fix for race at reloading the
  driver"

* tag 'sound-4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: pcm: Fix rwsem deadlock for non-atomic PCM stream
  ALSA: seq: Fix double port list deletion
  ALSA: hda - Cancel probe work instead of flush at remove
  ALSA: seq: Fix leak of pool buffer at concurrent writes
2016-02-19 08:01:41 -08:00
EunTaik Lee 52d7523d84 arm64: mm: allow the kernel to handle alignment faults on user accesses
Although we don't expect to take alignment faults on access to normal
memory, misbehaving (i.e. buggy) user code can pass MMIO pointers into
system calls, leading to things like get_user accessing device memory.

Rather than OOPS the kernel, allow any exception fixups to run and
return something like -EFAULT back to userspace. This makes the
behaviour more consistent with userspace, even though applications with
access to device mappings can easily cause other issues if they try
hard enough.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Eun Taik Lee <eun.taik.lee@samsung.com>
[will: dropped __kprobes annotation and rewrote commit mesage]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-19 12:20:37 +00:00
Masahiro Yamada 8684fa3e7a arm64: kbuild: make "make install" not depend on vmlinux
For the same reason as commit 19514fc665 ("arm, kbuild: make "make
install" not depend on vmlinux"), the install targets should never
trigger the rebuild of the kernel.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-19 10:33:35 +00:00
Jan Kara 74dae42785 ext4: fix crashes in dioread_nolock mode
Competing overwrite DIO in dioread_nolock mode will just overwrite
pointer to io_end in the inode. This may result in data corruption or
extent conversion happening from IO completion interrupt because we
don't properly set buffer_defer_completion() when unlocked DIO races
with locked DIO to unwritten extent.

Since unlocked DIO doesn't need io_end for anything, just avoid
allocating it and corrupting pointer from inode for locked DIO.
A cleaner fix would be to avoid these games with io_end pointer from the
inode but that requires more intrusive changes so we leave that for
later.

Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-19 00:33:21 -05:00
Jan Kara ed8ad83808 ext4: fix bh->b_state corruption
ext4 can update bh->b_state non-atomically in _ext4_get_block() and
ext4_da_get_block_prep(). Usually this is fine since bh is just a
temporary storage for mapping information on stack but in some cases it
can be fully living bh attached to a page. In such case non-atomic
update of bh->b_state can race with an atomic update which then gets
lost. Usually when we are mapping bh and thus updating bh->b_state
non-atomically, nobody else touches the bh and so things work out fine
but there is one case to especially worry about: ext4_finish_bio() uses
BH_Uptodate_Lock on the first bh in the page to synchronize handling of
PageWriteback state. So when blocksize < pagesize, we can be atomically
modifying bh->b_state of a buffer that actually isn't under IO and thus
can race e.g. with delalloc trying to map that buffer. The result is
that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the
corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock.

Fix the problem by always updating bh->b_state bits atomically.

CC: stable@vger.kernel.org
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-02-19 00:18:25 -05:00
Rasmus Villemoes 4fbbed46dc drm/nouveau: use post-decrement in error handling
We need to use post-decrement to get the dma_map_page undone also for
i==0, and to avoid some very unpleasant behaviour if dma_map_page
failed already at i==0.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-19 13:36:05 +10:00
Maarten Lankhorst 5fff80bbdb drm/atomic: Allow for holes in connector state, v2.
Because we record connector_mask using 1 << drm_connector_index now
the connector_mask should stay the same even when other connectors
are removed. This was not the case with MST, in that case when removing
a connector all other connectors may change their index.

This is fixed by waiting until the first get_connector_state to allocate
connector_state, and force reallocation when state is too small.

As a side effect connector arrays no longer have to be preallocated,
and can be allocated on first use which means a less allocations in
the page flip only path.

Changes since v1:
- Whitespace. (Ville)
- Call ida_remove when destroying the connector. (Ville)
- u32 alloc -> int. (Ville)

Fixes: 14de6c44d1 ("drm/atomic: Remove drm_atomic_connectors_for_crtc.")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-19 13:24:03 +10:00
Stephen Boyd 4462b4bbfc clk: gpio: Really allow an optional clock= DT property
We mis-merged the original patch from Russell here and so the
patch went almost all the way, except that we still failed to
probe when there wasn't a clocks property in the DT node. Allow
that case by making a negative value from
of_clk_get_parent_count() into "no parents", like the original
patch did.

Fixes: 7ed88aa2ef ("clk: fix clk-gpio.c with optional clock= DT property")
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Michael Turquette <mturquette@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-02-18 19:10:22 -08:00
Dave Airlie 5441ea115e This pull request fixes GPU reset (which was disabled shortly after
V3D integration due to build breakage) and waits for idle in the
 presence of signals (which X likes to do a lot).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJWxNTWAAoJELXWKTbR/J7oJl4P/AouaCmV5G3P0x+s/qiWY7ZY
 ppSFTyNI5sw25m4Ag/5hIqMkSbgj72o7lacpNk2Jt3qudOVDNzLeujjabHHvA6pK
 FOXq1IadYdidsnhgWr7GB1aB9tSsSzLEfNw/2JKilzGQUlkebQuGK1cLFZSowcRy
 dYo4RaFJJ4ucFAIf2rkdhemfPRLfPSB2P6DUZURMprFO1xj/m7eOfiVgcLDPUtBS
 WvuDEFL+7YVP92tlxwlPQ7Gmz5DgnSoIn9aS8j0YDlRTPZ9bXsMz/VEjIKK90Ea4
 qV+UE+DPkay8632fKdDaLzajyCqGoi7RW0XQFxSdfHGrJm2C8miMlg39ht7ylpVv
 djwatw/BO/wDh9NWUftkq0ByUA8OqGelTq+jlR5EwEkNN3WoGjsh7+VPd+4Fdkoc
 tH+YQOMX/2DE6v3y1hePqzGSeAn+HmggYWOqsevmZ0Q8xKLM2xEdIZ6USTZMFRXH
 SkUvE9miJqMuroRCtB8k/9QSbLzCGla1liC01XvOTU2AqoJe1WBHJOB3Oj1HcFhK
 oipWy28O9A4YByobQvG6SdadfbGfaViWWTidcchxj/USiPqqBTyi/CPQMKhpOpA5
 PcQgXLGPeolwjUYobhGJmTfyPjILkprYNFXuFmoYkLtqHsRxhezfIQvylXLD1GrO
 UPfH0YAYypTi1w+O8t2N
 =fLU7
 -----END PGP SIGNATURE-----

Merge tag 'drm-vc4-fixes-2016-02-17' of github.com:anholt/linux into drm-fixes

This pull request fixes GPU reset (which was disabled shortly after
V3D integration due to build breakage) and waits for idle in the
presence of signals (which X likes to do a lot).

* tag 'drm-vc4-fixes-2016-02-17' of github.com:anholt/linux:
  drm/vc4: Use runtime PM to power cycle the device when the GPU hangs.
  drm/vc4: Enable runtime PM.
  drm/vc4: Fix spurious GPU resets due to BO reuse.
  drm/vc4: Drop error message on seqno wait timeouts.
  drm/vc4: Fix -ERESTARTSYS error return from BO waits.
  drm/vc4: Return an ERR_PTR from BO creation instead of NULL.
  drm/vc4: Fix the clear color for the first tile rendered.
  drm/vc4: Validate that WAIT_BO padding is cleared.
2016-02-19 12:50:00 +10:00
Dave Airlie aaa7dd2ced Merge branch 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Just two small fixes in the ttm_tt_populate error handling; one for radeon,
one for amdgpu.

* 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: use post-decrement in error handling
  drm/amdgpu: use post-decrement in error handling
2016-02-19 12:49:03 +10:00
Dave Airlie 42412b120d Merge tag 'drm-intel-fixes-2016-02-18' of git://anongit.freedesktop.org/drm-intel into drm-fixes
single g4x hpd fix.

* tag 'drm-intel-fixes-2016-02-18' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Fix hpd live status bits for g4x
2016-02-19 12:43:03 +10:00
Linus Torvalds 705d43dbe1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching fixes from Jiri Kosina:

 - regression (from 4.4) fix for ordering issue, introduced by an
   earlier ftrace change, that broke live patching of modules.

   The fix replaces the ftrace module notifier by direct call in order
   to make the ordering guaranteed and well-defined.  The patch, from
   Jessica Yu, has been acked both by Steven and Rusty

 - error message fix from Miroslav Benes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  ftrace/module: remove ftrace module notifier
  livepatch: change the error message in asm/livepatch.h header files
2016-02-18 16:34:15 -08:00
Linus Torvalds dd8fc10e60 SCSI fixes on 20160218
Two simple fixes.  One prevents a soft lockup on some target removal
 scenarios and the other prevents us trying to probe the marvell
 console device, which causes it to time out and need the bus
 resetting.
 
 Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJWxf9xAAoJEDeqqVYsXL0M9noIAL7F5d/zvbJPHS0gHErd+TX8
 vkdMbFmrXkrARy1ZChCv7Z/3NcdLPzMp/erB7Ed9uc9SrcMEVGYNw3zJUicJCrmN
 1WEvd+4iFhCpqjYtmKwxOTcuvZAgobJA8Is4Lnrx5KmY0YKUoFOpFPZBbWuY22mR
 QNZwxOL1O6uM8PihKsCJCzHJVB1RiicHL24DmFK4sOU85c7zStwuvKNv/3RPy+0c
 FtDz4Gc6NmmSrsC/DZHcf5q+ybSe4VSoqeKj5eSvuhxhpPpVku2sEgtxHBhm9cZ+
 nPMWQIlXxR4vSJ6oOq+IpezorlIF3NlJVKPtwg6CyNI3sOgFxUg3EN2YKRlp+KQ=
 =cX3U
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two simple fixes.

  One prevents a soft lockup on some target removal scenarios and the
  other prevents us trying to probe the marvell console device, which
  causes it to time out and need the bus resetting"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: fix soft lockup in scsi_remove_target() on module removal
  SCSI: Add Marvell configuration device to VPD blacklist
2016-02-18 16:24:48 -08:00
Dmitry Safonov 52b4b950b5 mm: slab: free kmem_cache_node after destroy sysfs file
When slub_debug alloc_calls_show is enabled we will try to track
location and user of slab object on each online node, kmem_cache_node
structure and cpu_cache/cpu_slub shouldn't be freed till there is the
last reference to sysfs file.

This fixes the following panic:

   BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
   IP:  list_locations+0x169/0x4e0
   PGD 257304067 PUD 438456067 PMD 0
   Oops: 0000 [#1] SMP
   CPU: 3 PID: 973074 Comm: cat ve: 0 Not tainted 3.10.0-229.7.2.ovz.9.30-00007-japdoll-dirty #2 9.30
   Hardware name: DEPO Computers To Be Filled By O.E.M./H67DE3, BIOS L1.60c 07/14/2011
   task: ffff88042a5dc5b0 ti: ffff88037f8d8000 task.ti: ffff88037f8d8000
   RIP: list_locations+0x169/0x4e0
   Call Trace:
     alloc_calls_show+0x1d/0x30
     slab_attr_show+0x1b/0x30
     sysfs_read_file+0x9a/0x1a0
     vfs_read+0x9c/0x170
     SyS_read+0x58/0xb0
     system_call_fastpath+0x16/0x1b
   Code: 5e 07 12 00 b9 00 04 00 00 3d 00 04 00 00 0f 4f c1 3d 00 04 00 00 89 45 b0 0f 84 c3 00 00 00 48 63 45 b0 49 8b 9c c4 f8 00 00 00 <48> 8b 43 20 48 85 c0 74 b6 48 89 df e8 46 37 44 00 48 8b 53 10
   CR2: 0000000000000020

Separated __kmem_cache_release from __kmem_cache_shutdown which now
called on slab_kmem_cache_release (after the last reference to sysfs
file object has dropped).

Reintroduced locking in free_partial as sysfs file might access cache's
partial list after shutdowning - partial revert of the commit
69cb8e6b7c ("slub: free slabs without holding locks").  Zap
__remove_partial and use remove_partial (w/o underscores) as
free_partial now takes list_lock which s partial revert for commit
1e4dd9461f ("slub: do not assert not having lock in removing freed
partial")

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Suggested-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Kirill A. Shutemov 1ac0b6dec6 ipc/shm: handle removed segments gracefully in shm_mmap()
remap_file_pages(2) emulation can reach file which represents removed
IPC ID as long as a memory segment is mapped.  It breaks expectations of
IPC subsystem.

Test case (rewritten to be more human readable, originally autogenerated
by syzkaller[1]):

	#define _GNU_SOURCE
	#include <stdlib.h>
	#include <sys/ipc.h>
	#include <sys/mman.h>
	#include <sys/shm.h>

	#define PAGE_SIZE 4096

	int main()
	{
		int id;
		void *p;

		id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
		p = shmat(id, NULL, 0);
		shmctl(id, IPC_RMID, NULL);
		remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

	        return 0;
	}

The patch changes shm_mmap() and code around shm_lock() to propagate
locking error back to caller of shm_mmap().

[1] http://github.com/google/syzkaller

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Shuah Khan 64f0085001 MAINTAINERS: update Kselftest Framework mailing list
Kselftest Framework now has a dedicated mailing list linux-kselftest.
Update the entry in MAINTAINERS file.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Toshi Kani 9273a8bbf5 devm_memremap_release(): fix memremap'd addr handling
The pmem driver calls devm_memremap() to map a persistent memory range.
When the pmem driver is unloaded, this memremap'd range is not released
so the kernel will leak a vma.

Fix devm_memremap_release() to handle a given memremap'd address
properly.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Vaishali Thakkar f8b74815a4 mm/hugetlb.c: fix incorrect proc nr_hugepages value
Currently incorrect default hugepage pool size is reported by proc
nr_hugepages when number of pages for the default huge page size is
specified twice.

When multiple huge page sizes are supported, /proc/sys/vm/nr_hugepages
indicates the current number of pre-allocated huge pages of the default
size.  Basically /proc/sys/vm/nr_hugepages displays default_hstate->
max_huge_pages and after boot time pre-allocation, max_huge_pages should
equal the number of pre-allocated pages (nr_hugepages).

Test case:

Note that this is specific to x86 architecture.

Boot the kernel with command line option 'default_hugepagesz=1G
hugepages=X hugepagesz=2M hugepages=Y hugepagesz=1G hugepages=Z'.  After
boot, 'cat /proc/sys/vm/nr_hugepages' and 'sysctl -a | grep hugepages'
returns the value X.  However, dmesg output shows that Z huge pages were
pre-allocated.

So, the root cause of the problem here is that the global variable
default_hstate_max_huge_pages is set if a default huge page size is
specified (directly or indirectly) on the command line.  After the command
line processing in hugetlb_init, if default_hstate_max_huge_pages is set,
the value is assigned to default_hstae.max_huge_pages.  However,
default_hstate.max_huge_pages may have already been set based on the
number of pre-allocated huge pages of default_hstate size.

The solution to this problem is if hstate->max_huge_pages is already set
then it should not set as a result of global max_huge_pages value.
Basically if the value of the variable hugepages is set multiple times on
a command line for a specific supported hugepagesize then proc layer
should consider the last specified value.

Signed-off-by: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Hugh Dickins 457a98b080 mm, x86: fix pte_page() crash in gup_pte_range()
Commit 3565fce3a6 ("mm, x86: get_user_pages() for dax mappings") has
moved up the pte_page(pte) in x86's fast gup_pte_range(), for no
discernible reason: put it back where it belongs, after the pte_flags
check and the pfn_valid cross-check.

That may be the cause of the NULL pointer dereference in
gup_pte_range(), seen when vfio called vaddr_get_pfn() when starting a
qemu-kvm based VM.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Michael Long <Harn-Solo@gmx.de>
Tested-by: Michael Long <Harn-Solo@gmx.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Jeff Layton 0918f1c309 fsnotify: turn fsnotify reaper thread into a workqueue job
We don't require a dedicated thread for fsnotify cleanup.  Switch it
over to a workqueue job instead that runs on the system_unbound_wq.

In the interest of not thrashing the queued job too often when there are
a lot of marks being removed, we delay the reaper job slightly when
queueing it, to allow several to gather on the list.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Tested-by: Eryu Guan <guaneryu@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Jeff Layton 13d34ac6e5 Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
This reverts commit c510eff6be ("fsnotify: destroy marks with
call_srcu instead of dedicated thread").

Eryu reported that he was seeing some OOM kills kick in when running a
testcase that adds and removes inotify marks on a file in a tight loop.

The above commit changed the code to use call_srcu to clean up the
marks.  While that does (in principle) work, the srcu callback job is
limited to cleaning up entries in small batches and only once per jiffy.
It's easily possible to overwhelm that machinery with too many call_srcu
callbacks, and Eryu's reproduer did just that.

There's also another potential problem with using call_srcu here.  While
you can obviously sleep while holding the srcu_read_lock, the callbacks
run under local_bh_disable, so you can't sleep there.

It's possible when putting the last reference to the fsnotify_mark that
we'll end up putting a chain of references including the fsnotify_group,
uid, and associated keys.  While I don't see any obvious ways that that
could occurs, it's probably still best to avoid using call_srcu here
after all.

This patch reverts the above patch.  A later patch will take a different
approach to eliminated the dedicated thread here.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reported-by: Eryu Guan <guaneryu@gmail.com>
Tested-by: Eryu Guan <guaneryu@gmail.com>
Cc: Jan Kara <jack@suse.com>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00
Kirill A. Shutemov 48f7df3294 mm: fix regression in remap_file_pages() emulation
Grazvydas Ignotas has reported a regression in remap_file_pages()
emulation.

Testcase:
	#define _GNU_SOURCE
	#include <assert.h>
	#include <stdlib.h>
	#include <stdio.h>
	#include <sys/mman.h>

	#define SIZE    (4096 * 3)

	int main(int argc, char **argv)
	{
		unsigned long *p;
		long i;

		p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
				MAP_SHARED | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return -1;
		}

		for (i = 0; i < SIZE / 4096; i++)
			p[i * 4096 / sizeof(*p)] = i;

		if (remap_file_pages(p, 4096, 0, 1, 0)) {
			perror("remap_file_pages");
			return -1;
		}

		if (remap_file_pages(p, 4096 * 2, 0, 1, 0)) {
			perror("remap_file_pages");
			return -1;
		}

		assert(p[0] == 1);

		munmap(p, SIZE);

		return 0;
	}

The second remap_file_pages() fails with -EINVAL.

The reason is that remap_file_pages() emulation assumes that the target
vma covers whole area we want to over map.  That assumption is broken by
first remap_file_pages() call: it split the area into two vma.

The solution is to check next adjacent vmas, if they map the same file
with the same flags.

Fixes: c8d78c1823 ("mm: replace remap_file_pages() syscall with emulation")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-18 16:23:24 -08:00