WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
蔡正龙	a9302e8439	alpha: Enable system-call auditing support. Signed-off-by: Zhenglong.cai <zhenglong.cai@cs2c.com.cn> Signed-off-by: Matt Turner <mattst88@gmail.com>	2014-01-31 09:21:55 -08:00
Linus Torvalds	aafd9d6a46	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer/dynticks updates from Ingo Molnar: "This tree contains misc dynticks updates: a fix and three cleanups" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/nohz: Fix overflow error in scheduler_tick_max_deferment() nohz_full: fix code style issue of tick_nohz_full_stop_tick nohz: Get timekeeping max deferment outside jiffies_lock tick: Rename tick_check_idle() to tick_irq_enter()	2014-01-31 09:02:51 -08:00
Linus Torvalds	595bf999e3	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A crash fix and documentation updates" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Make sched_class::get_rr_interval() optional sched/deadline: Add sched_dl documentation sched: Fix docbook parameter annotation error in wait.h	2014-01-31 09:00:44 -08:00
Linus Torvalds	ab5318788c	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core debug changes from Ingo Molnar: "This contains mostly kernel debugging related updates: - make hung_task detection more configurable to distros - add final bits for x86 UV NMI debugging, with related KGDB changes - update the mailing-list of MAINTAINERS entries I'm involved with" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hung_task: Display every hung task warning sysctl: Add neg_one as a standard constraint x86/uv/nmi, kgdb/kdb: Fix UV NMI handler when KDB not configured x86/uv/nmi: Fix Sparse warnings kgdb/kdb: Fix no KDB config problem MAINTAINERS: Restore "L: linux-kernel@vger.kernel.org" entries	2014-01-31 08:59:46 -08:00
Stephen Warren	75fae117a5	ALSA: hda/hdmi - allow PIN_OUT to be dynamically enabled Commit `384a48d715` "ALSA: hda: HDMI: Support codecs with fewer cvts than pins" dynamically enabled each pin widget's PIN_OUT only when the pin was actively in use. This was required on certain NVIDIA CODECs for correct operation. Specifically, if multiple pin widgets each had their mux input select the same audio converter widget and each pin widget had PIN_OUT enabled, then only one of the pin widgets would actually receive the audio, and often not the one the user wanted! However, this apparently broke some Intel systems, and commit `6169b67361` "ALSA: hda - Always turn on pins for HDMI/DP" reverted the dynamic setting of PIN_OUT. This in turn broke the afore-mentioned NVIDIA CODECs. This change supports either dynamic or static handling of PIN_OUT, selected by a flag set up during CODEC initialization. This flag is enabled for all recent NVIDIA GPUs. Reported-by: Uosis <uosisl@gmail.com> Cc: <stable@vger.kernel.org> # v3.13 Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2014-01-31 17:57:02 +01:00
Linus Torvalds	14164b46fc	Bug-fixes: - Xen ARM couldn't use the new FIFO events - Xen ARM couldn't use the SWIOTLB if compiled as 32-bit with 64-bit PCIe devices. - Grant table were doing needless M2P operations. - Ratchet down the self-balloon code so it won't OOM. - Fix misplaced kfree in Xen PVH error code paths. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJS68IQAAoJEFjIrFwIi8fJAWgH/j4HStEey3rgGcqwIWSHkkap +t55wsrT8Ylq6CzZjaUtCo3pB7HotW526x/0rA2pxVqHn/8oCN/1EtdrNtYm/umX qOoda+db5NIjAEGVLWSLqGyokJQDrX/brXIWfYR300e9fnJi7yT/rFC4QHoZVUYl 5LME8XH/jE012vvYelNu6DbbodlRmVCT8hctJS+eB5ER2WmtD9Pkw4GybEXPVYJz hE0Ts1DN91nKP2FGJb+mfB9UFT5X8i00akAK+Qc1R3sRnRh6eRoNV8dgyCnudKpO UPEdiAZvgij+mzlgIYSz6nKH0U/VbvRsG3lc3i5Si3o+vR3CYPCkvzOGX2d0rjw= =7cxW -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.14-rc0-late-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: "Bug-fixes for the new features that were added during this cycle. There are also two fixes for long-standing issues for which we have a solution: grant-table operations extra work that was not needed causing performance issues and the self balloon code was too aggressive causing OOMs. Details: - Xen ARM couldn't use the new FIFO events - Xen ARM couldn't use the SWIOTLB if compiled as 32-bit with 64-bit PCIe devices. - Grant table were doing needless M2P operations. - Ratchet down the self-balloon code so it won't OOM. - Fix misplaced kfree in Xen PVH error code paths" * tag 'stable/for-linus-3.14-rc0-late-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: Fix misplaced kfree from xlated_setup_gnttab_pages drivers: xen: deaggressive selfballoon driver xen/grant-table: Avoid m2p_override during mapping xen/gnttab: Use phys_addr_t to describe the grant frame base address xen: swiotlb: handle sizeof(dma_addr_t) != sizeof(phys_addr_t) arm/xen: Initialize event channels earlier	2014-01-31 08:38:18 -08:00
Linus Torvalds	e2a0f813e0	Second batch of KVM updates. Some minor x86 fixes, two s390 guest features that need some handling in the host, and all the PPC changes. The PPC changes include support for little-endian guests and enablement for new POWER8 features. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJS6UF5AAoJEBvWZb6bTYby55kP/AgTJnyu7avN653/2aSHkjkx KgYSMYhZPIFoY5LyZuNetXaoXFRvCykux1VYSZ6V6s35h2PZ+hdJNbHGjFYKPGTq FQ92xQVNuWCAPxmFCjDNuDV/0BauG5y08/Orh/jpjz+GAfH43LruUQGbtXUuyJ8u vf+yTHniU5gguqsAmodqjHUgbf+GoPJ1j7hmRoWwt8IWm7Ns3v/IK4l0p6G0h26a RjE6aK+Tm208Yr5hD/dRAqeTbBNt3c4xub+QPsKoiEMaZBSuAOiux7D3Kx+If1gp WsmqEQxoymihVtkZhUFO9ONLJepvmG2QwJVVyMSUW9iqxX9rraXsvVyVMwcQAhog JuOAYxBftH07xu6Fs4eym5KvCFghM+EaJvxxt+kgnvdD4htK1+eK5trntc2zygSi /qGiIrkqjXpkskW8kujLayF0eAU3CrZvFWveEPBfFgYiOGX/2wzJCtSm/bt9Jo0M v60qgNFK3LNqAyeEfnm9VtlwGr6ZgsAB6DHNPX4fM5s2IBjL+qloXk/e/+aVKkW0 I3yeRdy/ExhLAab6w81JtMeR7G3YS0UNuAEVvcoxzNb5wIBY8qnpfUzTKyMxQR94 64EVpxWEYO1s55eCCyMujWrSvc+YAwhJcWHGKgC4K7mxxLD3FVyQXX6YZvgRozMX HjQju+DToj9CskyrFlRL =yd0Z -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more KVM updates from Paolo Bonzini: "Second batch of KVM updates. Some minor x86 fixes, two s390 guest features that need some handling in the host, and all the PPC changes. The PPC changes include support for little-endian guests and enablement for new POWER8 features" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (45 commits) x86, kvm: correctly access the KVM_CPUID_FEATURES leaf at 0x40000101 x86, kvm: cache the base of the KVM cpuid leaves kvm: x86: move KVM_CAP_HYPERV_TIME outside #ifdef KVM: PPC: Book3S PR: Cope with doorbell interrupts KVM: PPC: Book3S HV: Add software abort codes for transactional memory KVM: PPC: Book3S HV: Add new state for transactional memory powerpc/Kconfig: Make TM select VSX and VMX KVM: PPC: Book3S HV: Basic little-endian guest support KVM: PPC: Book3S HV: Add support for DABRX register on POWER7 KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8 KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8 KVM: PPC: Book3S HV: Add handler for HV facility unavailable KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8 KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers KVM: PPC: Book3S HV: Don't set DABR on POWER8 kvm/ppc: IRQ disabling cleanup ...	2014-01-31 08:37:32 -08:00
Linus Torvalds	e30b82bbe0	Minor bug fix for linux-3.14 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJS67pQAAoJEDaohF61QIxkfkcQAKAuNhBtaGkx8Dd9gPM98DrW F4NnitUNzcjHTSq23oRIe7rLanXTMHKMhq4S83YWISLTWCAmekOVF4xs8WTEm1Px 5/X3an65m3mvW/1V7oxixA+GLUk2LwPqPt7N+jWApkR9KB5LydQCucdbq4Vza9gd SkoK5RJe8bTE+43k962vsDJM7XCVeUOGOSXO7MyTaipRZ+fyuEPwwbQ0u7Bxl9JG HkfhCJUu3V79nruIgmP1Z2llN2XIv/fsCo3F42yBspSBUxYqnpnAdo6Zr5srnEQC kHm67VFk5sPnLPnm6AxaoonE9GX+TMzy5eUk6Lb5x5EwRlH694FFovlquFyr3RjN cmh6VWGv1jcYnEmRhKjT8Vs0GQ+Q5L2L6dEdDbck2WkVWPDdLb83wbPJRZ/g5kcF GAUI8EjjSR4ezuZqmv6tbWHpn/q7p6NekC8DVMmwtoBZgIuFmITyrqpFJUG4acnO yuGdZtYePnMHZlQMNTdzUjuOu/Wyqit5OrM/ztmrGlYCI8eiE17OInWikZzSFFcf +uDodCg5tokG8AlNgNZ+thcpjO7aAR9GCOD2INZYKlFRkJANpQ+OOK+AOafmWoqO UjmL0rH3FXRu0uhD8VLKRaUfuyMaBHJ/9+ndwmyOa+7x0BPv03hw3bGGhy6tsTHg XbxlExsTrtpKEXwznWqG =F/GX -----END PGP SIGNATURE----- Merge tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy Pull jfs fix from David Kleikamp: "Minor bug fix for linux-3.14" * tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy: jfs: fix xattr value size overflow in __jfs_setxattr	2014-01-31 08:14:35 -08:00
Sage Weil	77516dc92a	ceph: fix missing dput in ceph_set_acl Add matching dput() for d_find_alias(). Move d_find_alias() down a bit at Julia's suggestion. [ Introduced by commit 72466d0b92e0: "ceph: fix posix ACL hooks" ] Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-31 08:14:06 -08:00
Sachin Prabhu	a9a315d414	cifs: Fix check for regular file in couldbe_mf_symlink() MF Symlinks are regular files containing content in a specified format. The function couldbe_mf_symlink() checks the mode for a set S_IFREG bit as a test to confirm that it is a regular file. This bit is also set for other filetypes and simply checking for this bit being set may return false positives. We ensure that we are actually checking for a regular file by using the S_ISREG macro to test instead. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reported-by: Neil Brown <neilb@suse.de> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <smfrench@gmail.com>	2014-01-31 09:06:43 -06:00
Dave Jones	f93576e1ac	xen/pvh: Fix misplaced kfree from xlated_setup_gnttab_pages Passing a freed 'pages' to free_xenballooned_pages will end badly on kernels with slub debug enabled. This looks out of place between the rc assign and the check, but we do want to kfree pages regardless of which path we take. Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-31 09:48:58 -05:00
Bob Liu	bc1b0df59e	drivers: xen: deaggressive selfballoon driver Current xen-selfballoon driver is too aggressive which may cause OOM be triggered more often. Eg. this bug reported by James: https://lkml.org/lkml/2013/11/21/158 There are two mainly reasons: 1) The original goal_page didn't consider some pages used by kernel space, like slab pages and pages used by device drivers. 2) The balloon driver may not give back memory to guest OS fast enough when the workload suddenly aquries a lot of physical memory. In both cases, the guest OS will suffer from memory pressure and OOM may be triggered. The fix is make xen-selfballoon driver not that aggressive by adding extra 10% of total ram pages to goal_page. It's more valuable to keep the guest system reliable and response faster than balloon out these 10% pages to XEN. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-31 09:48:43 -05:00
Zoltan Kiss	08ece5bb23	xen/grant-table: Avoid m2p_override during mapping The grant mapping API does m2p_override unnecessarily: only gntdev needs it, for blkback and future netback patches it just cause a lock contention, as those pages never go to userspace. Therefore this series does the following: - the original functions were renamed to __gnttab_[un]map_refs, with a new parameter m2p_override - based on m2p_override either they follow the original behaviour, or just set the private flag and call set_phys_to_machine - gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with m2p_override false - a new function gnttab_[un]map_refs_userspace provides the old behaviour It also removes a stray space from page.h and change ret to 0 if XENFEAT_auto_translated_physmap, as that is the only possible return value there. v2: - move the storing of the old mfn in page->index to gnttab_map_refs - move the function header update to a separate patch v3: - a new approach to retain old behaviour where it needed - squash the patches into one v4: - move out the common bits from m2p* functions, and pass pfn/mfn as parameter - clear page->private before doing anything with the page, so m2p_find_override won't race with this v5: - change return value handling in __gnttab_[un]map_refs - remove a stray space in page.h - add detail why ret = 0 now at some places v6: - don't pass pfn to m2p* functions, just get it locally Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Suggested-by: David Vrabel <david.vrabel@citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-31 09:48:32 -05:00
Malahal Naineni	a1800acaf7	nfs: initialize the ACL support bits to zero. Avoid returning incorrect acl mask attributes when the server doesn't support ACLs. Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-01-31 08:28:16 -05:00
Masanari Iida	cb8ee1a3d4	mm: Fix warning on make htmldocs caused by slab.c This patch fixed following errors while make htmldocs Warning(/mm/slab.c:1956): No description found for parameter 'page' Warning(/mm/slab.c:1956): Excess function parameter 'slabp' description in 'slab_destroy' Incorrect function parameter "slabp" was set instead of "page" Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>	2014-01-31 13:52:25 +02:00
Dave Hansen	67b6c900dc	mm: slub: work around unneeded lockdep warning The slub code does some setup during early boot in early_kmem_cache_node_alloc() with some local data. There is no possible way that another CPU can see this data, so the slub code doesn't unnecessarily lock it. However, some new lockdep asserts check to make sure that add_partial() _always_ has the list_lock held. Just add the locking, even though it is technically unnecessary. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>	2014-01-31 13:41:26 +02:00
Dave Hansen	433a91ff5f	mm: sl[uo]b: fix misleading comments On x86, SLUB creates and handles <=8192-byte allocations internally. It passes larger ones up to the allocator. Saying "up to order 2" is, at best, ambiguous. Is that order-1? Or (order-2 bytes)? Make it more clear. SLOB commits a similar sin. It handles page-size requests, but the comment says that it passes up "all page size and larger requests". SLOB also swaps around the order of the very-similarly-named KMALLOC_SHIFT_HIGH and KMALLOC_SHIFT_MAX #defines. Make it consistent with the order of the other two allocators. Cc: Matt Mackall <mpm@selenic.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>	2014-01-31 13:40:34 +02:00
Steve Capper	c2c93e5b7f	arm64: mm: Introduce PTE_WRITE We have the following means for encoding writable or dirty ptes: PTE_DIRTY PTE_RDONLY !pte_dirty && !pte_write 0 1 !pte_dirty && pte_write 0 1 pte_dirty && !pte_write 1 1 pte_dirty && pte_write 1 0 So we can't distinguish between writable clean ptes and read only ptes. This can cause problems with ptes being incorrectly flagged as read only when they are writable but not dirty. This patch introduces a new software bit PTE_WRITE which allows us to correctly identify writable ptes. PTE_RDONLY is now only clear for valid ptes where a page is both writable and dirty. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-31 11:30:49 +00:00
Steve Capper	44b6dfc556	arm64: mm: Remove PTE_BIT_FUNC macro Expand out the pte manipulation functions. This makes our life easier when using things like tags and cscope. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-31 11:30:05 +00:00
Linus Torvalds	e7651b819e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "This is a pretty big pull, and most of these changes have been floating in btrfs-next for a long time. Filipe's properties work is a cool building block for inheriting attributes like compression down on a per inode basis. Jeff Mahoney kicked in code to export filesystem info into sysfs. Otherwise, lots of performance improvements, cleanups and bug fixes. Looks like there are still a few other small pending incrementals, but I wanted to get the bulk of this in first" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits) Btrfs: fix spin_unlock in check_ref_cleanup Btrfs: setup inode location during btrfs_init_inode_locked Btrfs: don't use ram_bytes for uncompressed inline items Btrfs: fix btrfs_search_slot_for_read backwards iteration Btrfs: do not export ulist functions Btrfs: rework ulist with list+rb_tree Btrfs: fix memory leaks on walking backrefs failure Btrfs: fix send file hole detection leading to data corruption Btrfs: add a reschedule point in btrfs_find_all_roots() Btrfs: make send's file extent item search more efficient Btrfs: fix to catch all errors when resolving indirect ref Btrfs: fix protection between walking backrefs and root deletion btrfs: fix warning while merging two adjacent extents Btrfs: fix infinite path build loops in incremental send btrfs: undo sysfs when open_ctree() fails Btrfs: fix snprintf usage by send's gen_unique_name btrfs: fix defrag 32-bit integer overflow btrfs: sysfs: list the NO_HOLES feature btrfs: sysfs: don't show reserved incompat feature btrfs: call permission checks earlier in ioctls and return EPERM ...	2014-01-30 20:08:20 -08:00
Linus Torvalds	060e8e3b6f	* Improve the NOR erasure quirk - now it tries to do as little writes as possible, because the eraseblock may be in an "unstable" state and write operation sometimes causes NOR chip lock-ups. * Both UBI and UBIFS changes are now maintainer in one single tree, because the amount of changes dropped significantly. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJS6kWPAAoJECmIfjd9wqK0U68QAJ4vljgaEGi4VvErH3q9PJ68 EUOg4f1rUxSB2uMjE9BahVC5bGRUsZtsViau3CdsF7LaJK+6h7oEtOOLcHeIqUbg OkeYDdX9D5WFP4RtwB56WDERh8Qbj9Nl8/LVnCr9iVZiy1QsHSPDb/XFd4zimOdH Cvtf8MuQZN44o0N0ooBhO5nshrQ/Y/gm4ufomDZzK1MTiq4SZMyDcH4UyWnSHtrY ZkCdPjNj18xVPjnxJF4RbxwUJkEybczribfMIBZt4dveeW0ERU/xmLdJb9MAx3mY SmZG2vFPjtwPykBvdLLVm43xfxuUG1eZ2PE1COwmkfUb/u1Ej0eviVRAISby0RAL VlRP7CcMu0GGRZhZ20yGO0YvIgciLHJj4HgRzaiFoxRpsbPoWIcBuRsvrpIDDILV qfBhA9njv5o3KkcdmZ9kxl42kbC3CxkTER/VnDhdAKDq9s2bxZNYI2829Xbbdnwj +BawsYvieH+7zhgchqdoX8nZ2Mc7z9TQwuUbAIK3SHvqC7K172SyC3QNPDezbrl2 gqOzf6wTkhf+PO2fbEnr9ERCxPS6nsoni1e4Na5eVmNy7ww9kkAuSLILMFlE2dsv h/doqZ7zQlrdw++dRzDWzqgDN1a+iBsXuZwjQ0Qqha+xb4j+LMkmcco/kibou6Xn TT94iHc+G7+k9U4ILeF5 =pRDO -----END PGP SIGNATURE----- Merge tag 'upstream-3.14-rc1' of git://git.infradead.org/linux-ubifs Pull ubifs updates from Artem Bityutskiy: - Improve the NOR erasure quirk - now it tries to do as little writes as possible, because the eraseblock may be in an "unstable" state and write operation sometimes causes NOR chip lock-ups. - Both UBI and UBIFS changes are now maintainer in one single tree, because the amount of changes dropped significantly. * tag 'upstream-3.14-rc1' of git://git.infradead.org/linux-ubifs: UBI: avoid program operation on NOR flash after erasure interrupted MAINTAINERS: keep UBI and UBIFS stuff in the same tree UBI: fix error return code	2014-01-30 20:04:09 -08:00
Linus Torvalds	271bf66d4c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull some further ceph acl cleanups from Sage Weil: "I do have a couple patches on top of what's in your tree, though, that clean up a couple duplicated lines in your fix and apply Christoph's cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: simplify ceph_{get,init}_acl ceph: remove duplicate declaration of ceph_setattr	2014-01-30 20:02:51 -08:00
Christoph Hellwig	7585823619	ceph: simplify ceph_{get,init}_acl - ->get_acl only gets called after we checked for a cached ACL, so no need to call get_cached_acl again. - no need to check IS_POSIXACL in ->get_acl, without that it should never get set as all the callers that set it already have the check. - you should be able to use the full posix_acl_create in CEPH Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Sage Weil <sage@inktank.com>	2014-01-30 19:26:17 -08:00
Linus Torvalds	aa2e7100e3	Merge branch 'akpm' (patches from Andrew Morton) Merge misc fixes from Andrew Morton: "A few hotfixes and various leftovers which were awaiting other merges. Mainly movement of zram into mm/" * emailed patches fron Andrew Morton <akpm@linux-foundation.org>: (25 commits) memcg: fix mutex not unlocked on memcg_create_kmem_cache fail path Documentation/filesystems/vfs.txt: update file_operations documentation mm, oom: base root bonus on current usage mm: don't lose the SOFT_DIRTY flag on mprotect mm/slub.c: fix page->_count corruption (again) mm/mempolicy.c: fix mempolicy printing in numa_maps zram: remove zram->lock in read path and change it with mutex zram: remove workqueue for freeing removed pending slot zram: introduce zram->tb_lock zram: use atomic operation for stat zram: remove unnecessary free zram: delay pending free request in read path zram: fix race between reset and flushing pending work zsmalloc: add maintainers zram: add zram maintainers zsmalloc: add copyright zram: add copyright zram: remove old private project comment zram: promote zram from staging zsmalloc: move it under mm ...	2014-01-30 18:44:44 -08:00
PaX Team	2def2ef2ae	x86, x32: Correct invalid use of user timespec in the kernel The x32 case for the recvmsg() timout handling is broken: asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user mmsg, unsigned int vlen, unsigned int flags, struct compat_timespec __user timeout) { int datagrams; struct timespec ktspec; if (flags & MSG_CMSG_COMPAT) return -EINVAL; if (COMPAT_USE_64BIT_TIME) return __sys_recvmmsg(fd, (struct mmsghdr __user )mmsg, vlen, flags \| MSG_CMSG_COMPAT, (struct timespec ) timeout); ... The timeout pointer parameter is provided by userland (hence the __user annotation) but for x32 syscalls it's simply cast to a kernel pointer and is passed to __sys_recvmmsg which will eventually directly dereference it for both reading and writing. Other callers to __sys_recvmmsg properly copy from userland to the kernel first. The bug was introduced by commit `ee4fa23c4b` ("compat: Use COMPAT_USE_64BIT_TIME in net/compat.c") and should affect all kernels since 3.4 (and perhaps vendor kernels if they backported x32 support along with this code). Note that CONFIG_X86_X32_ABI gets enabled at build time and only if CONFIG_X86_X32 is enabled and ld can build x32 executables. Other uses of COMPAT_USE_64BIT_TIME seem fine. This addresses CVE-2014-0038. Signed-off-by: PaX Team <pageexec@freemail.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.4+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 18:44:13 -08:00
Linus Torvalds	12f2bbd609	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asmlinkage (LTO) changes from Peter Anvin: "This patchset adds more infrastructure for link time optimization (LTO). This patchset was pulled into my tree late because of a miscommunication (part of the patchset was picked up by other maintainers). However, the patchset is strictly build-related and seems to be okay in testing" * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, asmlinkage, xen: Fix type of NMI x86, asmlinkage, xen, kvm: Make {xen,kvm}_lock_spinning global and visible x86: Use inline assembler instead of global register variable to get sp x86, asmlinkage, paravirt: Make paravirt thunks global x86, asmlinkage, paravirt: Don't rely on local assembler labels x86, asmlinkage, lguest: Fix C functions used by inline assembler	2014-01-30 18:15:32 -08:00
Linus Torvalds	10ffe3dbf7	Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build bits from Peter Anvin: "Various build-related minor bits. Most of this is work by David Woodhouse to be able to compile the early boot code with clang/llvm; we have also managed to push an actual -m16 option into gcc 4.9 so this makes us use that option if available instead of hacking it. The balance is a patch from Michael Davidson to the relocs program to help manual debugging. None of these should change the actual compiled binary with currently released compilers" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, build: Build 16-bit code with -m16 where possible x86, boot: Fix word-size assumptions in has_eflag() inline asm x86, boot: Use __attribute__((used)) to ensure videocard structs are emitted x86: Remove duplication of 16-bit CFLAGS x86, relocs: Add manual debug mode	2014-01-30 18:13:20 -08:00
Linus Torvalds	f8a504c404	ARM: SoC late changes for v3.14 These are changes that arrived a little late but were considered self-contained enough to still go in for v3.14. They are all device tree updtes this time around, and mainly for Broadcom SoCs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJS6u4lAAoJEFk3GJrT+8ZlbmkP/jvLB3+S7wyHfIqXeuQAL5s1 24jp495ynJe8aql70VGYIkS5+m7oB4Gx7cY0eAwsUClggI0EGuCuQWssrxqc0sNU zO5AiQp1HvVe5zNhNy7fSRH+XKrLCbIFsTwDGM4XQBVSJeskbOH2+xPernkxFdTC xCbCDoA8bLyrw+868T2AvQ3ArUAFdfnMIMtMDaFLY2D3ibJx0tG7PKm+uiMgd1n/ J+K+xGqq12+BbR2tDQWeKfKWeEPizlfFT07bhz01gdt5036bKTIicr2n3J0K+hq/ VxEdR3ZxYTW5sbfYqNk9JJ213PfQ+9PzCu7BsH+RlPLm7jzUohpMYB/mwXkroBnV DsXLu3514v1DNPzWQmLvjx4wM3BAMUrFwuWalsPWrffwQaIVQDp6/aTCHWma67Nq egzbQWOrVLGIhPaG8W3CLbjuussOh9orsoNi2UwM4GImgz24CDuNf6n8XICheY8r 6PH+lro/x72SC4e7FNhGbxMc4MGK90wiNNBMSKqBUgQYMbxWPEfK3irIwrvlGPrC E3tmaQSbl6zSQ/b6SSsu4tvg2JldulXQ9a+uYc+dQ0HWf5CtGqyaBYuZ4zRqCGNk ualHxIPQKvlp0/bjWRvVTBXlWrnih/RDUT2AFT044L1R/s7ICz4tZE/xZNS40Cy6 aZ0Ce0JrdsKpWSlWxu0o =tEa/ -----END PGP SIGNATURE----- Merge tag 'late-dt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC late changes from Kevin Hilman: "These are changes that arrived a little late but were considered self-contained enough to still go in for v3.14. They are all device tree updtes this time around, and mainly for Broadcom SoCs" * tag 'late-dt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: moxart: move fixed rate clock child node to board level dts clk: bcm281xx: define kona clock binding ARM: dts: add usb udc support to bcm281xx ARM: dts: Specify clocks for timer on bcm11351 Documentation: dt: kona-timer: Add clocks property ARM: dts: Specify clocks for SDHCIs on bcm11351 Documentation: dt: kona-sdhci: Add clocks property ARM: dts: Specify clocks for UARTs on bcm11351 ARM: dts: bcm281xx: Add i2c busses ARM: dts: Declare clocks as fixed on bcm11351 ARM: dts: bcm28155-ap: Enable all the i2c busses	2014-01-30 18:08:27 -08:00
Linus Torvalds	cdfc83075f	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS updates from Ralf Baechle: "The most notable new addition inside this pull request is the support for MIPS's latest and greatest core called "inter/proAptiv". The patch series describes this core as follows. "The interAptiv is a power-efficient multi-core microprocessor for use in system-on-chip (SoC) applications. The interAptiv combines a multi-threading pipeline with a coherence manager to deliver improved computational throughput and power efficiency. The interAptiv can contain one to four MIPS32R3 interAptiv cores, system level coherence manager with L2 cache, optional coherent I/O port, and optional floating point unit." The platform specific patches touch all 3 Broadcom families. It adds support for the new Broadcom/Netlogix XLP9xx Soc, building a common BCM63XX SMP kernel for all BCM63XX SoCs regardless of core type/count and full gpio button/led descriptions for BCM47xx. The rest of the series are cleanups and bug fixes that are MIPS generic and consist largely of changes that Imgtec/MIPS had published in their linux-mti-3.10.git stable tree. Random other cleanups and patches preparing code to be merged in 3.15" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (139 commits) mips: select ARCH_MIGHT_HAVE_PC_SERIO mips: delete non-required instances of include <linux/init.h> MIPS: KVM: remove shadow_tlb code MIPS: KVM: use common EHINV aware UNIQUE_ENTRYHI mips/ide: flush dcache also if icache does not snoop dcache MIPS: BCM47XX: fix position of cpu_wait disabling MIPS: BCM63XX: select correct MIPS_L1_CACHE_SHIFT value MIPS: update MIPS_L1_CACHE_SHIFT based on MIPS_L1_CACHE_SHIFT_<N> MIPS: introduce MIPS_L1_CACHE_SHIFT_<N> MIPS: ZBOOT: gather string functions into string.c arch/mips/pci: don't check resource with devm_ioremap_resource arch/mips/lantiq/xway: don't check resource with devm_ioremap_resource bcma: gpio: don't cast u32 to unsigned long ssb: gpio: add own IRQ domain MIPS: BCM47XX: fix sparse warnings in board.c MIPS: BCM47XX: add board detection for Linksys WRT54GS V1 MIPS: BCM47XX: fix detection for some boards MIPS: BCM47XX: Enable buttons support on SSB MIPS: BCM47XX: Convert WNDR4500 to new syntax MIPS: BCM47XX: Use "timer" trigger for status LEDs ...	2014-01-30 17:20:32 -08:00
Linus Torvalds	04a24ae45d	OpenRISC updates for 3.14 The interesting change here is a rework of the OpenRISC signal handling to make it more like other architectures in the hopes that this makes it easier for others to comment on and understand. This rework fixes some real bugs, like the fact that syscall restart did not work reliably. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEABECAAYFAlLqR+0ACgkQ70gcjN2673PA4ACePPK0tWp0swxAezjwq1KzXfVi xMcAn3Y/z7PXximluzeThTeqjeSz/wJ4 =DDIb -----END PGP SIGNATURE----- Merge tag 'for-3.14' of git://openrisc.net/~jonas/linux Pull OpenRISC updates from Jonas Bonn: "The interesting change here is a rework of the OpenRISC signal handling to make it more like other architectures in the hopes that this makes it easier for others to comment on and understand. This rework fixes some real bugs, like the fact that syscall restart did not work reliably" * tag 'for-3.14' of git://openrisc.net/~jonas/linux: openrisc: Use get_signal() signal_setup_done() openrisc: Rework signal handling	2014-01-30 17:08:41 -08:00
Linus Torvalds	4bcec913d0	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull more powerpc bits from Ben Herrenschmidt: "Here are a few more powerpc bits for this merge window. The bulk is made of two pull requests from Scott and Anatolij that I had missed previously (they arrived while I was away). Since both their branches are in -next independently, and the content has been around for a little while, they can still go in. The rest is mostly bug and regression fixes, a small series of cleanups to our pseries cpuidle code (including moving it to the right place), and one new cpuidle bakend for the powernv platform. I also wired up the new sched_attr syscalls" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (37 commits) powerpc: Wire up sched_setattr and sched_getattr syscalls powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var powerpc: Make sure "cache" directory is removed when offlining cpu powerpc/mm: Fix mmap errno when MAP_FIXED is set and mapping exceeds the allowed address space powerpc/powernv/cpuidle: Back-end cpuidle driver for powernv platform. powerpc/pseries/cpuidle: smt-snooze-delay cleanup. powerpc/pseries/cpuidle: Remove MAX_IDLE_STATE macro. powerpc/pseries/cpuidle: Make cpuidle-pseries backend driver a non-module. powerpc/pseries/cpuidle: Use cpuidle_register() for initialisation. powerpc/pseries/cpuidle: Move processor_idle.c to drivers/cpuidle. powerpc: Fix 32-bit frames for signals delivered when transactional powerpc/iommu: Fix initialisation of DART iommu table powerpc/numa: Fix decimal permissions powerpc/mm: Fix compile error of pgtable-ppc64.h powerpc: Fix hw breakpoints on !HAVE_HW_BREAKPOINT configurations clk: corenet: Adds the clock binding powerpc/booke64: Guard e6500 tlb handler with CONFIG_PPC_FSL_BOOK3E powerpc/512x: dts: add MPC5125 clock specs powerpc/512x: clk: support MPC5121/5123/5125 SoC variants powerpc/512x: clk: enforce even SDHC divider values ...	2014-01-30 17:07:18 -08:00
Linus Torvalds	03c7287dd2	Merge branch 'drop-time' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull __TIME__/__DATE__ removal from Michal Marek: "This series by Josh finishes the removal of __DATE__ and __TIME__ from the kernel. The last patch adds -Werror=date-time to KBUILD_CFLAGS to stop these from reappearing. Part of the series went through Greg's trees during this merge window, which is why this pull request is not based on v3.13-rc1" * 'drop-time' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: Makefile: Build with -Werror=date-time if the compiler supports it x86: math-emu: Drop already-disabled print of build date net: wireless: brcm80211: Drop debug version with build date/time mtd: denali: Drop print of build date/time	2014-01-30 17:00:35 -08:00
Linus Torvalds	597690cd02	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild changes from Michal Marek: - fix make -s detection with make-4.0 - fix for scripts/setlocalversion when the kernel repository is a submodule - do not hardcode ';' in macros that expand to assembler code, as some architectures' assemblers use a different character for newline - Fix passing --gdwarf-2 to the assembler * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: frv: Remove redundant debugging info flag mn10300: Remove redundant debugging info flag kbuild: Fix debugging info generation for .S files arch: use ASM_NL instead of ';' for assembler new line character in the macro kbuild: Fix silent builds with make-4 Fix detectition of kernel git repository in setlocalversion script [take #2]	2014-01-30 16:58:05 -08:00
Vladimir Davydov	7c094fd698	memcg: fix mutex not unlocked on memcg_create_kmem_cache fail path Commit `842e287369` ("memcg: get rid of kmem_cache_dup()") introduced a mutex for memcg_create_kmem_cache() to protect the tmp_name buffer that holds the memcg name. It failed to unlock the mutex if this buffer could not be allocated. This patch fixes the issue by appropriately unlocking the mutex if the allocation fails. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Glauber Costa <glommer@parallels.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
Richard Yao	46bf16c44b	Documentation/filesystems/vfs.txt: update file_operations documentation ->readv, ->writev and ->sendfile have been removed while ->show_fdinfo has been added. The documentation should reflect this. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
David Rientjes	778c14affa	mm, oom: base root bonus on current usage A 3% of system memory bonus is sometimes too excessive in comparison to other processes. With commit `a63d83f427` ("oom: badness heuristic rewrite"), the OOM killer tries to avoid killing privileged tasks by subtracting 3% of overall memory (system or cgroup) from their per-task consumption. But as a result, all root tasks that consume less than 3% of overall memory are considered equal, and so it only takes 33+ privileged tasks pushing the system out of memory for the OOM killer to do something stupid and kill dhclient or other root-owned processes. For example, on a 32G machine it can't tell the difference between the 1M agetty and the 10G fork bomb member. The changelog describes this 3% boost as the equivalent to the global overcommit limit being 3% higher for privileged tasks, but this is not the same as discounting 3% of overall memory from _every privileged task individually_ during OOM selection. Replace the 3% of system memory bonus with a 3% of current memory usage bonus. By giving root tasks a bonus that is proportional to their actual size, they remain comparable even when relatively small. In the example above, the OOM killer will discount the 1M agetty's 256 badness points down to 179, and the 10G fork bomb's 262144 points down to 183500 points and make the right choice, instead of discounting both to 0 and killing agetty because it's first in the task list. Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
Andrey Vagin	24f91eba18	mm: don't lose the SOFT_DIRTY flag on mprotect The SOFT_DIRTY bit shows that the content of memory was changed after a defined point in the past. mprotect() doesn't change the content of memory, so it must not change the SOFT_DIRTY bit. This bug causes a malfunction: on the first iteration all pages are dumped. On other iterations only pages with the SOFT_DIRTY bit are dumped. So if the SOFT_DIRTY bit is cleared from a page by mistake, the page is not dumped and its content will be restored incorrectly. This patch does nothing with _PAGE_SWP_SOFT_DIRTY, becase pte_modify() is called only for present pages. Fixes commit `0f8975ec4d` ("mm: soft-dirty bits for user memory changes tracking"). Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Borislav Petkov <bp@suse.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
Dave Hansen	a03208652d	mm/slub.c: fix page->_count corruption (again) Commit `abca7c4965` ("mm: fix slab->page _count corruption when using slub") notes that we can not _set_ a page->counters directly, except when using a real double-cmpxchg. Doing so can lose updates to ->_count. That is an absolute rule: You may not set page->counters except via a cmpxchg. Commit `abca7c4965` fixed this for the folks who have the slub cmpxchg_double code turned off at compile time, but it left the bad case alone. It can still be reached, and the same bug triggered in two cases: 1. Turning on slub debugging at runtime, which is available on the distro kernels that I looked at. 2. On 64-bit CPUs with no CMPXCHG16B (some early AMD x86-64 cpus, evidently) There are at least 3 ways we could fix this: 1. Take all of the exising calls to cmpxchg_double_slab() and __cmpxchg_double_slab() and convert them to take an old, new and target 'struct page'. 2. Do (1), but with the newly-introduced 'slub_data'. 3. Do some magic inside the two cmpxchg...slab() functions to pull the counters out of new_counters and only set those fields in page->{inuse,frozen,objects}. I've done (2) as well, but it's a bunch more code. This patch is an attempt at (3). This was the most straightforward and foolproof way that I could think to do this. This would also technically allow us to get rid of the ugly #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) in 'struct page', but leaving it alone has the added benefit that 'counters' stays 'unsigned' instead of 'unsigned long', so all the copies that the slub code does stay a bit smaller. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
David Rientjes	8790c71a18	mm/mempolicy.c: fix mempolicy printing in numa_maps As a result of commit `5606e3877a` ("mm: numa: Migrate on reference policy"), /proc/<pid>/numa_maps prints the mempolicy for any <pid> as "prefer:N" for the local node, N, of the process reading the file. This should only be printed when the mempolicy of <pid> is MPOL_PREFERRED for node N. If the process is actually only using the default mempolicy for local node allocation, make sure "default" is printed as expected. Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Robert Lippert <rlippert@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> [3.7+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
Minchan Kim	e46e33152e	zram: remove zram->lock in read path and change it with mutex Finally, we separated zram->lock dependency from 32bit stat/ table handling so there is no reason to use rw_semaphore between read and write path so this patch removes the lock from read path totally and changes rw_semaphore with mutex. So, we could do old: read-read: OK read-write: NO write-write: NO Now: read-read: OK read-write: OK write-write: NO The below data proves mixed workload performs well 11 times and there is also enhance on write-write path because current rw-semaphore doesn't support SPIN_ON_OWNER. It's side effect but anyway good thing for us. Write-related tests perform better (from 61% to 1058%) but read path has good/bad(from -2.22% to 1.45%) but they are all marginal within stddev. CPU 12 iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0 ==Initial write ==Initial write records: 10 records: 10 avg: 516189.16 avg: 839907.96 std: 22486.53 (4.36%) std: 47902.17 (5.70%) max: 546970.60 max: 909910.35 min: 481131.54 min: 751148.38 ==Rewrite ==Rewrite records: 10 records: 10 avg: 509527.98 avg: 1050156.37 std: 45799.94 (8.99%) std: 40695.44 (3.88%) max: 611574.27 max: 1111929.26 min: 443679.95 min: 980409.62 ==Read ==Read records: 10 records: 10 avg: 4408624.17 avg: 4472546.76 std: 281152.61 (6.38%) std: 163662.78 (3.66%) max: 4867888.66 max: 4727351.03 min: 4058347.69 min: 4126520.88 ==Re-read ==Re-read records: 10 records: 10 avg: 4462147.53 avg: 4363257.75 std: 283546.11 (6.35%) std: 247292.63 (5.67%) max: 4912894.44 max: 4677241.75 min: 4131386.50 min: 4035235.84 ==Reverse Read ==Reverse Read records: 10 records: 10 avg: 4565865.97 avg: 4485818.08 std: 313395.63 (6.86%) std: 248470.10 (5.54%) max: 5232749.16 max: 4789749.94 min: 4185809.62 min: 3963081.34 ==Stride read ==Stride read records: 10 records: 10 avg: 4515981.80 avg: 4418806.01 std: 211192.32 (4.68%) std: 212837.97 (4.82%) max: 4889287.28 max: 4686967.22 min: 4210362.00 min: 4083041.84 ==Random read ==Random read records: 10 records: 10 avg: 4410525.23 avg: 4387093.18 std: 236693.22 (5.37%) std: 235285.23 (5.36%) max: 4713698.47 max: 4669760.62 min: 4057163.62 min: 3952002.16 ==Mixed workload ==Mixed workload records: 10 records: 10 avg: 243234.25 avg: 2818677.27 std: 28505.07 (11.72%) std: 195569.70 (6.94%) max: 288905.23 max: 3126478.11 min: 212473.16 min: 2484150.69 ==Random write ==Random write records: 10 records: 10 avg: 555887.07 avg: 1053057.79 std: 70841.98 (12.74%) std: 35195.36 (3.34%) max: 683188.28 max: 1096125.73 min: 437299.57 min: 992481.93 ==Pwrite ==Pwrite records: 10 records: 10 avg: 501745.93 avg: 810363.09 std: 16373.54 (3.26%) std: 19245.01 (2.37%) max: 518724.52 max: 833359.70 min: 464208.73 min: 765501.87 ==Pread ==Pread records: 10 records: 10 avg: 4539894.60 avg: 4457680.58 std: 197094.66 (4.34%) std: 188965.60 (4.24%) max: 4877170.38 max: 4689905.53 min: 4226326.03 min: 4095739.72 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:56 -08:00
Minchan Kim	f614a9f48d	zram: remove workqueue for freeing removed pending slot Commit `a0c516cbfc` ("zram: don't grab mutex in zram_slot_free_noity") introduced free request pending code to avoid scheduling by mutex under spinlock and it was a mess which made code lenghty and increased overhead. Now, we don't need zram->lock any more to free slot so this patch reverts it and then, tb_lock should protect it. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	92967471b6	zram: introduce zram->tb_lock Currently, the zram table is protected by zram->lock but it's rather coarse-grained lock and it makes hard for scalibility. Let's use own rwlock instead of depending on zram->lock. This patch adds new locking so obviously, it would make slow but this patch is just prepartion for removing coarse-grained rw_semaphore(ie, zram->lock) which is hurdle about zram scalability. Final patch in this patchset series will remove the lock from read-path and change rw_semaphore with mutex in write path. With bonus, we could drop pending slot free mess in next patch. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	deb0bdeb2f	zram: use atomic operation for stat Some of fields in zram->stats are protected by zram->lock which is rather coarse-grained so let's use atomic operation without explict locking. This patch is ready for removing dependency of zram->lock in read path which is very coarse-grained rw_semaphore. Of course, this patch adds new atomic operation so it might make slow but my 12CPU test couldn't spot any regression. All gain/lose is marginal within stddev. iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0 ==Initial write ==Initial write records: 50 records: 50 avg: 412875.17 avg: 415638.23 std: 38543.12 (9.34%) std: 36601.11 (8.81%) max: 521262.03 max: 502976.72 min: 343263.13 min: 351389.12 ==Rewrite ==Rewrite records: 50 records: 50 avg: 416640.34 avg: 397914.33 std: 60798.92 (14.59%) std: 46150.42 (11.60%) max: 543057.07 max: 522669.17 min: 304071.67 min: 316588.77 ==Read ==Read records: 50 records: 50 avg: 4147338.63 avg: 4070736.51 std: 179333.25 (4.32%) std: 223499.89 (5.49%) max: 4459295.28 max: 4539514.44 min: 3753057.53 min: 3444686.31 ==Re-read ==Re-read records: 50 records: 50 avg: 4096706.71 avg: 4117218.57 std: 229735.04 (5.61%) std: 171676.25 (4.17%) max: 4430012.09 max: 4459263.94 min: 2987217.80 min: 3666904.28 ==Reverse Read ==Reverse Read records: 50 records: 50 avg: 4062763.83 avg: 4078508.32 std: 186208.46 (4.58%) std: 172684.34 (4.23%) max: 4401358.78 max: 4424757.22 min: 3381625.00 min: 3679359.94 ==Stride read ==Stride read records: 50 records: 50 avg: 4094933.49 avg: 4082170.22 std: 185710.52 (4.54%) std: 196346.68 (4.81%) max: 4478241.25 max: 4460060.97 min: 3732593.23 min: 3584125.78 ==Random read ==Random read records: 50 records: 50 avg: 4031070.04 avg: 4074847.49 std: 192065.51 (4.76%) std: 206911.33 (5.08%) max: 4356931.16 max: 4399442.56 min: 3481619.62 min: 3548372.44 ==Mixed workload ==Mixed workload records: 50 records: 50 avg: 149925.73 avg: 149675.54 std: 7701.26 (5.14%) std: 6902.09 (4.61%) max: 191301.56 max: 175162.05 min: 133566.28 min: 137762.87 ==Random write ==Random write records: 50 records: 50 avg: 404050.11 avg: 393021.47 std: 58887.57 (14.57%) std: 42813.70 (10.89%) max: 601798.09 max: 524533.43 min: 325176.99 min: 313255.34 ==Pwrite ==Pwrite records: 50 records: 50 avg: 411217.70 avg: 411237.96 std: 43114.99 (10.48%) std: 33136.29 (8.06%) max: 530766.79 max: 471899.76 min: 320786.84 min: 317906.94 ==Pread ==Pread records: 50 records: 50 avg: 4154908.65 avg: 4087121.92 std: 151272.08 (3.64%) std: 219505.04 (5.37%) max: 4459478.12 max: 4435857.38 min: 3730512.41 min: 3101101.67 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	874e3cddc3	zram: remove unnecessary free Commit `a0c516cbfc` ("zram: don't grab mutex in zram_slot_free_noity") introduced pending zram slot free in zram's write path in case of missing slot free by memory allocation failure in zram_slot_free_notify but it is not necessary because we have already freed the slot right before overwriting. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	9b353db16d	zram: delay pending free request in read path Sergey reported we don't need to handle pending free request every I/O so that this patch removes it in read path while we remain it in write path. Let's consider below example. Swap subsystem ask to zram "A" block free by swap_slot_free_notify but zram had been pended it without real freeing. Swap subsystem allocates "A" block for new data but request pended for a long time just handled and zram blindly free new data on the "A" block. :( That's why we couldn't remove handle pending free request right before zram-write. Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	da4a04126b	zram: fix race between reset and flushing pending work Dan and Sergey reported that there is a racy between reset and flushing of pending work so that it could make oops by freeing zram->meta in reset while zram_slot_free can access zram->meta if new request is adding during the race window. This patch moves flush after taking init_lock so it prevents new request so that it closes the race. Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	eae70d0684	zsmalloc: add maintainers tAdd adds maintainer information for zsmalloc into the MAINTAINERS file. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	6920f2cc9e	zram: add zram maintainers Add maintainer information for zram into the MAINTAINERS file. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	31fc00bb78	zsmalloc: add copyright Add my copyright to the zsmalloc source code which I maintain. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00
Minchan Kim	7bfb3de8a1	zram: add copyright Add my copyright to the zram source code which I maintain. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-30 16:56:55 -08:00

1 2 3 4 5 ...

425939 Коммитов Все ветки Поиск

425939 Коммитов

Все ветки