WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Andrey Ryabinin	eae08dcab8	kasan/tests: add tests for user memory access functions Add some tests for the newly-added user memory access API. Link: http://lkml.kernel.org/r/1462538722-1574-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-20 17:58:30 -07:00
Andrey Ryabinin	1771c6e1a5	x86/kasan: instrument user memory access API Exchange between user and kernel memory is coded in assembly language. Which means that such accesses won't be spotted by KASAN as a compiler instruments only C code. Add explicit KASAN checks to user memory access API to ensure that userspace writes to (or reads from) a valid kernel memory. Note: Unlike others strncpy_from_user() is written mostly in C and KASAN sees memory accesses in it. However, it makes sense to add explicit check for all @count bytes that potentially could be written to the kernel. [aryabinin@virtuozzo.com: move kasan check under the condition] Link: http://lkml.kernel.org/r/1462869209-21096-1-git-send-email-aryabinin@virtuozzo.com Link: http://lkml.kernel.org/r/1462538722-1574-4-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-20 17:58:30 -07:00
Alexander Potapenko	96fe805fb6	mm, kasan: add a ksize() test Add a test that makes sure ksize() unpoisons the whole chunk. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Konstantin Serebryany <kcc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-20 17:58:30 -07:00
Linus Torvalds	a05a70db34	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - fsnotify fix - poll() timeout fix - a few scripts/ tweaks - debugobjects updates - the (small) ocfs2 queue - Minor fixes to kernel/padata.c - Maybe half of the MM queue * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits) mm, page_alloc: restore the original nodemask if the fast path allocation failed mm, page_alloc: uninline the bad page part of check_new_page() mm, page_alloc: don't duplicate code in free_pcp_prepare mm, page_alloc: defer debugging checks of pages allocated from the PCP mm, page_alloc: defer debugging checks of freed pages until a PCP drain cpuset: use static key better and convert to new API mm, page_alloc: inline pageblock lookup in page free fast paths mm, page_alloc: remove unnecessary variable from free_pcppages_bulk mm, page_alloc: pull out side effects from free_pages_check mm, page_alloc: un-inline the bad part of free_pages_check mm, page_alloc: check multiple page fields with a single branch mm, page_alloc: remove field from alloc_context mm, page_alloc: avoid looking up the first zone in a zonelist twice mm, page_alloc: shortcut watermark checks for order-0 pages mm, page_alloc: reduce cost of fair zone allocation policy retry mm, page_alloc: shorten the page allocator fast path mm, page_alloc: check once if a zone has isolated pageblocks mm, page_alloc: move __GFP_HARDWALL modifications out of the fastpath mm, page_alloc: simplify last cpupid reset mm, page_alloc: remove unnecessary initialisation from __alloc_pages_nodemask() ...	2016-05-19 20:00:06 -07:00
Andrew Morton	0edaf86cf1	include/linux/nodemask.h: create next_node_in() helper Lots of code does node = next_node(node, XXX); if (node == MAX_NUMNODES) node = first_node(XXX); so create next_node_in() to do this and use it in various places. [mhocko@suse.com: use next_node_in() helper] Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@kernel.org> Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Hui Zhu <zhuhui@xiaomi.com> Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
Du, Changbin	b9fdac7f66	debugobjects: insulate non-fixup logic related to static obj from fixup callbacks When activating a static object we need make sure that the object is tracked in the object tracker. If it is a non-static object then the activation is illegal. In previous implementation, each subsystem need take care of this in their fixup callbacks. Actually we can put it into debugobjects core. Thus we can save duplicated code, and have pure fixup callbacks. To achieve this, a new callback "is_static_object" is introduced to let the type specific code decide whether a object is static or not. If yes, we take it into object tracker, otherwise give warning and invoke fixup callback. This change has paassed debugobjects selftest, and I also do some test with all debugobjects supports enabled. At last, I have a concern about the fixups that can it change the object which is in incorrect state on fixup? Because the 'addr' may not point to any valid object if a non-static object is not tracked. Then Change such object can overwrite someone's memory and cause unexpected behaviour. For example, the timer_fixup_activate bind timer to function stub_timer. Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com [changbin.du@intel.com: improve code comments where invoke the new is_static_object callback] Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
Du, Changbin	d99b1d8912	percpu_counter: update debugobjects fixup callbacks return type Update the return type to use bool instead of int, corresponding to cheange (debugobjects: make fixup functions return bool instead of int). Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
Du, Changbin	e7a8e78bd4	debugobjects: correct the usage of fixup call results If debug_object_fixup() return non-zero when problem has been fixed. But the code got it backwards, it taks 0 as fixup successfully. So fix it. Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
Du, Changbin	b1e4d9d82d	debugobjects: make fixup functions return bool instead of int I am going to introduce debugobjects infrastructure to USB subsystem. But before this, I found the code of debugobjects could be improved. This patchset will make fixup functions return bool type instead of int. Because fixup only need report success or no. boolean is the 'real' type. This patch (of 7): The object debugging infrastructure core provides some fixup callbacks for the subsystem who use it. These callbacks are called from the debug code whenever a problem in debug_object_init is detected. And debugobjects core suppose them returns 1 when the fixup was successful, otherwise 0. So the return type is boolean. A bad thing is that debug_object_fixup use the return value for arithmetic operation. It confused me that what is the reall return type. Reading over the whole code, I found some place do use the return value incorrectly(see next patch). So why use bool type instead? Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
Linus Torvalds	f4f27d0028	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing of modules and firmware to be loaded from a specific device (this is from ChromeOS, where the device as a whole is verified cryptographically via dm-verity). This is disabled by default but can be configured to be enabled by default (don't do this if you don't know what you're doing). - Keys: allow authentication data to be stored in an asymmetric key. Lots of general fixes and updates. - SELinux: add restrictions for loading of kernel modules via finit_module(). Distinguish non-init user namespace capability checks. Apply execstack check on thread stacks" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits) LSM: LoadPin: provide enablement CONFIG Yama: use atomic allocations when reporting seccomp: Fix comment typo ima: add support for creating files using the mknodat syscall ima: fix ima_inode_post_setattr vfs: forbid write access when reading a file into memory fs: fix over-zealous use of "const" selinux: apply execstack check on thread stacks selinux: distinguish non-init user namespace capability checks LSM: LoadPin for kernel file loading restrictions fs: define a string representation of the kernel_read_file_id enumeration Yama: consolidate error reporting string_helpers: add kstrdup_quotable_file string_helpers: add kstrdup_quotable_cmdline string_helpers: add kstrdup_quotable selinux: check ss_initialized before revalidating an inode label selinux: delay inode label lookup as long as possible selinux: don't revalidate an inode's label when explicitly setting it selinux: Change bool variable name to index. KEYS: Add KEYCTL_DH_COMPUTE command ...	2016-05-19 09:21:36 -07:00
Linus Torvalds	675e0655c1	SCSI misc on 20160517 This patch includes the usual quota of driver updates (bnx2fc, mp3sas, hpsa, ncr5380, lpfc, hisi_sas, snic, aacraid, megaraid_sas) there's also a multiqueue update for scsi_debug, assorted bug fixes and a few other minor updates (refactor of scsi_sg_pools into generic code, alua and VPD updates, and struct timeval conversions). Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJXO8W0AAoJEDeqqVYsXL0MW24H/jGWwfjsDUiSsLwbLca6DWu8 ZCWZ7rSZ27CApwGPgZGpLvUg+vpW8Ykm2zdeBnlZ6ScXS+dT3uo/PHsnemsTextj 6glQNIOFY0Ja2GwkkN00M6IZQhTJ628cqJKIEJxC68lIw16wiOwjZaK68GMrusDO Sl062rkuLR6Jb2T+YoT/sD8jQfWlSj2V9e9rqJoS/rIbS6B+hUipuybz2yQ2yK2u XFc30yal9oVz1fHEoh2O8aqckW3/iskukVXVuZ0MQzT/lV/bm9I6AnWVHw7d0Yhp ZELjXpjx5M2Z/d8k0Wvx1e25oL/ERwa96yLnTvRcqyF5Yt1EgAhT+jKvo4pnGr8= =L6y/ -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "First round of SCSI updates for the 4.6+ merge window. This batch includes the usual quota of driver updates (bnx2fc, mp3sas, hpsa, ncr5380, lpfc, hisi_sas, snic, aacraid, megaraid_sas). There's also a multiqueue update for scsi_debug, assorted bug fixes and a few other minor updates (refactor of scsi_sg_pools into generic code, alua and VPD updates, and struct timeval conversions)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (138 commits) mpt3sas: Used "synchronize_irq()"API to synchronize timed-out IO & TMs mpt3sas: Set maximum transfer length per IO to 4MB for VDs mpt3sas: Updating mpt3sas driver version to 13.100.00.00 mpt3sas: Fix initial Reference tag field for 4K PI drives. mpt3sas: Handle active cable exception event mpt3sas: Update MPI header to 2.00.42 Revert "lpfc: Delete unnecessary checks before the function call mempool_destroy" eata_pio: missing break statement hpsa: Fix type ZBC conditional checks scsi_lib: Decode T10 vendor IDs scsi_dh_alua: do not fail for unknown VPD identification scsi_debug: use locally assigned naa scsi_debug: uuid for lu name scsi_debug: vpd and mode page work scsi_debug: add multiple queue support bfa: fix bfa_fcb_itnim_alloc() error handling megaraid_sas: Downgrade two success messages to info cxlflash: Fix to resolve dead-lock during EEH recovery scsi_debug: rework resp_report_luns scsi_debug: use pdt constants ...	2016-05-18 16:38:59 -07:00
Linus Torvalds	69370471d0	Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter cleanups from Al Viro. * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fold checks into iterate_and_advance() rw_verify_area(): saner calling conventions aio: remove a pointless assignment	2016-05-18 11:46:23 -07:00
James Bottomley	e7ca7f9fa2	Merge branch 'fixes' into misc	2016-05-17 21:12:50 -04:00
Linus Torvalds	a7fd20d1c4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support SPI based w5100 devices, from Akinobu Mita. 2) Partial Segmentation Offload, from Alexander Duyck. 3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE. 4) Allow cls_flower stats offload, from Amir Vadai. 5) Implement bpf blinding, from Daniel Borkmann. 6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is actually using FASYNC these atomics are superfluous. From Eric Dumazet. 7) Run TCP more preemptibly, also from Eric Dumazet. 8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e driver, from Gal Pressman. 9) Allow creating ppp devices via rtnetlink, from Guillaume Nault. 10) Improve BPF usage documentation, from Jesper Dangaard Brouer. 11) Support tunneling offloads in qed, from Manish Chopra. 12) aRFS offloading in mlx5e, from Maor Gottlieb. 13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo Leitner. 14) Add MSG_EOR support to TCP, this allows controlling packet coalescing on application record boundaries for more accurate socket timestamp sampling. From Martin KaFai Lau. 15) Fix alignment of 64-bit netlink attributes across the board, from Nicolas Dichtel. 16) Per-vlan stats in bridging, from Nikolay Aleksandrov. 17) Several conversions of drivers to ethtool ksettings, from Philippe Reynes. 18) Checksum neutral ILA in ipv6, from Tom Herbert. 19) Factorize all of the various marvell dsa drivers into one, from Vivien Didelot 20) Add VF support to qed driver, from Yuval Mintz" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits) Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m" Revert "phy dp83867: Make rgmii parameters optional" r8169: default to 64-bit DMA on recent PCIe chips phy dp83867: Make rgmii parameters optional phy dp83867: Fix compilation with CONFIG_OF_MDIO=m bpf: arm64: remove callee-save registers use for tmp registers asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions switchdev: pass pointer to fib_info instead of copy net_sched: close another race condition in tcf_mirred_release() tipc: fix nametable publication field in nl compat drivers: net: Don't print unpopulated net_device name qed: add support for dcbx. ravb: Add missing free_irq() calls to ravb_close() qed: Remove a stray tab net: ethernet: fec-mpc52xx: use phy_ethtool_{get\|set}_link_ksettings net: ethernet: fec-mpc52xx: use phydev from struct net_device bpf, doc: fix typo on bpf_asm descriptions stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set net: ethernet: fs-enet: use phy_ethtool_{get\|set}_link_ksettings net: ethernet: fs-enet: use phydev from struct net_device ...	2016-05-17 16:26:30 -07:00
Linus Torvalds	9a07a79684	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Crypto self tests can now be disabled at boot/run time. - Add async support to algif_aead. Algorithms: - A large number of fixes to MPI from Nicolai Stange. - Performance improvement for HMAC DRBG. Drivers: - Use generic crypto engine in omap-des. - Merge ppc4xx-rng and crypto4xx drivers. - Fix lockups in sun4i-ss driver by disabling IRQs. - Add DMA engine support to ccp. - Reenable talitos hash algorithms. - Add support for Hisilicon SoC RNG. - Add basic crypto driver for the MXC SCC. Others: - Do not allocate crypto hash tfm in NORECLAIM context in ecryptfs" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits) crypto: qat - change the adf_ctl_stop_devices to void crypto: caam - fix caam_jr_alloc() ret code crypto: vmx - comply with ABIs that specify vrsave as reserved. crypto: testmgr - Add a flag allowing the self-tests to be disabled at runtime. crypto: ccp - constify ccp_actions structure crypto: marvell/cesa - Use dma_pool_zalloc crypto: qat - make adf_vf_isr.c dependant on IOV config crypto: qat - Fix typo in comments lib: asn1_decoder - add MODULE_LICENSE("GPL") crypto: omap-sham - Use dma_request_chan() for requesting DMA channel crypto: omap-des - Use dma_request_chan() for requesting DMA channel crypto: omap-aes - Use dma_request_chan() for requesting DMA channel crypto: omap-des - Integrate with the crypto engine framework crypto: s5p-sss - fix incorrect usage of scatterlists api crypto: s5p-sss - Fix missed interrupts when working with 8 kB blocks crypto: s5p-sss - Use common BIT macro crypto: mxc-scc - fix unwinding in mxc_scc_crypto_register() crypto: mxc-scc - signedness bugs in mxc_scc_ablkcipher_req_init() crypto: talitos - fix ahash algorithms registration crypto: ccp - Ensure all dependencies are specified ...	2016-05-17 09:33:39 -07:00
Linus Torvalds	a3871bd434	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes are: - Documentation updates, including fixes to the design-level requirements documentation and a fixed version of the design-level data-structure documentation. These fixes include removing cartoons and getting rid of the html/htmlx duplication. - Further improvements to the new-age expedited grace periods. - Miscellaneous fixes. - Torture-test changes, including a new rcuperf module for measuring RCU grace-period performance and scalability, which is useful for the expedited-grace-period changes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) rcutorture: Add boot-time adjustment of leaf fanout rcutorture: Add irqs-disabled test for call_rcu() rcutorture: Dump trace buffer upon shutdown rcutorture: Don't rebuild identical kernel rcutorture: Add OS-jitter capability documentation: Add documentation for RCU's major data structures rcutorture: Convert test duration to seconds early torture: Kill qemu, not parent process torture: Clarify refusal to run more than one torture test rcutorture: Consider FROZEN hotplug notifier transitions rcutorture: Remove redundant initialization to zero rcuperf: Do not wake up shutdown wait queue if "shutdown" is false. rcutorture: Add largish-system rcuperf scenario rcutorture: Avoid RCU CPU stall warning and RT throttling rcutorture: Add rcuperf holdoff boot parameter to reduce interference rcutorture: Make scripts analyze rcuperf trace data, if present rcutorture: Make rcuperf collect expedited event-trace data rcutorture: Print measure of batching efficiency rcutorture: Set rcuperf writer kthreads to real-time priority rcutorture: Bind rcuperf reader/writer kthreads to CPUs ...	2016-05-16 12:02:08 -07:00
Linus Torvalds	0052af4411	Merge branch 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core/lib update from Ingo Molnar: "This contains a single commit that removes an unused facility that the scheduler used to make use of" * 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lib/proportions: Remove unused code	2016-05-16 11:36:02 -07:00
Daniel Borkmann	d1c55ab5e4	bpf: prepare bpf_int_jit_compile/bpf_prog_select_runtime apis Since the blinding is strictly only called from inside eBPF JITs, we need to change signatures for bpf_int_jit_compile() and bpf_prog_select_runtime() first in order to prepare that the eBPF program we're dealing with can change underneath. Hence, for call sites, we need to return the latest prog. No functional change in this patch. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:49:32 -04:00
David S. Miller	909b27f706	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next' because we no longer have a per-netns conntrack hash. The ip_gre.c conflict as well as the iwlwifi ones were cases of overlapping changes. Conflicts: drivers/net/wireless/intel/iwlwifi/mvm/tx.c net/ipv4/ip_gre.c net/netfilter/nf_conntrack_core.c Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-15 13:32:48 -04:00
Linus Torvalds	6ba5b85fd4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Overlayfs fixes from Miklos, assorted fixes from me. Stable fodder of varying severity, all sat in -next for a while" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ovl: ignore permissions on underlying lookup vfs: add lookup_hash() helper vfs: rename: check backing inode being equal vfs: add vfs_select_inode() helper get_rock_ridge_filename(): handle malformed NM entries ecryptfs: fix handling of directory opening atomic_open(): fix the handling of create_error fix the copy vs. map logics in blk_rq_map_user_iov() do_splice_to(): cap the size before passing to ->splice_read()	2016-05-14 11:59:43 -07:00
David Howells	23c8a812dc	KEYS: Fix ASN.1 indefinite length object parsing This fixes CVE-2016-0758. In the ASN.1 decoder, when the length field of an ASN.1 value is extracted, it isn't validated against the remaining amount of data before being added to the cursor. With a sufficiently large size indicated, the check: datalen - dp < 2 may then fail due to integer overflow. Fix this by checking the length indicated against the amount of remaining data in both places a definite length is determined. Whilst we're at it, make the following changes: (1) Check the maximum size of extended length does not exceed the capacity of the variable it's being stored in (len) rather than the type that variable is assumed to be (size_t). (2) Compare the EOC tag to the symbolic constant ASN1_EOC rather than the integer 0. (3) To reduce confusion, move the initialisation of len outside of: for (len = 0; n > 0; n--) { since it doesn't have anything to do with the loop counter n. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Peter Jones <pjones@redhat.com>	2016-05-12 12:01:49 +01:00
Al Viro	e4d35be584	Merge branch 'ovl-fixes' into for-linus	2016-05-11 00:00:29 -04:00
David S. Miller	e800072c18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net In netdevice.h we removed the structure in net-next that is being changes in 'net'. In macsec.c and rtnetlink.c we have overlaps between fixes in 'net' and the u64 attribute changes in 'net-next'. The mlx5 conflicts have to do with vxlan support dependencies. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-09 15:59:24 -04:00
Al Viro	dd254f5a38	fold checks into iterate_and_advance() they are open-coded in all users except iov_iter_advance(), and there they wouldn't be a bad idea either - as it is, iov_iter_advance(i, 0) ends up dereferencing potentially past the end of iovec array. It doesn't do anything with the value it reads, and very unlikely to trigger an oops on dereference, but it is not impossible. Reported-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-09 14:04:29 -04:00
Joonsoo Kim	7c31190bcf	lib/stackdepot: avoid to return 0 handle Recently, we allow to save the stacktrace whose hashed value is 0. It causes the problem that stackdepot could return 0 even if in success. User of stackdepot cannot distinguish whether it is success or not so we need to solve this problem. In this patch, 1 bit are added to handle and make valid handle none 0 by setting this bit. After that, valid handle will not be 0 and 0 handle will represent failure correctly. Fixes: `33334e2576` ("lib/stackdepot.c: allow the stack trace hash to be zero") Link: http://lkml.kernel.org/r/1462252403-1106-1-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-05 17:38:53 -07:00
David S. Miller	cba6532100	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv4/ip_gre.c Minor conflicts between tunnel bug fixes in net and ipv6 tunnel cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-04 00:52:29 -04:00
Tudor Ambarus	ccab6058da	lib: asn1_decoder - add MODULE_LICENSE("GPL") A kernel taint results when loading the rsa_generic module: root@(none):~# modprobe rsa_generic asn1_decoder: module license 'unspecified' taints kernel. Disabling lock debugging due to kernel taint "Tainting" of the kernel is (usually) a way of indicating that a proprietary module has been inserted, which is not the case here. Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-05-03 16:10:12 +08:00
Alexander Potapenko	33334e2576	lib/stackdepot.c: allow the stack trace hash to be zero Do not bail out from depot_save_stack() if the stack trace has zero hash. Initially depot_save_stack() silently dropped stack traces with zero hashes, however there's actually no point in reserving this zero value. Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-28 19:34:04 -07:00
Ingo Molnar	41ed943d85	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: * Documentation updates, including fixes to the design-level requirements documentation and a fixed version of the design-level data-structure documentation. These fixes include removing cartoons and getting rid of the html/htmlx duplication. * Further improvements to the new-age expedited grace periods. * Miscellaneous fixes. * Torture-test changes, including a new rcuperf module for measuring RCU grace-period performance and scalability, which is useful for the expedited-grace-period changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-27 16:57:36 +02:00
Nicolas Dichtel	11a9957307	libnl: fix help of _64bit functions Fix typo and describe 'padattr'. Fixes: `089bf1a6a9` ("libnl: add more helpers to align attributes on 64-bit") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-23 20:13:24 -04:00
David S. Miller	1602f49b58	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts were two cases of simple overlapping changes, nothing serious. In the UDP case, we need to add a hlist_add_tail_rcu() to linux/rculist.h, because we've moved UDP socket handling away from using nulls lists. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-23 18:51:33 -04:00
Nicolas Dichtel	089bf1a6a9	libnl: add more helpers to align attributes on 64-bit Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-21 14:22:12 -04:00
Kees Cook	21985319ad	string_helpers: add kstrdup_quotable_file Allocate a NULL-terminated file path with special characters escaped, safe for logging. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-04-21 10:47:26 +10:00
Kees Cook	0d0443288f	string_helpers: add kstrdup_quotable_cmdline Provide an escaped (but readable: no inter-argument NULLs) commandline safe for logging. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-04-21 10:47:26 +10:00
Kees Cook	b53f27e4fa	string_helpers: add kstrdup_quotable Handle allocating and escaping a string safe for logging. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-04-21 10:47:25 +10:00
Greg Kroah-Hartman	5614e77258	Merge 4.6-rc4 into driver-core-next We want those fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-19 04:28:28 +09:00
Linus Torvalds	e1e22b27ec	"driver core" fixes for 4.6-rc4 Here are 3 small fixes 4.6-rc4. Two fix up some lz4 issues with big endian systems, and the remaining one resolves a minor debugfs issue that was reported. All have been in linux-next with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlcS3ucACgkQMUfUDdst+ynw1QCbBGgbY7Xt08whNFcAP81z1Q5X fmsAn1fyrhfsxe+JnybzswsTOjFw99Xd =YX6h -----END PGP SIGNATURE----- Merge tag 'driver-core-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull misc fixes from Greg KH: "Here are three small fixes for 4.6-rc4. Two fix up some lz4 issues with big endian systems, and the remaining one resolves a minor debugfs issue that was reported. All have been in linux-next with no reported issues" * tag 'driver-core-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: lib: lz4: cleanup unaligned access efficiency detection lib: lz4: fixed zram with lz4 on big endian machines debugfs: Make automount point inodes permanently empty	2016-04-16 20:53:50 -07:00
Ming Lin	9b1d6c8950	lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.c Now it's ready to move the mempool based SG chained allocator code from SCSI driver to lib/sg_pool.c, which will be compiled only based on a Kconfig symbol CONFIG_SG_POOL. SCSI selects CONFIG_SG_POOL. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2016-04-15 16:53:14 -04:00
Rui Salvaterra	dea5c24a14	lib: lz4: cleanup unaligned access efficiency detection These identifiers are bogus. The interested architectures should define HAVE_EFFICIENT_UNALIGNED_ACCESS whenever relevant to do so. If this isn't true for some arch, it should be fixed in the arch definition. Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-13 09:22:49 -07:00
Rui Salvaterra	3e26a691fe	lib: lz4: fixed zram with lz4 on big endian machines Based on Sergey's test patch [1], this fixes zram with lz4 compression on big endian cpus. Note that the 64-bit preprocessor test is not a cleanup, it's part of the fix, since those identifiers are bogus (for example, __ppc64__ isn't defined anywhere else in the kernel, which means we'd fall into the 32-bit definitions on ppc64). Tested on ppc64 with no regression on x86_64. [1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4 Cc: stable@vger.kernel.org Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-13 09:22:49 -07:00
James Morris	58976eef9d	Merge tag 'keys-fixes-20160412' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into for-linus	2016-04-13 11:06:52 +10:00
Nicolai Stange	9fd4dcece4	debugfs: prevent access to possibly dead file_operations at file open Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 14:14:21 -07:00
David S. Miller	ae95d71261	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-04-09 17:41:41 -04:00
Al Viro	357f435d8a	fix the copy vs. map logics in blk_rq_map_user_iov() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-04-08 19:46:28 -04:00
Naveen N. Rao	9c94f6c8e0	lib/test_bpf: Add additional BPF_ADD tests Some of these tests proved useful with the powerpc eBPF JIT port due to sign-extended 16-bit immediate loads. Though some of these aspects get covered in other tests, it is better to have explicit tests so as to quickly tag the precise problem. Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-06 16:47:51 -04:00
Naveen N. Rao	b64b50eac4	lib/test_bpf: Add test to check for result of 32-bit add that overflows BPF_ALU32 and BPF_ALU64 tests for adding two 32-bit values that results in 32-bit overflow. Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-06 16:47:51 -04:00
Naveen N. Rao	c7395d6bd7	lib/test_bpf: Add tests for unsigned BPF_JGT Unsigned Jump-if-Greater-Than. Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-06 16:47:51 -04:00
Naveen N. Rao	9f134c34fb	lib/test_bpf: Fix JMP_JSET tests JMP_JSET tests incorrectly used BPF_JNE. Fix the same. Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-06 16:47:50 -04:00
Jerome Marchand	8d4a2ec1e0	assoc_array: don't call compare_object() on a node Changes since V1: fixed the description and added KASan warning. In assoc_array_insert_into_terminal_node(), we call the compare_object() method on all non-empty slots, even when they're not leaves, passing a pointer to an unexpected structure to compare_object(). Currently it causes an out-of-bound read access in keyring_compare_object detected by KASan (see below). The issue is easily reproduced with keyutils testsuite. Only call compare_object() when the slot is a leave. KASan warning: ================================================================== BUG: KASAN: slab-out-of-bounds in keyring_compare_object+0x213/0x240 at addr ffff880060a6f838 Read of size 8 by task keyctl/1655 ============================================================================= BUG kmalloc-192 (Not tainted): kasan: bad access detected ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in assoc_array_insert+0xfd0/0x3a60 age=69 cpu=1 pid=1647 ___slab_alloc+0x563/0x5c0 __slab_alloc+0x51/0x90 kmem_cache_alloc_trace+0x263/0x300 assoc_array_insert+0xfd0/0x3a60 __key_link_begin+0xfc/0x270 key_create_or_update+0x459/0xaf0 SyS_add_key+0x1ba/0x350 entry_SYSCALL_64_fastpath+0x12/0x76 INFO: Slab 0xffffea0001829b80 objects=16 used=8 fp=0xffff880060a6f550 flags=0x3fff8000004080 INFO: Object 0xffff880060a6f740 @offset=5952 fp=0xffff880060a6e5d1 Bytes b4 ffff880060a6f730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f740: d1 e5 a6 60 00 88 ff ff 0e 00 00 00 00 00 00 00 ...`............ Object ffff880060a6f750: 02 cf 8e 60 00 88 ff ff 02 c0 8e 60 00 88 ff ff ...`.......`.... Object ffff880060a6f760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f7d0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object ffff880060a6f7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ CPU: 0 PID: 1655 Comm: keyctl Tainted: G B 4.5.0-rc4-kasan+ #291 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000000 000000001b2800b4 ffff880060a179e0 ffffffff81b60491 ffff88006c802900 ffff880060a6f740 ffff880060a17a10 ffffffff815e2969 ffff88006c802900 ffffea0001829b80 ffff880060a6f740 ffff880060a6e650 Call Trace: [<ffffffff81b60491>] dump_stack+0x85/0xc4 [<ffffffff815e2969>] print_trailer+0xf9/0x150 [<ffffffff815e9454>] object_err+0x34/0x40 [<ffffffff815ebe50>] kasan_report_error+0x230/0x550 [<ffffffff819949be>] ? keyring_get_key_chunk+0x13e/0x210 [<ffffffff815ec62d>] __asan_report_load_n_noabort+0x5d/0x70 [<ffffffff81994cc3>] ? keyring_compare_object+0x213/0x240 [<ffffffff81994cc3>] keyring_compare_object+0x213/0x240 [<ffffffff81bc238c>] assoc_array_insert+0x86c/0x3a60 [<ffffffff81bc1b20>] ? assoc_array_cancel_edit+0x70/0x70 [<ffffffff8199797d>] ? __key_link_begin+0x20d/0x270 [<ffffffff8199786c>] __key_link_begin+0xfc/0x270 [<ffffffff81993389>] key_create_or_update+0x459/0xaf0 [<ffffffff8128ce0d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff81992f30>] ? key_type_lookup+0xc0/0xc0 [<ffffffff8199e19d>] ? lookup_user_key+0x13d/0xcd0 [<ffffffff81534763>] ? memdup_user+0x53/0x80 [<ffffffff819983ea>] SyS_add_key+0x1ba/0x350 [<ffffffff81998230>] ? key_get_type_from_user.constprop.6+0xa0/0xa0 [<ffffffff828bcf4e>] ? retint_user+0x18/0x23 [<ffffffff8128cc7e>] ? trace_hardirqs_on_caller+0x3fe/0x580 [<ffffffff81004017>] ? trace_hardirqs_on_thunk+0x17/0x19 [<ffffffff828bc432>] entry_SYSCALL_64_fastpath+0x12/0x76 Memory state around the buggy address: ffff880060a6f700: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc >ffff880060a6f800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880060a6f880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880060a6f900: fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00 ================================================================== Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: stable@vger.kernel.org	2016-04-06 14:06:48 +01:00
Nicolai Stange	0bb5c9ead6	lib/mpi: mpi_read_raw_from_sgl(): fix out-of-bounds buffer access Within the copying loop in mpi_read_raw_from_sgl(), the last input SGE's byte count gets artificially extended as follows: if (sg_is_last(sg) && (len % BYTES_PER_MPI_LIMB)) len += BYTES_PER_MPI_LIMB - (len % BYTES_PER_MPI_LIMB); Within the following byte copying loop, this causes reads beyond that SGE's allocated buffer: BUG: KASAN: slab-out-of-bounds in mpi_read_raw_from_sgl+0x331/0x650 at addr ffff8801e168d4d8 Read of size 1 by task systemd-udevd/721 [...] Call Trace: [<ffffffff818c4d35>] dump_stack+0xbc/0x117 [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169 [<ffffffff814af5d1>] ? print_section+0x61/0xb0 [<ffffffff814b1109>] print_trailer+0x179/0x2c0 [<ffffffff814bc524>] object_err+0x34/0x40 [<ffffffff814bfdc7>] kasan_report_error+0x307/0x8c0 [<ffffffff814bf315>] ? kasan_unpoison_shadow+0x35/0x50 [<ffffffff814bf38e>] ? kasan_kmalloc+0x5e/0x70 [<ffffffff814c0ad1>] kasan_report+0x71/0xa0 [<ffffffff81938171>] ? mpi_read_raw_from_sgl+0x331/0x650 [<ffffffff814bf1a6>] __asan_load1+0x46/0x50 [<ffffffff81938171>] mpi_read_raw_from_sgl+0x331/0x650 [<ffffffff817f41b6>] rsa_verify+0x106/0x260 [<ffffffff817f40b0>] ? rsa_set_pub_key+0xf0/0xf0 [<ffffffff818edc79>] ? sg_init_table+0x29/0x50 [<ffffffff817f4d22>] ? pkcs1pad_sg_set_buf+0xb2/0x2e0 [<ffffffff817f5b74>] pkcs1pad_verify+0x1f4/0x2b0 [<ffffffff81831057>] public_key_verify_signature+0x3a7/0x5e0 [<ffffffff81830cb0>] ? public_key_describe+0x80/0x80 [<ffffffff817830f0>] ? keyring_search_aux+0x150/0x150 [<ffffffff818334a4>] ? x509_request_asymmetric_key+0x114/0x370 [<ffffffff814b83f0>] ? kfree+0x220/0x370 [<ffffffff818312c2>] public_key_verify_signature_2+0x32/0x50 [<ffffffff81830b5c>] verify_signature+0x7c/0xb0 [<ffffffff81835d0c>] pkcs7_validate_trust+0x42c/0x5f0 [<ffffffff813c391a>] system_verify_data+0xca/0x170 [<ffffffff813c3850>] ? top_trace_array+0x9b/0x9b [<ffffffff81510b29>] ? __vfs_read+0x279/0x3d0 [<ffffffff8129372f>] mod_verify_sig+0x1ff/0x290 [...] The exact purpose of the len extension isn't clear to me, but due to its form, I suspect that it's a leftover somehow accounting for leading zero bytes within the most significant output limb. Note however that without that len adjustement, the total number of bytes ever processed by the inner loop equals nbytes and thus, the last output limb gets written at this point. Thus the net effect of the len adjustement cited above is just to keep the inner loop running for some more iterations, namely < BYTES_PER_MPI_LIMB ones, reading some extra bytes from beyond the last SGE's buffer and discarding them afterwards. Fix this issue by purging the extension of len beyond the last input SGE's buffer length. Fixes: `2d4d1eea54` ("lib/mpi: Add mpi sgl helpers") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:51 +08:00
Nicolai Stange	85d541a3d1	lib/mpi: mpi_read_raw_from_sgl(): sanitize meaning of indices Within the byte reading loop in mpi_read_raw_sgl(), there are two housekeeping indices used, z and x. At all times, the index z represents the number of output bytes covered by the input SGEs for which processing has completed so far. This includes any leading zero bytes within the most significant limb. The index x changes its meaning after the first outer loop's iteration though: while processing the first input SGE, it represents "number of leading zero bytes in most significant output limb" + "current position within current SGE" For the remaining SGEs OTOH, x corresponds just to "current position within current SGE" After all, it is only the sum of z and x that has any meaning for the output buffer and thus, the "number of leading zero bytes in most significant output limb" part can be moved away from x into z from the beginning, opening up the opportunity for cleaner code. Before the outer loop iterating over the SGEs, don't initialize z with zero, but with the number of leading zero bytes in the most significant output limb. For the inner loop iterating over a single SGE's bytes, get rid of the buf_shift offset to x' bounds and let x run from zero to sg->length - 1. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:51 +08:00
Nicolai Stange	64c09b0b59	lib/mpi: mpi_read_raw_from_sgl(): fix nbits calculation The number of bits, nbits, is calculated in mpi_read_raw_from_sgl() as follows: nbits = nbytes * 8; Afterwards, the number of leading zero bits of the first byte get subtracted: nbits -= count_leading_zeros((u8 )(sg_virt(sgl) + lzeros)); However, count_leading_zeros() takes an unsigned long and thus, the u8 gets promoted to an unsigned long. Thus, the above doesn't subtract the number of leading zeros in the most significant nonzero input byte from nbits, but the number of leading zeros of the most significant nonzero input byte promoted to unsigned long, i.e. BITS_PER_LONG - 8 too many. Fix this by subtracting count_leading_zeros(...) - (BITS_PER_LONG - 8) from nbits only. Fixes: `2d4d1eea54` ("lib/mpi: Add mpi sgl helpers") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:51 +08:00
Nicolai Stange	60e1b74c22	lib/mpi: mpi_read_raw_from_sgl(): purge redundant clearing of nbits In mpi_read_raw_from_sgl(), unsigned nbits is calculated as follows: nbits = nbytes * 8; and redundantly cleared later on if nbytes == 0: if (nbytes > 0) ... else nbits = 0; Purge this redundant clearing for the sake of clarity. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:51 +08:00
Nicolai Stange	ab1e912ec1	lib/mpi: mpi_read_raw_from_sgl(): don't include leading zero SGEs in nbytes At the very beginning of mpi_read_raw_from_sgl(), the leading zeros of the input scatterlist are counted: lzeros = 0; for_each_sg(sgl, sg, ents, i) { ... if (/* sg contains nonzero bytes /) break; / sg contains nothing but zeros here */ ents--; lzeros = 0; } Later on, the total number of trailing nonzero bytes is calculated by subtracting the number of leading zero bytes from the total number of input bytes: nbytes -= lzeros; However, since lzeros gets reset to zero for each completely zero leading sg in the loop above, it doesn't include those. Besides wasting resources by allocating a too large output buffer, this mistake propagates into the calculation of x, the number of leading zeros within the most significant output limb: x = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB; What's more, the low order bytes of the output, equal in number to the extra bytes in nbytes, are left uninitialized. Fix this by adjusting nbytes for each completely zero leading scatterlist entry. Fixes: `2d4d1eea54` ("lib/mpi: Add mpi sgl helpers") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:50 +08:00
Nicolai Stange	b698538951	lib/mpi: mpi_read_raw_from_sgl(): replace len argument by nbytes Currently, the nbytes local variable is calculated from the len argument as follows: ... mpi_read_raw_from_sgl(..., unsigned int len) { unsigned nbytes; ... if (!ents) nbytes = 0; else nbytes = len - lzeros; ... } Given that nbytes is derived from len in a trivial way and that the len argument is shadowed by a local len variable in several loops, this is just confusing. Rename the len argument to nbytes and get rid of the nbytes local variable. Do the nbytes calculation in place. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:49 +08:00
Nicolai Stange	462696fd0f	lib/mpi: mpi_read_buffer(): fix buffer overflow Currently, mpi_read_buffer() writes full limbs to the output buffer and moves memory around to purge leading zero limbs afterwards. However, with commit `9cbe21d8f8` ("lib/mpi: only require buffers as big as needed for the integer") the caller is only required to provide a buffer large enough to hold the result without the leading zeros. This might result in a buffer overflow for small MP numbers with leading zeros. Fix this by coping the result to its final destination within the output buffer and not copying the leading zeros at all. Fixes: `9cbe21d8f8` ("lib/mpi: only require buffers as big as needed for the integer") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:49 +08:00
Nicolai Stange	90f864e200	lib/mpi: mpi_read_buffer(): replace open coded endian conversion Currently, the endian conversion from CPU order to BE is open coded in mpi_read_buffer(). Replace this by the centrally provided cpu_to_be*() macros. Copy from the temporary storage on stack to the destination buffer by means of memcpy(). Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:49 +08:00
Nicolai Stange	f00fa2417b	lib/mpi: mpi_read_buffer(): optimize skipping of leading zero limbs Currently, if the number of leading zeros is greater than fits into a complete limb, mpi_read_buffer() skips them by iterating over them limb-wise. Instead of skipping the high order zero limbs within the loop as shown above, adjust the copying loop's bounds. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:49 +08:00
Nicolai Stange	d755290689	lib/mpi: mpi_write_sgl(): replace open coded endian conversion Currently, the endian conversion from CPU order to BE is open coded in mpi_write_sgl(). Replace this by the centrally provided cpu_to_be*() macros. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:48 +08:00
Nicolai Stange	cece762f6f	lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access Within the copying loop in mpi_write_sgl(), we have if (lzeros) { mpi_limb_t limb1 = (void )p - sizeof(alimb); mpi_limb_t limb2 = (void )p - sizeof(alimb) + lzeros; limb1 = limb2; ... } where p points past the end of alimb2 which lives on the stack and contains the current limb in BE order. The purpose of the above is to shift the non-zero bytes of alimb2 to its beginning in memory, i.e. to skip its leading zero bytes. However, limb2 points somewhere into the middle of alimb2 and thus, reading *limb2 pulls in lzero bytes from somewhere. Indeed, KASAN splats: BUG: KASAN: stack-out-of-bounds in mpi_write_to_sgl+0x4e3/0x6f0 at addr ffff8800cb04f601 Read of size 8 by task systemd-udevd/391 page:ffffea00032c13c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x3fff8000000000() page dumped because: kasan: bad access detected CPU: 3 PID: 391 Comm: systemd-udevd Tainted: G B L 4.5.0-next-20160316+ #12 [...] Call Trace: [<ffffffff8194889e>] dump_stack+0xdc/0x15e [<ffffffff819487c2>] ? _atomic_dec_and_lock+0xa2/0xa2 [<ffffffff814892b5>] ? __dump_page+0x185/0x330 [<ffffffff8150ffd6>] kasan_report_error+0x5e6/0x8b0 [<ffffffff814724cd>] ? kzfree+0x2d/0x40 [<ffffffff819c5bce>] ? mpi_free_limb_space+0xe/0x20 [<ffffffff819c469e>] ? mpi_powm+0x37e/0x16f0 [<ffffffff815109f1>] kasan_report+0x71/0xa0 [<ffffffff819c0353>] ? mpi_write_to_sgl+0x4e3/0x6f0 [<ffffffff8150ed34>] __asan_load8+0x64/0x70 [<ffffffff819c0353>] mpi_write_to_sgl+0x4e3/0x6f0 [<ffffffff819bfe70>] ? mpi_set_buffer+0x620/0x620 [<ffffffff819c0e6f>] ? mpi_cmp+0xbf/0x180 [<ffffffff8186e282>] rsa_verify+0x202/0x260 What's more, since lzeros can be anything from 1 to sizeof(mpi_limb_t)-1, the above will cause unaligned accesses which is bad on non-x86 archs. Fix the issue, by preparing the starting point p for the upcoming copy operation instead of shifting the source memory, i.e. alimb2. Fixes: `2d4d1eea54` ("lib/mpi: Add mpi sgl helpers") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:48 +08:00
Nicolai Stange	ea122be0b8	lib/mpi: mpi_write_sgl(): purge redundant pointer arithmetic Within the copying loop in mpi_write_sgl(), we have if (lzeros) { ... p -= lzeros; y = lzeros; } p = p - (sizeof(alimb) - y); If lzeros == 0, then y == 0, too. Thus, lzeros gets subtracted and added back again to p. Purge this redundancy. Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:48 +08:00
Nicolai Stange	654842ef53	lib/mpi: mpi_write_sgl(): fix style issue with lzero decrement Within the copying loop in mpi_write_sgl(), we have if (lzeros > 0) { ... lzeros -= sizeof(alimb); } However, at this point, lzeros < sizeof(alimb) holds. Make this fact explicit by rewriting the above to if (lzeros) { ... lzeros = 0; } Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:46 +08:00
Nicolai Stange	f2d1362ff7	lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs Currently, if the number of leading zeros is greater than fits into a complete limb, mpi_write_sgl() skips them by iterating over them limb-wise. However, it fails to adjust its internal leading zeros tracking variable, lzeros, accordingly: it does a p -= sizeof(alimb); continue; which should really have been a lzeros -= sizeof(alimb); continue; Since lzeros never decreases if its initial value >= sizeof(alimb), nothing gets copied by mpi_write_sgl() in that case. Instead of skipping the high order zero limbs within the loop as shown above, fix the issue by adjusting the copying loop's bounds. Fixes: `2d4d1eea54` ("lib/mpi: Add mpi sgl helpers") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-04-05 20:35:46 +08:00
Bob Copeland	8f6fd83c6c	rhashtable: accept GFP flags in rhashtable_walk_init In certain cases, the 802.11 mesh pathtable code wants to iterate over all of the entries in the forwarding table from the receive path, which is inside an RCU read-side critical section. Enable walks inside atomic sections by allowing GFP_ATOMIC allocations for the walker state. Change all existing callsites to pass in GFP_KERNEL. Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Bob Copeland <me@bobcopeland.com> [also adjust gfs2/glock.c and rhashtable tests] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2016-04-05 10:56:32 +02:00
Richard Cochran	d18d12d0ff	lib/proportions: Remove unused code By accident I stumbled across code that is no longer used. According to git grep, the global functions in lib/proportions.c are not used anywhere. This patch removes the old, unused code. Peter Zijlstra further commented: "Ah indeed, that got replaced with the flex proportion code a while back." Signed-off-by: Richard Cochran <rcochran@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4265b49bed713fbe3faaf8c05da0e1792f09c0b3.1459432020.git.rcochran@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-01 08:53:49 +02:00
Paul E. McKenney	8704baab9b	rcutorture: Add RCU grace-period performance tests This commit adds a new rcuperf module that carries out simple performance tests of RCU grace periods. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2016-03-31 13:37:38 -07:00
Alexander Potapenko	9dcadd381b	kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2 Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-25 16:37:42 -07:00
Alexander Potapenko	cd11016e5f	mm, kasan: stackdepot implementation. Enable stackdepot for SLAB Implement the stack depot and provide CONFIG_STACKDEPOT. Stack depot will allow KASAN store allocation/deallocation stack traces for memory chunks. The stack traces are stored in a hash table and referenced by handles which reside in the kasan_alloc_meta and kasan_free_meta structures in the allocated memory chunks. IRQ stack traces are cut below the IRQ entry point to avoid unnecessary duplication. Right now stackdepot support is only enabled in SLAB allocator. Once KASAN features in SLAB are on par with those in SLUB we can switch SLUB to stackdepot as well, thus removing the dependency on SLUB stack bookkeeping, which wastes a lot of memory. This patch is based on the "mm: kasan: stack depots" patch originally prepared by Dmitry Chernenkov. Joonsoo has said that he plans to reuse the stackdepot code for the mm/page_owner.c debugging facility. [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t] [aryabinin@virtuozzo.com: comment style fixes] Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-25 16:37:42 -07:00
Alexander Potapenko	7ed2f9e663	mm, kasan: SLAB support Add KASAN hooks to SLAB allocator. This patch is based on the "mm: kasan: unified support for SLUB and SLAB allocators" patch originally prepared by Dmitry Chernenkov. Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-25 16:37:42 -07:00
Alexander Potapenko	e6e8379c87	kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right() This patchset implements SLAB support for KASAN Unlike SLUB, SLAB doesn't store allocation/deallocation stacks for heap objects, therefore we reimplement this feature in mm/kasan/stackdepot.c. The intention is to ultimately switch SLUB to use this implementation as well, which will save a lot of memory (right now SLUB bloats each object by 256 bytes to store the allocation/deallocation stacks). Also neither SLUB nor SLAB delay the reuse of freed memory chunks, which is necessary for better detection of use-after-free errors. We introduce memory quarantine (mm/kasan/quarantine.c), which allows delayed reuse of deallocated memory. This patch (of 7): Rename kmalloc_large_oob_right() to kmalloc_pagealloc_oob_right(), as the test only checks the page allocator functionality. Also reimplement kmalloc_large_oob_right() so that the test allocates a large enough chunk of memory that still does not trigger the page allocator fallback. Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-25 16:37:42 -07:00
Helge Deller	6c31da3464	parisc,metag: Implement CONFIG_DEBUG_STACK_USAGE option On parisc and metag the stack grows upwards, so for those we need to scan the stack downwards in order to calculate how much stack a process has used. Tested on a 64bit parisc kernel. Signed-off-by: Helge Deller <deller@gmx.de>	2016-03-23 15:44:34 +01:00
Andrey Ryabinin	dde5cf39d4	ubsan: fix tree-wide -Wmaybe-uninitialized false positives -fsanitize=* options makes GCC less smart than usual and increase number of 'maybe-uninitialized' false-positives. So this patch does two things: * Add -Wno-maybe-uninitialized to CFLAGS_UBSAN which will disable all such warnings for instrumented files. * Remove CONFIG_UBSAN_SANITIZE_ALL from all[yes\|mod]config builds. So the all[yes\|mod]config build goes without -fsanitize=* and still with -Wmaybe-uninitialized. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-22 15:36:02 -07:00
Dmitry Vyukov	5c9a8750a6	kernel: add kcov code coverage kcov provides code coverage collection for coverage-guided fuzzing (randomized testing). Coverage-guided fuzzing is a testing technique that uses coverage feedback to determine new interesting inputs to a system. A notable user-space example is AFL (http://lcamtuf.coredump.cx/afl/). However, this technique is not widely used for kernel testing due to missing compiler and kernel support. kcov does not aim to collect as much coverage as possible. It aims to collect more or less stable coverage that is function of syscall inputs. To achieve this goal it does not collect coverage in soft/hard interrupts and instrumentation of some inherently non-deterministic or non-interesting parts of kernel is disbled (e.g. scheduler, locking). Currently there is a single coverage collection mode (tracing), but the API anticipates additional collection modes. Initially I also implemented a second mode which exposes coverage in a fixed-size hash table of counters (what Quentin used in his original patch). I've dropped the second mode for simplicity. This patch adds the necessary support on kernel side. The complimentary compiler support was added in gcc revision 231296. We've used this support to build syzkaller system call fuzzer, which has found 90 kernel bugs in just 2 months: https://github.com/google/syzkaller/wiki/Found-Bugs We've also found 30+ bugs in our internal systems with syzkaller. Another (yet unexplored) direction where kcov coverage would greatly help is more traditional "blob mutation". For example, mounting a random blob as a filesystem, or receiving a random blob over wire. Why not gcov. Typical fuzzing loop looks as follows: (1) reset coverage, (2) execute a bit of code, (3) collect coverage, repeat. A typical coverage can be just a dozen of basic blocks (e.g. an invalid input). In such context gcov becomes prohibitively expensive as reset/collect coverage steps depend on total number of basic blocks/edges in program (in case of kernel it is about 2M). Cost of kcov depends only on number of executed basic blocks/edges. On top of that, kernel requires per-thread coverage because there are always background threads and unrelated processes that also produce coverage. With inlined gcov instrumentation per-thread coverage is not possible. kcov exposes kernel PCs and control flow to user-space which is insecure. But debugfs should not be mapped as user accessible. Based on a patch by Quentin Casasnovas. [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode'] [akpm@linux-foundation.org: unbreak allmodconfig] [akpm@linux-foundation.org: follow x86 Makefile layout standards] Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: syzkaller <syzkaller@googlegroups.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tavis Ormandy <taviso@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@google.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: David Drysdale <drysdale@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-22 15:36:02 -07:00
Linus Torvalds	26660a4046	Merge branch 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull 'objtool' stack frame validation from Ingo Molnar: "This tree adds a new kernel build-time object file validation feature (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation. It was written by and is maintained by Josh Poimboeuf. The motivation: there's a category of hard to find kernel bugs, most of them in assembly code (but also occasionally in C code), that degrades the quality of kernel stack dumps/backtraces. These bugs are hard to detect at the source code level. Such bugs result in incorrect/incomplete backtraces most of time - but can also in some rare cases result in crashes or other undefined behavior. The build time correctness checking is done via the new 'objtool' user-space utility that was written for this purpose and which is hosted in the kernel repository in tools/objtool/. The tool's (very simple) UI and source code design is shaped after Git and perf and shares quite a bit of infrastructure with tools/perf (which tooling infrastructure sharing effort got merged via perf and is already upstream). Objtool follows the well-known kernel coding style. Objtool does not try to check .c or .S files, it instead analyzes the resulting .o generated machine code from first principles: it decodes the instruction stream and interprets it. (Right now objtool supports the x86-64 architecture.) From tools/objtool/Documentation/stack-validation.txt: "The kernel CONFIG_STACK_VALIDATION option enables a host tool named objtool which runs at compile time. It has a "check" subcommand which analyzes every .o file and ensures the validity of its stack metadata. It enforces a set of rules on asm code and C inline assembly code so that stack traces can be reliable. Currently it only checks frame pointer usage, but there are plans to add CFI validation for C files and CFI generation for asm files. For each function, it recursively follows all possible code paths and validates the correct frame pointer state at each instruction. It also follows code paths involving special sections, like .altinstructions, __jump_table, and __ex_table, which can add alternative execution paths to a given instruction (or set of instructions). Similarly, it knows how to follow switch statements, for which gcc sometimes uses jump tables." When this new kernel option is enabled (it's disabled by default), the tool, if it finds any suspicious assembly code pattern, outputs warnings in compiler warning format: warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup warning: objtool:__schedule()+0x3c0: duplicate frame pointer save warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer ... so that scripts that pick up compiler warnings will notice them. All known warnings triggered by the tool are fixed by the tree, most of the commits in fact prepare the kernel to be warning-free. Most of them are bugfixes or cleanups that stand on their own, but there are also some annotations of 'special' stack frames for justified cases such entries to JIT-ed code (BPF) or really special boot time code. There are two other long-term motivations behind this tool as well: - To improve the quality and reliability of kernel stack frames, so that they can be used for optimized live patching. - To create independent infrastructure to check the correctness of CFI stack frames at build time. CFI debuginfo is notoriously unreliable and we cannot use it in the kernel as-is without extra checking done both on the kernel side and on the build side. The quality of kernel stack frames matters to debuggability as well, so IMO we can merge this without having to consider the live patching or CFI debuginfo angle" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) objtool: Only print one warning per function objtool: Add several performance improvements tools: Copy hashtable.h into tools directory objtool: Fix false positive warnings for functions with multiple switch statements objtool: Rename some variables and functions objtool: Remove superflous INIT_LIST_HEAD objtool: Add helper macros for traversing instructions objtool: Fix false positive warnings related to sibling calls objtool: Compile with debugging symbols objtool: Detect infinite recursion objtool: Prevent infinite recursion in noreturn detection objtool: Detect and warn if libelf is missing and don't break the build tools: Support relative directory path for 'O=' objtool: Support CROSS_COMPILE x86/asm/decoder: Use explicitly signed chars objtool: Enable stack metadata validation on 64-bit x86 objtool: Add CONFIG_STACK_VALIDATION option objtool: Add tool to perform compile-time stack metadata validation x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard sched: Always inline context_switch() ...	2016-03-20 18:23:21 -07:00
Linus Torvalds	f0691533b7	virtio/vhost: new features, performance improvements, cleanups This adds basic polling support for vhost. Reworks virtio to optionally use DMA API, fixing it on Xen. Balloon stats gained a new entry. Using the new napi_alloc_skb speeds up virtio net. virtio blk stats can now be read while another VCPU us busy inflating or deflating the balloon. Plus misc cleanups in various places. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJW7qRJAAoJECgfDbjSjVRpVNoH/A7z+lZ6nooSJ9fUBtAAlwit mE1VKi8g0G6naV1NVLFVe7hPAejExGiHfR3ZUrVoenJKj2yeW/DFojFC10YR/KTe ac7Imuc+owA3UOE/QpeGBs59+EEWKTZUYt6r8HSJVwoodeosw9v2ecP/Iwhbax8H a4V3HqOADjKnHg73R9o3u+bAgA1GrGYHeK0AfhCBSTNwlPdxkvf0463HgfOpM4nl /sNoFWO3vOyekk+loIk+jpmWVIoIfG2NFzW4lPwEPkfqUBX7r0ei/NR23hIqHL7r QZ6vMj1Ew9qctUONbJu4kXjuV2Vk9NhxwbDjoJtm8plKL2hz2prJynUEogkHh2g= =VMD0 -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio/vhost updates from Michael Tsirkin: "New features, performance improvements, cleanups: - basic polling support for vhost - rework virtio to optionally use DMA API, fixing it on Xen - balloon stats gained a new entry - using the new napi_alloc_skb speeds up virtio net - virtio blk stats can now be read while another VCPU is busy inflating or deflating the balloon plus misc cleanups in various places" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb() vhost_net: basic polling support vhost: introduce vhost_vq_avail_empty() vhost: introduce vhost_has_work() virtio_balloon: Allow to resize and update the balloon stats in parallel virtio_balloon: Use a workqueue instead of "vballoon" kthread virtio/s390: size of SET_IND payload virtio/s390: use dev_to_virtio vhost: rename vhost_init_used() vhost: rename cross-endian helpers virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH vring: Use the DMA API on Xen virtio_pci: Use the DMA API if enabled virtio_mmio: Use the DMA API if enabled virtio: Add improved queue allocation API virtio_ring: Support DMA APIs vring: Introduce vring_use_dma_api() s390/dma: Allow per device dma ops alpha/dma: use common noop dma ops dma: Provide simple noop dma ops	2016-03-20 13:28:18 -07:00
Linus Torvalds	1200b6809d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ...	2016-03-19 10:05:34 -07:00
Linus Torvalds	814a2bf957	Merge branch 'akpm' (patches from Andrew) Merge second patch-bomb from Andrew Morton: - a couple of hotfixes - the rest of MM - a new timer slack control in procfs - a couple of procfs fixes - a few misc things - some printk tweaks - lib/ updates, notably to radix-tree. - add my and Nick Piggin's old userspace radix-tree test harness to tools/testing/radix-tree/. Matthew said it was a godsend during the radix-tree work he did. - a few code-size improvements, switching to __always_inline where gcc screwed up. - partially implement character sets in sscanf * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits) sscanf: implement basic character sets lib/bug.c: use common WARN helper param: convert some "on"/"off" users to strtobool lib: add "on"/"off" support to kstrtobool lib: update single-char callers of strtobool() lib: move strtobool() to kstrtobool() include/linux/unaligned: force inlining of byteswap operations include/uapi/linux/byteorder, swab: force inlining of some byteswap operations include/asm-generic/atomic-long.h: force inlining of some atomic_long operations usb: common: convert to use match_string() helper ide: hpt366: convert to use match_string() helper ata: hpt366: convert to use match_string() helper power: ab8500: convert to use match_string() helper power: charger_manager: convert to use match_string() helper drm/edid: convert to use match_string() helper pinctrl: convert to use match_string() helper device property: convert to use match_string() helper lib/string: introduce match_string() helper radix-tree tests: add test for radix_tree_iter_next radix-tree tests: add regression3 test ...	2016-03-18 19:26:54 -07:00
Linus Torvalds	49dc2b7173	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: drivers/rtc: broken link fix drm/i915 Fix typos in i915_gem_fence.c Docs: fix missing word in REPORTING-BUGS lib+mm: fix few spelling mistakes MAINTAINERS: add git URL for APM driver treewide: Fix typo in printk	2016-03-17 21:38:27 -07:00
Linus Torvalds	588ab3f9af	arm64 updates for 4.6: - Initial page table creation reworked to avoid breaking large block mappings (huge pages) into smaller ones. The ARM architecture requires break-before-make in such cases to avoid TLB conflicts but that's not always possible on live page tables - Kernel virtual memory layout: the kernel image is no longer linked to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of the vmalloc space, allowing the kernel to be loaded (nearly) anywhere in physical RAM - Kernel ASLR: position independent kernel Image and modules being randomly mapped in the vmalloc space with the randomness is provided by UEFI (efi_get_random_bytes() patches merged via the arm64 tree, acked by Matt Fleming) - Implement relative exception tables for arm64, required by KASLR (initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but actual x86 conversion to deferred to 4.7 because of the merge dependencies) - Support for the User Access Override feature of ARMv8.2: this allows uaccess functions (get_user etc.) to be implemented using LDTR/STTR instructions. Such instructions, when run by the kernel, perform unprivileged accesses adding an extra level of protection. The set_fs() macro is used to "upgrade" such instruction to privileged accesses via the UAO bit - Half-precision floating point support (part of ARMv8.2) - Optimisations for CPUs with or without a hardware prefetcher (using run-time code patching) - copy_page performance improvement to deal with 128 bytes at a time - Sanity checks on the CPU capabilities (via CPUID) to prevent incompatible secondary CPUs from being brought up (e.g. weird big.LITTLE configurations) - valid_user_regs() reworked for better sanity check of the sigcontext information (restored pstate information) - ACPI parking protocol implementation - CONFIG_DEBUG_RODATA enabled by default - VDSO code marked as read-only - DEBUG_PAGEALLOC support - ARCH_HAS_UBSAN_SANITIZE_ALL enabled - Erratum workaround Cavium ThunderX SoC - set_pte_at() fix for PROT_NONE mappings - Code clean-ups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW6u95AAoJEGvWsS0AyF7xMyoP/3x2O6bgreSQ84BdO4JChN4+ RQ9OVdX8u2ItO9sgaCY2AA6KoiBuEjGmPl/XRuK0I7DpODTtRjEXQHuNNhz8AelC hn4AEVqamY6Z5BzHFIjs8G9ydEbq+OXcKWEdwSsBhP/cMvI7ss3dps1f5iNPT5Vv 50E/kUz+aWYy7pKlB18VDV7TUOA3SuYuGknWV8+bOY5uPb8hNT3Y3fHOg/EuNNN3 DIuYH1V7XQkXtF+oNVIGxzzJCXULBE7egMcWAm1ydSOHK0JwkZAiL7OhI7ceVD0x YlDxBnqmi4cgzfBzTxITAhn3OParwN6udQprdF1WGtFF6fuY2eRDSH/L/iZoE4DY OulL951OsBtF8YC3+RKLk908/0bA2Uw8ftjCOFJTYbSnZBj1gWK41VkCYMEXiHQk EaN8+2Iw206iYIoyvdjGCLw7Y0oakDoVD9vmv12SOaHeQljTkjoN8oIlfjjKTeP7 3AXj5v9BDMDVh40nkVayysRNvqe48Kwt9Wn0rhVTLxwdJEiFG/OIU6HLuTkretdN dcCNFSQrRieSFHpBK9G0vKIpIss1ZwLm8gjocVXH7VK4Mo/TNQe4p2/wAF29mq4r xu1UiXmtU3uWxiqZnt72LOYFCarQ0sFA5+pMEvF5W+NrVB0wGpXhcwm+pGsIi4IM LepccTgykiUBqW5TRzPz =/oS+ -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Here are the main arm64 updates for 4.6. There are some relatively intrusive changes to support KASLR, the reworking of the kernel virtual memory layout and initial page table creation. Summary: - Initial page table creation reworked to avoid breaking large block mappings (huge pages) into smaller ones. The ARM architecture requires break-before-make in such cases to avoid TLB conflicts but that's not always possible on live page tables - Kernel virtual memory layout: the kernel image is no longer linked to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of the vmalloc space, allowing the kernel to be loaded (nearly) anywhere in physical RAM - Kernel ASLR: position independent kernel Image and modules being randomly mapped in the vmalloc space with the randomness is provided by UEFI (efi_get_random_bytes() patches merged via the arm64 tree, acked by Matt Fleming) - Implement relative exception tables for arm64, required by KASLR (initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but actual x86 conversion to deferred to 4.7 because of the merge dependencies) - Support for the User Access Override feature of ARMv8.2: this allows uaccess functions (get_user etc.) to be implemented using LDTR/STTR instructions. Such instructions, when run by the kernel, perform unprivileged accesses adding an extra level of protection. The set_fs() macro is used to "upgrade" such instruction to privileged accesses via the UAO bit - Half-precision floating point support (part of ARMv8.2) - Optimisations for CPUs with or without a hardware prefetcher (using run-time code patching) - copy_page performance improvement to deal with 128 bytes at a time - Sanity checks on the CPU capabilities (via CPUID) to prevent incompatible secondary CPUs from being brought up (e.g. weird big.LITTLE configurations) - valid_user_regs() reworked for better sanity check of the sigcontext information (restored pstate information) - ACPI parking protocol implementation - CONFIG_DEBUG_RODATA enabled by default - VDSO code marked as read-only - DEBUG_PAGEALLOC support - ARCH_HAS_UBSAN_SANITIZE_ALL enabled - Erratum workaround Cavium ThunderX SoC - set_pte_at() fix for PROT_NONE mappings - Code clean-ups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (99 commits) arm64: kasan: Fix zero shadow mapping overriding kernel image shadow arm64: kasan: Use actual memory node when populating the kernel image shadow arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission arm64: Fix misspellings in comments. arm64: efi: add missing frame pointer assignment arm64: make mrs_s prefixing implicit in read_cpuid arm64: enable CONFIG_DEBUG_RODATA by default arm64: Rework valid_user_regs arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly arm64: KVM: Move kvm_call_hyp back to its original localtion arm64: mm: treat memstart_addr as a signed quantity arm64: mm: list kernel sections in order arm64: lse: deal with clobbered IP registers after branch via PLT arm64: mm: dump: Use VA_START directly instead of private LOWEST_ADDR arm64: kconfig: add submenu for 8.2 architectural features arm64: kernel: acpi: fix ioremap in ACPI parking protocol cpu_postboot arm64: Add support for Half precision floating point arm64: Remove fixmap include fragility arm64: Add workaround for Cavium erratum 27456 arm64: mm: Mark .rodata as RO ...	2016-03-17 20:03:47 -07:00
Jessica Yu	f9310b2f9a	sscanf: implement basic character sets Implement basic character sets for the '%[' conversion specifier. The '%[' conversion specifier matches a nonempty sequence of characters from the specified set of accepted (or with '^', rejected) characters between the brackets. The substring matched is to be made up of characters in (or not in) the set. This is useful for matching substrings that are delimited by something other than spaces. This implementation differs from its glibc counterpart in the following ways: (1) No support for character ranges (e.g., 'a-z' or '0-9') (2) The hyphen '-' is not a special character (3) The closing bracket ']' cannot be matched (4) No support (yet) for discarding matching input ('%*[') The bitmap code is largely based upon sample code which was provided by Rasmus. The motivation for adding character set support to sscanf originally stemmed from the kernel livepatching project. An ongoing patchset utilizes new livepatch Elf symbol and section names to store important metadata livepatch needs to properly apply its patches. Such metadata is stored in these section and symbol names as substrings delimited by periods '.' and commas ','. For example, a livepatch symbol name might look like this: .klp.sym.vmlinux.printk,0 However, sscanf currently can only extract "substrings" delimited by whitespace using the "%s" specifier. Thus for the above symbol name, one cannot not use sscanf() to extract substrings "vmlinux" or "printk", for example. A number of discussions on the livepatch mailing list dealing with string parsing code for extracting these '.' and ',' delimited substrings eventually led to the conclusion that such code would be completely unnecessary if the kernel sscanf() supported character sets. Thus only a single sscanf() call would be necessary to extract these substrings. In addition, such an addition to sscanf() could benefit other areas of the kernel that might have a similar need in the future. [akpm@linux-foundation.org: 80-col tweaks] Signed-off-by: Jessica Yu <jeyu@redhat.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Josh Poimboeuf	2553b67a1f	lib/bug.c: use common WARN helper The traceoff_on_warning option doesn't have any effect on s390, powerpc, arm64, parisc, and sh because there are two different types of WARN implementations: 1) The above mentioned architectures treat WARN() as a special case of a BUG() exception. They handle warnings in report_bug() in lib/bug.c. 2) All other architectures just call warn_slowpath_*() directly. Their warnings are handled in warn_slowpath_common() in kernel/panic.c. Support traceoff_on_warning on all architectures and prevent any future divergence by using a single common function to emit the warning. Also remove the '()' from '%pS()', because the parentheses look funky: [ 45.607629] WARNING: at /root/warn_mod/warn_mod.c:17 .init_dummy+0x20/0x40 [warn_mod]() Reported-by: Chunyu Hu <chuhu@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Kees Cook	a81a5a17d4	lib: add "on"/"off" support to kstrtobool Add support for "on" and "off" when converting to boolean. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Amitkumar Karwar <akarwar@marvell.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Joe Perches <joe@perches.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nishant Sarmukadam <nishants@marvell.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Kees Cook	ef95159907	lib: move strtobool() to kstrtobool() Create the kstrtobool_from_user() helper and move strtobool() logic into the new kstrtobool() (matching all the other kstrto* functions). Provides an inline wrapper for existing strtobool() callers. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Amitkumar Karwar <akarwar@marvell.com> Cc: Nishant Sarmukadam <nishants@marvell.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Steve French <sfrench@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Andy Shevchenko	56b060814e	lib/string: introduce match_string() helper Occasionally we have to search for an occurrence of a string in an array of strings. Make a simple helper for that purpose. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: David Airlie <airlied@linux.ie> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Matthew Wilcox	7cf19af4de	radix_tree: add radix_tree_dump This is debug code which is #if 0 out. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Matthew Wilcox	e614523653	radix_tree: add support for multi-order entries With huge pages, it is convenient to have the radix tree be able to return an entry that covers multiple indices. Previous attempts to deal with the problem have involved inserting N duplicate entries, which is a waste of memory and leads to problems trying to handle aliased tags, or probing the tree multiple times to find alternative entries which might cover the requested index. This approach inserts one canonical entry into the tree for a given range of indices, and may also insert other entries in order to ensure that lookups find the canonical entry. This solution only tolerates inserting powers of two that are greater than the fanout of the tree. If we wish to expand the radix tree's abilities to support large-ish pages that is less than the fanout at the penultimate level of the tree, then we would need to add one more step in lookup to ensure that any sibling nodes in the final level of the tree are dereferenced and we return the canonical entry that they reference. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Matthew Wilcox	0070e28d97	radix_tree: loop based on shift count, not height When we introduce entries that can cover multiple indices, we will need to stop in __radix_tree_create based on the shift, not the height. Split out for ease of bisect. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Matthew Wilcox	339e635304	radix_tree: tag all internal tree nodes as indirect pointers Set the 'indirect_ptr' bit on all the pointers to internal nodes, not just on the root node. This enables the following patches to support multi-order entries in the radix tree. This patch is split out for ease of bisection. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Heiko Carstens	d7b85cab74	lib/bug.c: make panic_on_warn available for all architectures Christian Borntraeger reported that panic_on_warn doesn't have any effect on s390. The panic_on_warn feature was introduced with `9e3961a097` ("kernel: add panic_on_warn"). However it did care only for the case when WANT_WARN_ON_SLOWPATH is defined. This is turn is only the case for architectures which do not have an own __WARN_TAINT defined. Other architectures which do have __WARN_TAINT defined call report_bug() for warnings within lib/bug.c which does not call panic() in case panic_on_warn is set. Let's simply enable the panic_on_warn feature by adding the same code like it was added to warn_slowpath_common() in panic.c. This enables panic_on_warn also for arm64, parisc, powerpc, s390 and sh. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Prarit Bhargava <prarit@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Vladimir Davydov	58e698af4c	radix-tree: account radix_tree_node to memory cgroup Allocation of radix_tree_node objects can be easily triggered from userspace, so we should account them to memory cgroup. Besides, we need them accounted for making shadow node shrinker per memcg (see mm/workingset.c). A tricky thing about accounting radix_tree_node objects is that they are mostly allocated through radix_tree_preload(), so we can't just set SLAB_ACCOUNT for radix_tree_node_cachep - that would likely result in a lot of unrelated cgroups using objects from each other's caches. One way to overcome this would be making radix tree preloads per memcg, but that would probably look cumbersome and overcomplicated. Instead, we make radix_tree_node_alloc() first try to allocate from the cache with __GFP_ACCOUNT, no matter if the caller has preloaded or not, and only if it fails fall back on using per cpu preloads. This should make most allocations accounted. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Linus Torvalds	8eee93e257	Char/Misc patches for 4.6-rc1 Here is the big char/misc driver update for 4.6-rc1. The majority of the patches here is hwtracing and some new mic drivers, but there's a lot of other driver updates as well. Full details in the shortlog. All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlbp9IcACgkQMUfUDdst+ykyJgCeLTC2QNGrh51kiJglkVJ0yD36 q4MAn0NkvSX2+iv5Jq8MaX6UQoRa4Nun =MNjR -----END PGP SIGNATURE----- Merge tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc updates from Greg KH: "Here is the big char/misc driver update for 4.6-rc1. The majority of the patches here is hwtracing and some new mic drivers, but there's a lot of other driver updates as well. Full details in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits) goldfish: Fix build error of missing ioremap on UM nvmem: mediatek: Fix later provider initialization nvmem: imx-ocotp: Fix return value of imx_ocotp_read nvmem: Fix dependencies for !HAS_IOMEM archs char: genrtc: replace blacklist with whitelist drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular drivers: char: mem: fix IS_ERROR_VALUE usage char: xillybus: Fix internal data structure initialization pch_phub: return -ENODATA if ROM can't be mapped Drivers: hv: vmbus: Support kexec on ws2012 r2 and above Drivers: hv: vmbus: Support handling messages on multiple CPUs Drivers: hv: utils: Remove util transport handler from list if registration fails Drivers: hv: util: Pass the channel information during the init call Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload() Drivers: hv: vmbus: remove code duplication in message handling Drivers: hv: vmbus: avoid wait_for_completion() on crash Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages misc: at24: replace memory_accessor with nvmem_device_read eeprom: 93xx46: extend driver to plug into the NVMEM framework eeprom: at25: extend driver to plug into the NVMEM framework ...	2016-03-17 13:47:50 -07:00
Linus Torvalds	1a4ab084af	Driver core patches for 4.6-rc1 Just a few patches this time around for the 4.6-rc1 merge window. Largest is a new firmware driver, but there are some other updates to the driver core in here as well, the shortlog has the details. All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlbp9QoACgkQMUfUDdst+ynsOQCghpfAf3CJDr4PWGCKzDJzyQG9 rZYAn2VwKsqHzAxgLXZY5fQIjxSyaLek =Mcvl -----END PGP SIGNATURE----- Merge tag 'driver-core-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Just a few patches this time around for the 4.6-rc1 merge window. Largest is a new firmware driver, but there are some other updates to the driver core in here as well, the shortlog has the details. All have been in linux-next for a while with no reported issues" * tag 'driver-core-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Revert "driver-core: platform: probe of-devices only using list of compatibles" firmware: qemu config needs I/O ports firmware: qemu_fw_cfg.c: fix typo FW_CFG_DATA_OFF driver-core: platform: probe of-devices only using list of compatibles driver-core: platform: fix typo in documentation for multi-driver helper component: remove impossible condition drivers: dma-coherent: simplify dma_init_coherent_memory return value devicetree: update documentation for fw_cfg ARM bindings firmware: create directory hierarchy for sysfs fw_cfg entries firmware: introduce sysfs driver for QEMU's fw_cfg device kobject: export kset_find_obj() for module use driver core: bus: use to_subsys_private and to_device_private_bus driver core: bus: use list_for_each_entry* debugfs: Add stub function for debugfs_create_automount(). kernfs: make kernfs_walk_ns() use kernfs_pr_cont_buf[]	2016-03-17 13:38:00 -07:00
Linus Torvalds	70477371dc	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "Here is the crypto update for 4.6: API: - Convert remaining crypto_hash users to shash or ahash, also convert blkcipher/ablkcipher users to skcipher. - Remove crypto_hash interface. - Remove crypto_pcomp interface. - Add crypto engine for async cipher drivers. - Add akcipher documentation. - Add skcipher documentation. Algorithms: - Rename crypto/crc32 to avoid name clash with lib/crc32. - Fix bug in keywrap where we zero the wrong pointer. Drivers: - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver. - Add PIC32 hwrng driver. - Support BCM6368 in bcm63xx hwrng driver. - Pack structs for 32-bit compat users in qat. - Use crypto engine in omap-aes. - Add support for sama5d2x SoCs in atmel-sha. - Make atmel-sha available again. - Make sahara hashing available again. - Make ccp hashing available again. - Make sha1-mb available again. - Add support for multiple devices in ccp. - Improve DMA performance in caam. - Add hashing support to rockchip" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) crypto: qat - remove redundant arbiter configuration crypto: ux500 - fix checks of error code returned by devm_ioremap_resource() crypto: atmel - fix checks of error code returned by devm_ioremap_resource() crypto: qat - Change the definition of icp_qat_uof_regtype hwrng: exynos - use __maybe_unused to hide pm functions crypto: ccp - Add abstraction for device-specific calls crypto: ccp - CCP versioning support crypto: ccp - Support for multiple CCPs crypto: ccp - Remove check for x86 family and model crypto: ccp - memset request context to zero during import lib/mpi: use "static inline" instead of "extern inline" lib/mpi: avoid assembler warning hwrng: bcm63xx - fix non device tree compatibility crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode. crypto: qat - The AE id should be less than the maximal AE number lib/mpi: Endianness fix crypto: rockchip - add hash support for crypto engine in rk3288 crypto: xts - fix compile errors crypto: doc - add skcipher API documentation crypto: doc - update AEAD AD handling ...	2016-03-17 11:22:54 -07:00
Linus Torvalds	271ecc5253	Merge branch 'akpm' (patches from Andrew) Merge first patch-bomb from Andrew Morton: - some misc things - ofs2 updates - about half of MM - checkpatch updates - autofs4 update * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits) autofs4: fix string.h include in auto_dev-ioctl.h autofs4: use pr_xxx() macros directly for logging autofs4: change log print macros to not insert newline autofs4: make autofs log prints consistent autofs4: fix some white space errors autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked() autofs4: fix coding style line length in autofs4_wait() autofs4: fix coding style problem in autofs4_get_set_timeout() autofs4: coding style fixes autofs: show pipe inode in mount options kallsyms: add support for relative offsets in kallsyms address table kallsyms: don't overload absolute symbol type for percpu symbols x86: kallsyms: disable absolute percpu symbols on !SMP checkpatch: fix another left brace warning checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses checkpatch: warn on bare unsigned or signed declarations without int checkpatch: exclude asm volatile from complex macro check mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate() mm: migrate: consolidate mem_cgroup_migrate() calls mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous ...	2016-03-16 11:51:08 -07:00
Vlastimil Babka	edf14cdbf9	mm, printk: introduce new format string for flags In mm we use several kinds of flags bitfields that are sometimes printed for debugging purposes, or exported to userspace via sysfs. To make them easier to interpret independently on kernel version and config, we want to dump also the symbolic flag names. So far this has been done with repeated calls to pr_cont(), which is unreliable on SMP, and not usable for e.g. sysfs export. To get a more reliable and universal solution, this patch extends printk() format string for pointers to handle the page flags (%pGp), gfp_flags (%pGg) and vma flags (%pGv). Existing users of dump_flag_names() are converted and simplified. It would be possible to pass flags by value instead of pointer, but the %p format string for pointers already has extensions for various kernel structures, so it's a good fit, and the extra indirection in a non-critical path is negligible. [linux@rasmusvillemoes.dk: lots of good implementation suggestions] Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-15 16:55:16 -07:00
Linus Torvalds	710d60cbf1	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug updates from Thomas Gleixner: "This is the first part of the ongoing cpu hotplug rework: - Initial implementation of the state machine - Runs all online and prepare down callbacks on the plugged cpu and not on some random processor - Replaces busy loop waiting with completions - Adds tracepoints so the states can be followed" More detailed commentary on this work from an earlier email: "What's wrong with the current cpu hotplug infrastructure? - Asymmetry The hotplug notifier mechanism is asymmetric versus the bringup and teardown. This is mostly caused by the notifier mechanism. - Largely undocumented dependencies While some notifiers use explicitely defined notifier priorities, we have quite some notifiers which use numerical priorities to express dependencies without any documentation why. - Control processor driven Most of the bringup/teardown of a cpu is driven by a control processor. While it is understandable, that preperatory steps, like idle thread creation, memory allocation for and initialization of essential facilities needs to be done before a cpu can boot, there is no reason why everything else must run on a control processor. Before this patch series, bringup looks like this: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring the rest up - All or nothing approach There is no way to do partial bringups. That's something which is really desired because we waste e.g. at boot substantial amount of time just busy waiting that the cpu comes to life. That's stupid as we could very well do preparatory steps and the initial IPI for other cpus and then go back and do the necessary low level synchronization with the freshly booted cpu. - Minimal debuggability Due to the notifier based design, it's impossible to switch between two stages of the bringup/teardown back and forth in order to test the correctness. So in many hotplug notifiers the cancel mechanisms are either not existant or completely untested. - Notifier [un]registering is tedious To [un]register notifiers we need to protect against hotplug at every callsite. There is no mechanism that bringup/teardown callbacks are issued on the online cpus, so every caller needs to do it itself. That also includes error rollback. What's the new design? The base of the new design is a symmetric state machine, where both the control processor and the booting/dying cpu execute a well defined set of states. Each state is symmetric in the end, except for some well defined exceptions, and the bringup/teardown can be stopped and reversed at almost all states. So the bringup of a cpu will look like this in the future: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring itself up The synchronization step does not require the control cpu to wait. That mechanism can be done asynchronously via a worker or some other mechanism. The teardown can be made very similar, so that the dying cpu cleans up and brings itself down. Cleanups which need to be done after the cpu is gone, can be scheduled asynchronously as well. There is a long way to this, as we need to refactor the notion when a cpu is available. Today we set the cpu online right after it comes out of the low level bringup, which is not really correct. The proper mechanism is to set it to available, i.e. cpu local threads, like softirqd, hotplug thread etc. can be scheduled on that cpu, and once it finished all booting steps, it's set to online, so general workloads can be scheduled on it. The reverse happens on teardown. First thing to do is to forbid scheduling of general workloads, then teardown all the per cpu resources and finally shut it off completely. This patch series implements the basic infrastructure for this at the core level. This includes the following: - Basic state machine implementation with well defined states, so ordering and prioritization can be expressed. - Interfaces to [un]register state callbacks This invokes the bringup/teardown callback on all online cpus with the proper protection in place and [un]installs the callbacks in the state machine array. For callbacks which have no particular ordering requirement we have a dynamic state space, so that drivers don't have to register an explicit hotplug state. If a callback fails, the code automatically does a rollback to the previous state. - Sysfs interface to drive the state machine to a particular step. This is only partially functional today. Full functionality and therefor testability will be achieved once we converted all existing hotplug notifiers over to the new scheme. - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying processor: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu wait for boot bring itself up Signal completion to control cpu In a previous step of this work we've done a full tree mechanical conversion of all hotplug notifiers to the new scheme. The balance is a net removal of about 4000 lines of code. This is not included in this series, as we decided to take a different approach. Instead of mechanically converting everything over, we will do a proper overhaul of the usage sites one by one so they nicely fit into the symmetric callback scheme. I decided to do that after I looked at the ugliness of some of the converted sites and figured out that their hotplug mechanism is completely buggered anyway. So there is no point to do a mechanical conversion first as we need to go through the usage sites one by one again in order to achieve a full symmetric and testable behaviour" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) cpu/hotplug: Document states better cpu/hotplug: Fix smpboot thread ordering cpu/hotplug: Remove redundant state check cpu/hotplug: Plug death reporting race rcu: Make CPU_DYING_IDLE an explicit call cpu/hotplug: Make wait for dead cpu completion based cpu/hotplug: Let upcoming cpu bring itself fully up arch/hotplug: Call into idle with a proper state cpu/hotplug: Move online calls to hotplugged cpu cpu/hotplug: Create hotplug threads cpu/hotplug: Split out the state walk into functions cpu/hotplug: Unpark smpboot threads from the state machine cpu/hotplug: Move scheduler cpu_online notifier to hotplug core cpu/hotplug: Implement setup/removal interface cpu/hotplug: Make target state writeable cpu/hotplug: Add sysfs state interface cpu/hotplug: Hand in target state to _cpu_up/down cpu/hotplug: Convert the hotplugged cpu work to a state machine cpu/hotplug: Convert to a state machine for the control processor cpu/hotplug: Add tracepoints ...	2016-03-15 13:50:29 -07:00
Linus Torvalds	ba33ea811e	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "This is another big update. Main changes are: - lots of x86 system call (and other traps/exceptions) entry code enhancements. In particular the complex parts of the 64-bit entry code have been migrated to C code as well, and a number of dusty corners have been refreshed. (Andy Lutomirski) - vDSO special mapping robustification and general cleanups (Andy Lutomirski) - cpufeature refactoring, cleanups and speedups (Borislav Petkov) - lots of other changes ..." * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/cpufeature: Enable new AVX-512 features x86/entry/traps: Show unhandled signal for i386 in do_trap() x86/entry: Call enter_from_user_mode() with IRQs off x86/entry/32: Change INT80 to be an interrupt gate x86/entry: Improve system call entry comments x86/entry: Remove TIF_SINGLESTEP entry work x86/entry/32: Add and check a stack canary for the SYSENTER stack x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed x86/entry: Vastly simplify SYSENTER TF (single-step) handling x86/entry/traps: Clear DR6 early in do_debug() and improve the comment x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions x86/entry/32: Restore FLAGS on SYSEXIT x86/entry/32: Filter NT and speed up AC filtering in SYSENTER x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test selftests/x86: In syscall_nt, test NT\|TF as well x86/asm-offsets: Remove PARAVIRT_enabled x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled uprobes: __create_xol_area() must nullify xol_mapping.fault x86/cpufeature: Create a new synthetic cpu capability for machine check recovery ...	2016-03-15 09:32:27 -07:00
Linus Torvalds	e71c2c1eeb	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Main kernel side changes: - Big reorganization of the x86 perf support code. The old code grew organically deep inside arch/x86/kernel/cpu/perf* and its naming became somewhat messy. The new location is under arch/x86/events/, using the following cleaner hierarchy of source code files: perf/x86: Move perf_event.c .................. => x86/events/core.c perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch] perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch] perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch] perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c perf/x86: Move perf_event_intel_uncore_snb.c => x86/events/intel/uncore_snb.c perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c (Borislav Petkov) - Update various x86 PMU constraint and hw support details (Stephane Eranian) - Optimize kprobes for BPF execution (Martin KaFai Lau) - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas Gleixner) - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner) - Various fixes and smaller cleanups. There are lots of perf tooling updates as well. A few highlights: perf report/top: - Hierarchy histogram mode for 'perf top' and 'perf report', showing multiple levels, one per --sort entry: (Namhyung Kim) On a mostly idle system: # perf top --hierarchy -s comm,dso Then expand some levels and use 'P' to take a snapshot: # cat perf.hist.0 - 92.32% perf 58.20% perf 22.29% libc-2.22.so 5.97% [kernel] 4.18% libelf-0.165.so 1.69% [unknown] - 4.71% qemu-system-x86 3.10% [kernel] 1.60% qemu-system-x86_64 (deleted) + 2.97% swapper # - Add 'L' hotkey to dynamicly set the percent threshold for histogram entries and callchains, i.e. dynamicly do what the --percent-limit command line option to 'top' and 'report' does. (Namhyung Kim) perf mem: - Allow specifying events via -e in 'perf mem record', also listing what events can be specified via 'perf mem record -e list' (Jiri Olsa) perf record: - Add 'perf record' --all-user/--all-kernel options, so that one can tell that all the events in the command line should be restricted to the user or kernel levels (Jiri Olsa), i.e.: perf record -e cycles:u,instructions:u is equivalent to: perf record --all-user -e cycles,instructions - Make 'perf record' collect CPU cache info in the perf.data file header: $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] $ perf report --header-only -I \| tail -10 \| head -8 # CPU cache info: # L1 Data 32K [0-1] # L1 Instruction 32K [0-1] # L1 Data 32K [2-3] # L1 Instruction 32K [2-3] # L2 Unified 256K [0-1] # L2 Unified 256K [2-3] # L3 Unified 4096K [0-3] Will be used in 'perf c2c' and eventually in 'perf diff' to allow, for instance running the same workload in multiple machines and then when using 'diff' show the hardware difference. (Jiri Olsa) - Improved support for Java, using the JVMTI agent library to do jitdumps that then will be inserted in synthesized PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized ELF files stored in ~/.debug and keyed with build-ids, to allow symbol resolution and even annotation with source line info, see the changeset comments to see how to use it (Stephane Eranian) perf script/trace: - Decode data_src values (e.g. perf.data files generated by 'perf mem record') in 'perf script': (Jiri Olsa) # perf script perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit\|SNP None\|TLB L1 or L2 hit\|LCK No <SNIP> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - Improve support to 'data_src', 'weight' and 'addr' fields in 'perf script' (Jiri Olsa) - Handle empty print fmts in 'perf script -s' i.e. when running python or perl scripts (Taeung Song) perf stat: - 'perf stat' now shows shadow metrics (insn per cycle, etc) in interval mode too. E.g: # perf stat -I 1000 -e instructions,cycles sleep 1 # time counts unit events 1.000215928 519,620 instructions # 0.69 insn per cycle 1.000215928 752,003 cycles <SNIP> - Port 'perf kvm stat' to PowerPC (Hemant Kumar) - Implement CSV metrics output in 'perf stat' (Andi Kleen) perf BPF support: - Support converting data from bpf events in 'perf data' (Wang Nan) - Print bpf-output events in 'perf script': (Wang Nan). # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000 # perf script usleep 4882 21384.532523: evt: ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms]) BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" # - Add API to set values of map entries in a BPF object, be it individual map slots or ranges (Wang Nan) - Introduce support for the 'bpf-output' event (Wang Nan) - Add glue to read perf events in a BPF program (Wang Nan) - Improve support for bpf-output events in 'perf trace' (Wang Nan) ... and tons of other changes as well - see the shortlog and git log for details!" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits) perf stat: Add --metric-only support for -A perf stat: Implement --metric-only mode perf stat: Document CSV format in manpage perf hists browser: Check sort keys before hot key actions perf hists browser: Allow thread filtering for comm sort key perf tools: Add sort__has_comm variable perf tools: Recalc total periods using top-level entries in hierarchy perf tools: Remove nr_sort_keys field perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry() perf tools: Remove hist_entry->fmt field perf tools: Fix command line filters in hierarchy mode perf tools: Add more sort entry check functions perf tools: Fix hist_entry__filter() for hierarchy perf jitdump: Build only on supported archs tools lib traceevent: Add '~' operation within arg_num_eval() perf tools: Omit unnecessary cast in perf_pmu__parse_scale perf tools: Pass perf_hpp_list all the way through setup_sort_list perf tools: Fix perf script python database export crash perf jitdump: DWARF is also needed perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes ...	2016-03-14 17:58:53 -07:00
Alexander Duyck	01cfbad79a	ipv4: Update parameters for csum_tcpudp_magic to their original types This patch updates all instances of csum_tcpudp_magic and csum_tcpudp_nofold to reflect the types that are usually used as the source inputs. For example the protocol field is populated based on nexthdr which is actually an unsigned 8 bit value. The length is usually populated based on skb->len which is an unsigned integer. This addresses an issue in which the IPv6 function csum_ipv6_magic was generating a checksum using the full 32b of skb->len while csum_tcpudp_magic was only using the lower 16 bits. As a result we could run into issues when attempting to adjust the checksum as there was no protocol agnostic way to update it. With this change the value is still truncated as many architectures use "(len + proto) << 8", however this truncation only occurs for values greater than 16776960 in length and as such is unlikely to occur as we stop the inner headers at ~64K in size. I did have to make a few minor changes in the arm, mn10300, nios2, and score versions of the function in order to support these changes as they were either using things such as an OR to combine the protocol and length, or were using ntohs to convert the length which would have truncated the value. I also updated a few spots in terms of whitespace and type differences for the addresses. Most of this was just to make sure all of the definitions were in sync going forward. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-13 23:55:13 -04:00
Ingo Molnar	6cbe9e4a22	Merge branch 'linus' into locking/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-10 10:28:27 +01:00

1 2 3 4 5 ...

3825 Коммитов