Previous Commit ("ARC: SMP: No need for CONFIG_ARC_IPI_DBG") removed
the Kconfig option ARC_IPI_DBG. Remove the last reference on this
option.
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARConnect/MCIP IPI sending has a retry-wait loop in case caller had
not seen a previous such interrupt. Turns out that it is not needed at
all. Linux cross core calling allows coalescing multiple IPIs to same
receiver - it is fine as long as there is one.
This logic is built into upper layer already, at a higher level of
abstraction. ipi_send_msg_one() sets the actual msg payload, but it only
calls MCIP IPI sending if msg holder was empty (using
atomic-set-new-and-get-old construct). Thus it is unlikely that the
retry-wait looping was ever getting exercised at all.
Cc: Chuck Jordan <cjordan@synopsys.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
There is no real ARC700 based SMP SoC so remove IPI definition.
EZChip's SMP ARC700 is going to use a different intc and IPI provider
anyways.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to
local core. So use core intc capability to trigger software
interrupt to self, using an unsued IRQ #21.
This showed up as csd deadlock with LTP trace_sched on a dual core
system. This test acts as scheduler fuzzer, triggering all sorts of
schedulting activity. Trouble starts with IPI to self, which doesn't get
delivered (effectively lost due to H/w capability), but the msg intended
to be sent remain enqueued in per-cpu @ipi_data.
All subsequent IPIs to this core from other cores get elided due to the
IPI coalescing optimization in ipi_send_msg_one() where a pending msg
implies an IPI already sent and assumes other core is yet to ack it.
After the elided IPI, other core simply goes into csd_lock_wait()
but never comes out as this core never sees the interrupt.
Fixes STAR 9001008624
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org> [4.2]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Even though DEVTMPFS is required when our pre-built initramfs
is used it is not the case in general. It is perfectly possible
to use initramfs with device nodes already populated or there
could be other usages, see discussion below for more detials:
http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/37819/focus=37821
This change removes mentioned dependency from arch/arc/Kconfig
updating instead those defconfigs that are usually used with this
kind of pre-build initramfs.
And while at it all touched defconfigs were regenerated via
savedefconfig and some options were removed:
* USB is selected by other options implicitly
* VGA_CONSOLE is disableb for ARC since
031e29b587
* EXT3_FS automatically selects EXT4_FS
* MTDxxx and JFFS2_FS make no sense for AXS because
AXS NAND controller is not upstreamed
* NET_OSCI_LAN is not in upstream as well
* ARCPGU_xxx options make no sense because ARC PGU is not yet
in upstream and when it gets there all config options would
be taken from devicetree
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- ARCv2 uses a seperate BCR for {I,D}CCM base address:
ARCompact encoded both base/size in same BCR
- Size encoding in common BCR is different for ARCompact/ARCv2
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It is unlikely that designs running Linux will not have multiplier.
Further the current support is not complete as tool don't generate a
multilib w/o multiplier.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- Corner case of returning to delay slot from interrupt
- Changing default interrupt prioiry level
- Kconfig'ize support for super pages
- Other minor fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWvxdUAAoJEGnX8d3iisJeRkYP/2HZAt4J6c5MPk/NSy8rabVX
2bB1m5jYXlBmJAIsmWm+WcDL72MdrB1Owtc5tEN+hIoQQa2QQpxolp32IslHg0o8
C9CCzmF+iR8wz3caVk3javpsbze23XbHho/kdx/l2Ed3Fi+syI/9jF1GiboydRtR
X22an1lslA6Y44pYxFmSFcMCv7XclFkJNe1ltxsgN9/QapnNrE/HWqUIy+SMr2Oo
Tpo3m/Dc+IfMMejYyupc3keyAhyeux69lJXPuOzYiurgGUIyXz15Un2mQ9gZWf0u
W56L/55VpQVuah46qrp5CBTLmdJA5cBqr0F8RqmZAqrEYLgn5SD4IhDjamo1qsP/
FfFh0cG955SoEyCsUOPILWUFR5TeS4rJK+ZJjErUb+dwEC1BWZR0/Dn1s9KJN8b7
GgGV8yXruDACFlFnCqnlxVs1TKOPOUqD2NZRAdsKunp+ywNrvGdD43xWONcriyvr
2KW0nb+mH3RRk8HQzKjfqsVhLMoR7n1MD/+tg8ME8usLn1ik0hBerT56CX0Wh/yQ
VnOUX6xqlaRydeJJgCUyByz3+jJVvj8sk/VZbr19F0p9id6wpiPQeNus2AcoHFKW
OyvWcfxzqKegXrYtMsy8IoFzx73zJaXV3ht0I09rhAj3JkdF7vFEIUpKIhsWqxAK
yWKKqLcVKga/2Yc8jduI
=FNDd
-----END PGP SIGNATURE-----
Merge tag 'arc-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"I've been sitting on some of these fixes for a while.
- Corner case of returning to delay slot from interrupt
- Changing default interrupt prioiry level
- Kconfig'ize support for super pages
- Other minor fixes"
* tag 'arc-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: mm: Introduce explicit super page size support
ARCv2: intc: Allow interruption by lowest priority interrupt
ARCv2: Check for LL-SC livelock only if LLSC is enabled
ARC: shrink cpuinfo by not saving full timer BCR
ARCv2: clocksource: Rename GRTC -> GFRC ...
ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2
MMUv4 supports 2 concurrent page sizes: Normal and Super [4K to 16M]
So far Linux supported a single super page size for a given Normal page,
depending on the software page walking address split.
e.g. we had 11:8:13 address split for 8K page, which meant super page
was 2 ^(8+13) = 2M (given that THP size has to be PMD_SHIFT)
Now we turn this around, by allowing multiple Super Pages in Kconfig
(currently 2M and 16M only) and forcing page walker address split to
PGDIR_SHIFT and PAGE_SHIFT
For configs without Super page, things are same as before and
PGDIR_SHIFT can be hacked to get non default address split
The motivation for this change is a customer who needs 16M super page
and a 8K Normal page combo.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC HS Cores support configurable multiple interrupt priorities of upto
16 levels.
There is processor "interrupt preemption threshhold" in STATUS32.E[4:1]
And several places need to set this up:
1. seed value as kernel is booting
2. seed value for user space programs
3. Arg to SLEEP instruction in idle task (what interrupt prio can wake)
4. Per-IRQ line prioirty (i.e. what is the priority of interrupt
raised by a peripheral or timer or perf counter...
Currently above sites use the highest priority 0. This can be potential
problem when multiple priorities are supported. e.g. user space could
only be interrupted by P0 interrupt, not others...
So turn this over and instead make default interruption level to be
the lowest priority possible 15. This should be fine even if there are
fewer priority levels configured (say two: P0 HIGH, P1 LOW)
This feature also effectively disables FIRQ feature if present in
hardware config. With old code, a P0 interrupt would be FIRQ, needing
special handling (ISR or Register Banks) which is NOT supported yet.
Now it not be P0 (P15 or whatever is lowest prio) so FIRQ is not
triggered.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Returning to delay slot, riding an interrupti, had one loose end.
AUX_USER_SP used for restoring user mode SP upon RTIE was not being
setup from orig task's saved value, causing task to use wrong SP,
leading to ProtV errors.
The reason being:
- INTERRUPT_EPILOGUE returns to a kernel trampoline, thus not expected to restore it
- EXCEPTION_EPILOGUE is not used at all
Fix that by restoring AUX_USER_SP explicitly in the trampoline.
This was broken in the original workaround, but the error scenarios got
reduced considerably since v3.14 due to following:
1. The Linuxthreads.old based userspace at the time caused many more
exceptions in delay slot than the current NPTL based one.
Infact with current userspace the error doesn't happen at all.
2. Return from interrupt (delay slot or otherwise) doesn't get exercised much
after commit 4de0e52867 ("Really Re-enable interrupts to avoid deadlocks")
since IRQ_ACTIVE.active being clear means most returns are as if from pure
kernel (even for active interrupts)
Infact the issue only happened in an experimental branch where I was tinkering with
reverted 4de0e52867
Cc: stable@kernel.org # v4.2+
Fixes: 4255b07f2c ("ARCv2: STAR 9000793984: Handle return from intr to Delay Slot")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.
[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As illustrated by commit a3afe70b83 ("[S390] latencytop s390
support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to
advertise an implementation of save_stack_trace_tsk.
However, as of 9212ddb5ea ("stacktrace: provide save_stack_trace_tsk()
weak alias") a dummy implementation is provided if STACKTRACE=y. Given
that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects
STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Helge Deller <deller@gmx.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let's define page_mapped() to be true for compound pages if any
sub-pages of the compound page is mapped (with PMD or PTE).
On other hand page_mapcount() return mapcount for this particular small
page.
This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.
Most users outside core-mm should use page_mapcount() instead of
page_mapped().
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull livepatching updates from Jiri Kosina:
- RO/NX attribute fixes for patch module relocations from Josh
Poimboeuf. As part of this effort, module.c has been cleaned up as
well and livepatching is piggy-backing on this cleanup. Rusty is OK
with this whole lot going through livepatching tree.
- symbol disambiguation support from Chris J Arges. That series is
also
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
but this came in only after I've alredy pushed out. Didn't want to
rebase because of that, hence I am mentioning it here.
- symbol lookup fix from Miroslav Benes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Cleanup module page permission changes
module: keep percpu symbols in module's symtab
module: clean up RO/NX handling.
module: use a structure to encapsulate layout.
gcov: use within_module() helper.
module: Use the same logic for setting and unsetting RO/NX
livepatch: function,sympos scheme in livepatch sysfs directory
livepatch: add sympos as disambiguator field to klp_reloc
livepatch: add old_sympos as disambiguator field to klp_func
Blingly ignoring CIE.version != 1 was a bad idea.
It still leaves "desirability" when running perf with callgraphing where libgcc
symbols might show in hotspot.
More importantly, basic CIE.version == 3 support already exists in code:
|
| retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
|
Next commit with simply add continue-not-bail for CIE.version != 1
This reverts commit 323f41f9e7.
At -Os, ARC gcc generates millicode thunk for function prologue/epilogue,
which are served by libgcc.
Modules historically are NOT linked with libgcc to avoid code bloat, reducing
runtime relocation fixups etc. I even once tried doing that but got lost
in makefile intricacies.
This means modules at -Os don't get the millicode thunks, causing build
failures below:
| MODPOST 5 modules
| ERROR: "__ld_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r18_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r17_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r17" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r16_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r16" [crypto/sha256_generic.ko] undefined!
|....
|....
Workaround that by inhibiting millicode thunks for loadable modules
Fixes STAR 9000641864:
("Linux built with optimizations for size emits errors for modules")
Reported-by: Anton Kolesov <akolesov@synosys.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.
But current implementation of cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.
And given undefined ARC_REG_IC_PTAG if CONFIG_MMU_VER=2 we're seeing
compilation problem:
---------------------------------->8-------------------------------
CC arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
aux_tag = ARC_REG_IC_PTAG;
^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------
The simples fix is to have ARC_REG_IC_PTAG defined regardless MMU
version being used.
We don't use it in cache_line_loop_v2() anyways so who cares.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function
| .init.text:__alloc_bootmem_low()
The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low().
This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This will better reflect its description i.e. "any needed setup..."
and not just do an "IPI request".
Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC dwarf unwinder only supports CIE version == 1
The boot time dwarf sanitizer (part of binary lookup table constructor)
would simply bail if it saw CIE version == 3, rendering unwinder with a
NULL lookup table.
It seems libgcc linked with kernel does have such entries.
With fallback linear search removed, and a NULL binary lookup table,
unwinder fails to generate any stack trace.
So allow graceful ignoring of unsupported CIE entries.
This problem was initially seen in Alexey's setup (and not mine) as he
was using buildroot built toolchain (libgcc) which doesn't get built with
CFLAGS_FOR_TARGET="-gdwarf-2 which is my default
Fixes STAR 9000985048: "kernel unwinder broken with stock tools"
Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Reported-by Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The fix which removed linear searching of dwarf (because binary lookup
data always exists) missed out on the fact that modules don't get the
binary lookup tables info. This caused unwinding out of modules to stop
working.
So add binary lookup header setup (equivalent of eh_frame_hdr setup) to
modules as well.
While at it, confine the header setup to within unwinder code,
reducing one API exposed out of unwinder code.
Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
HIGHMEM support bumped the default memory size for nsim platform to 1G.
Thus total memory ended at the very edge of start of peripherals address
space. With linux link base shifted, memory started bleeding into
peripheral space which caused early boot bad_page spew !
Fixes: 29e332261d ("ARC: mm: HIGHMEM: populate high memory from DT")
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This was the second perf intr issue
perf sampling on multicore requires intr to be enabled on all cores.
ARC perf probe code used helper arc_request_percpu_irq() which calls
- request_percpu_irq() on core0
- enable_percpu_irq() on all all cores (including core0)
genirq requires that request be made ahead of enable call.
However if perf probe happened on non core0 (observed on a 3.18 kernel),
enable would get called ahead of request, failing obviously and
rendering perf intr disabled on all such cores
[ 11.120000] 1 ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
[ 11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed
[ 11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed
[ 11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed
[ 11.140000] 0 =====> request_percpu_irq() IRQ 20
[ 11.140000] 0 -----> enable_percpu_irq() IRQ 20
Fix this fragility, by calling request_percpu_irq() on whatever core
calls probe (there is no requirement on which core calls this anyways)
and then calling enable on each cores.
Interestingly this started as invesigation of STAR 9000838902:
"sporadically IRQs enabled on perf prob"
which was about occassional boot spew as request_percpu_irq got called
non-locally (from an IPI), and re-enabled interrupts in following path
proc_mkdir -> spin_unlock_irq()
which the irq work code didn't like.
| ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
|
| BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()!
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2
|
| Stack Trace:
| arc_unwind_core.constprop.1+0x94/0x104
| dump_stack+0x62/0x98
| irq_work_run_list+0xb0/0xb4
| irq_work_run+0x22/0x3c
| do_IPI+0x74/0x9c
| handle_irq_event_percpu+0x34/0x164
| handle_percpu_irq+0x58/0x78
| generic_handle_irq+0x1e/0x2c
| arch_do_IRQ+0x3c/0x60
| ret_from_exception+0x0/0x8
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org> #4.2+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arc_request_percpu_irq() is called by all cores to request/enable percpu
irq. It has some "prep" calls needed by genirq:
- setup percpu devid
- disable IRQ_NOAUTOEN
However given that enable_percpu_irq() is called enayways, latter can be
avoided.
We are now left with irq_set_percpu_devid() quirk and that too for
ARCompact builds only, since previous patch updated ARCv2 intc to do this
in the "right" place, i.e. irq map function.
By next release, this will ultimately be fixed for ARCompact as well.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Current ARC SDP boards cannot reliably handle 1Gbit
Ethernet connections due to limitations in hardware.
To make sure networking is stable on the board we're
limiting phy to 100 Mbit.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Makes it easier to handle init vs core cleanly, though the change is
fairly invasive across random architectures.
It simplifies the rbtree code immediately, however, while keeping the
core data together in the same cachline (now iff the rbtree code is
enabled).
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:
in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
search (which iterates thru each of ~11K entries) thus takes 2 orders of
magnitude longer (~3 million cycles vs. 2000). Routines written in hand
assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
yet) fail the unwinder binary lookup, hit linear search, failing
nevertheless in the end.
However the linear search is pointless as binary lookup tables are created
from it in first place. It is impossible to have binary lookup fail while
succeed the linear search. It is pure waste of cycles thus removed by
this patch.
This manifested as RCU stalls / NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was perf counter overflowing in routine lacking dwarf info (like memset)
leading to patheic 3 million cycle unwinder slow path and by the time it
returned new interrupts were already pending (Timer, IPI) and taken
rightaway. The original memset didn't make forward progress, system kept
accruing more interrupts and more unwinder delayes in a vicious feedback
loop, ultimately triggering the NMI diagnostic.
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
SYNC in __switch_to() is a historic relic and not needed at all.
- In UP context it is obviously useless, why would we want to stall
the core for all updates to stack memory of t0 to complete before
loading kernel mode callee registers from t1 stack's memory.
- In SMP, there could be potential race in which outgoing task could
be concurrently picked for running on a different core, thus writes
to stack here need to be visible before the reads from stack on
other core. Peter confirmed that generic schedular already has needed
barriers (by way of rq lock) so there is no need for additional arch
barrier.
This came up when Noam was trying to replace this SYNC with EZChip
specific hardware thread scheduling instruction for their platform
support.
Link: http://lkml.kernel.org/r/20151102092654.GM17308@twins.programming.kicks-ass.net
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Cc: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Although kernel doesn't support the multiple IRQ priority levels provided
by HS38x core intc yet, ensure that the default prio value is used
anyways by relevant code.
SLEEP insn needs to be provided the IRQ priority level which can
interrupt it. This needs to be the default level which maynot
necessarily be 0 as assumed by current code.
This change allows a kernel with ARCV2_IRQ_DEF_PRIO = 1 to boot fine.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
When building kernel with buildroot built toolchain, CROSS_COMPILE
currently needs adjustment even if minor. This is because the defconfigs
prefer "arc-linux-uclibc-" prefix from hand built (non buildroot) toolchain
while buildroot provides "arc-buildroot-linux-uclibc-"
To avoid this use the common "arc-linux-" prefix which is provided by
buildroot and has also been in hand built tools for quite some time.
Signed-off-by: sujayraaj <sujayraaj@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: updated changelog]
- A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build failure)
- cpu_relax() now compiler barrier for UP as well
- Handling of userspace Bus Errors for ARCompact builds
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWRucxAAoJEGnX8d3iisJehaMP/RBYry4TGSSYDC7DqkbpTDlv
Wd/HE3HDNC0r6hqLXMO2MBLWLvK22pJPcGzXkXtw8kqwTYtKWjZ0IE3R72Lgfw4P
xlWKdjeI4RXkzJKOZhh6/7HVTkOSto4cUAcdwFcq4ec8asJBAwl/zWoM2Rw8PdfG
A11iAZTHUSavj/QkFpuqhtvRMRUe76cK41RvdoJfOtB9MjRF3XD0+ceXeDTYuoRb
aETH42JS5XjRvGShcvaUOCKDZcxlsPyd9LZZAzrLLIoepb7pBluh+YVJH3wJXBcl
ECjXSprv6GUR1C8R7G3lMtGwIt2tBINTuxH/ZVQp2pUKIUW/TL8MXEQGvWfrosXL
SbgsIYQSTuc9aO5c5qZ7MSEWz+hLDVlrgWZzs7FsNH7wxREQgl0hjsr382X0Y4n0
tWPVMr0Hvu7rLdH0gvxIXA3rF92Q9kfLTXj2flaiwXgCxoOoeQj/CtqhUs1xbzed
qJXf9bwWsjhexxwvHi9CJpyYaVFDp2kkxMoJvzvsLT9cOUGC0XKHnCT4Q96VE5Ms
9bVBngAugeZRNqB8vKi/84oU8A1Sq+KoTk6b87Z/wpzkh06tmsMQrOEbzoZQsDh9
6yPW7hgYb794apY9oKwTshHZzXDPg8J/+SVzKht8f84YtSbzS476K70PqnFeoUkD
2W7IeYKaUolgt3n3BoOO
=isVd
-----END PGP SIGNATURE-----
Merge tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Found a couple of brown paper bag bugs with the prev pull request
(including a SMP build breakage report from Guenter). Since these are
urgent I also decided to send over a bunch of other pending fixes
which could have otherwise waited an rc or two.
Summary:
- A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build
failure)
- cpu_relax() now compiler barrier for UP as well
- handling of userspace Bus Errors for ARCompact builds"
* tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Fix silly typo in MAINTAINERS file
ARC: cpu_relax() to be compiler barrier even for UP
ARC: use ASL assembler mnemonic
ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
ARC: remove extraneous header include
ARCv2: lib: memcpy: use local symbols
cpu_relax() on ARC has been barrier only for SMP (and no-op for UP). Per
recent discussions, it is safer to make it a compiler barrier
unconditionally.
Link: http://lkml.kernel.org/r/53A7D3AA.9020100@synopsys.com
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARCompact and ARCv2 only have ASL, while binutils used to support LSL as
a alias mnemonic.
Newer binutils (upstream) don't want to do that so replace it.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Bus errors from userspace on ARCompact based cores are handled by core
as a high priority L2 interrupt but current code treated it as interrupt
Handling an interrupt like exception is certainly not going to go unnoticed.
(and it worked so far as we never saw a Bus error from userspace until
IPPK guys tested a DDR controller with ECC error detection etc hence
needed to explicitly trigger/handle such errors)
- So move mem_service exception handler from common code into ARCv2 code.
- In ARCompact code, define mem_service as L2 interrupt handler which
just drops down to pure kernel mode and goes of to enqueue SIGBUS
Reported-by: Nelson Pereira <npereira@synopsys.com>
Tested-by: Ana Martins <amartins@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>