Refactor the setting of kexec OF properties, moving the common code
from machine_kexec_64.c to machine_kexec.c where it can be used on
both ppc64 and ppc32. This is needed for kexec to work on ppc32
platforms.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, pseries_cpu_die() calls msleep() while polling RTAS for
the status of the dying cpu.
However, if the cpu that is going down also happens to be the one
doing the tick then we're hosed as the tick_do_timer_cpu 'baton' is
only passed later on in tick_shutdown() when _cpu_down() does the
CPU_DEAD notification. Therefore jiffies won't be updated anymore.
This replaces that msleep() with a cpu_relax() to make sure we're not
going to schedule at that point.
With this patch my test box survives a 100k iterations hotplug stress
test on _all_ cpus, whereas without it, it quickly dies after ~50
iterations.
Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 2a4aca1144 ("powerpc/mm: Split
low level tlb invalidate for nohash processors") changed a call to
_tlbia to _tlbil_all but didn't include the header that defines
_tlbil_all, leading to a build failure on 440 if KVM is enabled.
This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since the QPACE (Chromodynamics Parallel Computing on the
Cell Broadband Engine) platform doesn't use a iommu, doesn't
have PCI devices and a MPIC much lesser setup and
configurations are needed. So far all devices are detected
as OF device. A notifier function is used to set the dma_ops
for the of_platform bus. Further this patch splits the
PPC_CELL_NATIVE into PPC_CELL_COMMON which are parts that are
shared with the QPACE platform and the rest.
Signed-off-by: Benjamin Krill <ben@codiert.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
CBE_THERM and OPROFILE_CELL both cannot be built without
SPU_FS disabled, so make the dependency explicit.
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Add RTS/CTS-support for the PSC of the MPC5200B. Tested with a Phytec
MPC5200B-IO.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch adds MDMA/UDMA support using BestComm for DMA on the MPC5200
platform. Based heavily on previous work by Freescale (Bernard Kuhn,
John Rigby) and Domen Puncer.
With this patch, a SanDisk Extreme IV CF card gets read speeds of
approximately 26.70 MB/sec.
Signed-off-by: Tim Yamin <plasm@roo.me.uk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
When ATA DMA is enabled, bestcomm prefetching does not work. This
patch adds a function to disable bestcomm prefetch when the ATA
Bestcomm task is initialized.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
1) ata.h has dst_pa in the wrong place (needs to match what the BestComm
task microcode in bcom_ata_task.c expects); fix it.
2) The BestComm ATA task priority was changed to maximum in bestcomm_priv.h;
this fixes a deadlock issue experienced with heavy DMA occurring on
both the ATA and Ethernet BestComm tasks, e.g. when downloading a large
file over a LAN to disk.
Signed-off-by: Tim Yamin <plasm@roo.me.uk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The buffer descriptors for the ATA BestComm task are larger than the
current definition for bcom_bd. This causes problems because the
various bcom_... functions dereference the buffer descriptor pointer
by using the array operator which doesn't work when the buffer
descriptors are a different size.
This patch adds the bcom_get_bd() function which uses the value in
bcom_task.bd_size to calculate the offset into the BD table. This
patch also changes the definition of bcom_bd to specify a data size
of 0 instead of 1 so that it will never work if anyone attempts to
dereference the bd list as an array (as opposed to something that
might work even though it is wrong).
Finally, this patch moves the definition of bcom_bd up in the file
to eliminate a forward declaration.
Based on patch originally written by Tim Yamin.
Signed-off-by: Tim Yamin <plasm@roo.me.uk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The MPC5200 internal interrupt controller setup function needs to set
the default interrupt controller when it is called. Without this
irq_create_of_mapping() cannot be called without first determining
the pointer to the irq controller (ie. call with controller = NULL).
Reported-by: Steven Cavanagh <scavanagh@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch adds documentation to the mpc5200 interrupt controller
driver and cleans up some minor coding conventions. It also moves the
contents of mpc52xx_pic.h into the driver proper (except for a small
common bit that is moved to the common mpc52xx.h) because the
information encoded there is not required by any other part of kernel
code. Finally for code readability sake, the L2_OFFSET shift value
is removed because the code using it resolves to a noop.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Rework to MMU code dropped a much missed 'blr' instruction.
Brown-Paper-Bag-Worn-By: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The correct #address-cells was still used for the actual translation,
so the impact is only a possibility of choosing the wrong range entry
or failing to find any match. Most common cases were not affected.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add const qualifier to device_node argument for
dcr_resource_{start,len} as of_get_property also const-qualifies this
argument.
Signed-off-by: Grant Erickson <gerickson@nuovations.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
After discussing with chip designers, it appears that it's not
necessary to set G everywhere on 440 cores. The various core
errata related to prefetch should be sorted out by firmware by
disabling icache prefetching in CCR0. We add the workaround to
the kernel however just in case oooold firmwares don't do it.
This is valid for -all- 4xx core variants. Later ones hard wire
the absence of prefetch but it doesn't harm to clear the bits
in CCR0 (they should already be cleared anyway).
We still leave G=1 on the linear mapping for now, we need to
stop over-mapping RAM to be able to remove it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
in the hash code based on some CPU feature bit. We also manipulate
_PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.
This changes the logic so that instead, the PTE now contains
_PAGE_COHERENT for all normal RAM pages thay have I = 0 on platforms
that need it. The hash code clears it if the feature bit is not set.
It also adds some clean accessors to setup various valid combinations
of access flags and change various bits of code to use them instead.
This should help having the PTE actually containing the bit
combinations that we really want.
I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
set it explicitely from the TLB miss. I will ultimately remove it
completely as it appears that it might not be needed after all
but in the meantime, having it in the TLB miss makes things a
lot easier.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This makes the MMU context code used for CPUs with no hash table
(except 603) dynamically allocate the various maps used to track
the state of contexts.
Only the main free map and CPU 0 stale map are allocated at boot
time. Other CPU maps are allocated when those CPUs are brought up
and freed if they are unplugged.
This also moves the initialization of the MMU context management
slightly later during the boot process, which should be fine as
it's really only needed when userland if first started anyways.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The handlers for Critical, Machine Check or Debug interrupts
will save and restore MMUCR nowadays, thus we only need to
disable normal interrupts when invalidating TLB entries.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, the various forms of low level TLB invalidations are all
implemented in misc_32.S for 32-bit processors, in a fairly scary
mess of #ifdef's and with interesting duplication such as a whole
bunch of code for FSL _tlbie and _tlbia which are no longer used.
This moves things around such that _tlbie is now defined in
hash_low_32.S and is only used by the 32-bit hash code, and all
nohash CPUs use the various _tlbil_* forms that are now moved to
a new file, tlb_nohash_low.S.
I moved all the definitions for that stuff out of
include/asm/tlbflush.h as they are really internal mm stuff, into
mm/mmu_decl.h
The code should have no functional changes. I kept some variants
inline for trivial forms on things like 40x and 8xx.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This commit moves the whole no-hash TLB handling out of line into a
new tlb_nohash.c file, and implements some basic SMP support using
IPIs and/or broadcast tlbivax instructions.
Note that I'm using local invalidations for D->I cache coherency.
At worst, if another processor is trying to execute the same and
has the old entry in its TLB, it will just take a fault and re-do
the TLB flush locally (it won't re-do the cache flush in any case).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We're soon running out of CPU features and I need to add some new
ones for various MMU related bits, so this patch separates the MMU
features from the CPU features. I moved over the 32-bit MMU related
ones, added base features for MMU type families, but didn't move
over any 64-bit only feature yet.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reworks the context management code used by 4xx,8xx and
freescale BookE. It adds support for SMP by implementing a
concept of stale context map to lazily flush the TLB on
processors where a context may have been invalidated. This
also contains the ground work for generalizing such lazy TLB
flushing by just picking up a new PID and marking the old one
stale. This will be implemented later.
This is a first implementation that uses a global spinlock.
Ideally, we should try to get at least the fast path (context ID
already assigned) lockless or limited to a per context lock,
but for now this will do.
I tried to keep the UP case reasonably simple to avoid adding
too much overhead to 8xx which does a lot of context stealing
since it effectively has only 16 PIDs available.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This splits the mmu_context handling between 32-bit hash based
processors, 64-bit hash based processors and everybody else. This is
preliminary work for adding SMP support for BookE processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds supports to the "extended" DCR addressing via the indirect
mfdcrx/mtdcrx instructions supported by some 4xx cores (440H6 and
later).
I enabled the feature for now only on AMCC 460 chips.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When running Active Memory Sharing, pages can get marked as
"loaned" with the hypervisor by the CMM driver. This state gets
cleared by the system firmware when rebooting the partition.
When using kexec to boot a new kernel, this state never gets
cleared and the hypervisor and CMM driver can get out of sync
with respect to the number of pages currently marked "loaned".
Fix this by adding a reboot notifier to the CMM driver to deflate
the balloon and mark all pages as active.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When running Active Memory Sharing, the Collaborative Memory Manager
(CMM) may mark some pages as "loaned" with the hypervisor.
Periodically, the CMM will query the hypervisor for a loan request,
which is a single signed value. When kexec'ing into a kdump kernel,
the CMM driver in the kdump kernel is not aware of the pages the
previous kernel had marked as "loaned", so the hypervisor and the CMM
driver are out of sync. This results in the CMM driver getting a
negative loan request, which can then get treated as a large unsigned
value and can cause kdump to hang due to the CMM driver inflating too
large. Since there really is no clean way for the CMM driver in the
kdump kernel to clean this up, simply disable CMM in the kdump kernel.
This fixes hangs we were seeing doing kdump with AMS.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Otherwise you get lot of errors like these:
drivers/block/viodasd.c:72: error: dereferencing pointer to incomplete type
drivers/block/viodasd.c: In function 'viodasd_open':
drivers/block/viodasd.c:135: error: dereferencing pointer to incomplete type
drivers/block/viodasd.c: In function 'viodasd_release':
drivers/block/viodasd.c:184: error: dereferencing pointer to incomplete type
drivers/block/viodasd.c: In function 'viodasd_getgeo':
drivers/block/viodasd.c:209: error: dereferencing pointer to incomplete type
drivers/block/viodasd.c:214: error: implicit declaration of function 'get_capacity'
drivers/block/viodasd.c: At top level:
drivers/block/viodasd.c:222: error: variable 'viodasd_fops' has initializer but incomplete type
drivers/block/viodasd.c:223: error: unknown field 'owner' specified in initializer
Discovered by a randconfig build.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
ibm_configure_kernel_dump is passed as the token to rtas_call() is
never initialised. This sets it to something sane.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Manish Ahuja <mahujam@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
print_dump_header() will be called at least once with a NULL pointer in
a normal boot sequence. If DEBUG is defined then we will dereference
the pointer and crash. Add a quick fix to exit early in the NULL pointer
case.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Manish Ahuja <mahujam@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Rename PowerPC's struct vm_region so that I can introduce my own
global version for NOMMU. It's feasible that the PowerPC version may
wish to use my global one instead.
The NOMMU vm_region struct defines areas of the physical memory map
that are under mmap. This may include chunks of RAM or regions of
memory mapped devices, such as flash. It is also used to retain
copies of file content so that shareable private memory mappings of
files can be made. As such, it may be compatible with what is
described in the banner comment for PowerPC's vm_region struct.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Using the common code means that more complete cache information will
provided in sysfs on platforms that don't use the l2-cache property
convention.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The smp code uses cache information to populate cpu_core_map; change
it to use common code for cache lookup.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We have more than one piece of code that looks up cache nodes manually
using the "l2-cache" property. Add a common helper routine which does
this and handles ePAPR's "next-level-cache" property as well as
powermac.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc:
powerpc: Fix corruption error in rh_alloc_fixed()
powerpc/fsl-booke: Fix the miss interrupt restore
There is an error in rh_alloc_fixed() of the Remote Heap code:
If there is at least one free block blk won't be NULL at the end of the
search loop, so -ENOMEM won't be returned and the else branch of
"if (bs == s || be == e)" will be taken, corrupting the management
structures.
Signed-off-by: Guillaume Knispel <gknispel@proformatique.com>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Disable GENERIC_HARDIRQS_NO__DO_IRQ for unconverted platforms.
sh: maple: Do not pass SLAB_POISON to kmem_cache_create()
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc/cell/axon-msi: Fix MSI after kexec
powerpc: Fix bootmem reservation on uninitialized node
powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area
Presently limited to Cayman, Dreamcast, Microdev, and SystemH 7751.
Re-enable it for everyone once these have been fixed up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The function flush_HPTE() is used in only one place, the implementation
of DEBUG_PAGEALLOC on ppc32.
It's actually a dup of flush_tlb_page() though it's -slightly- more
efficient on hash based processors. We remove it and replace it by
a direct call to the hash flush code on those processors and to
flush_tlb_page() for everybody else.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This renames the files to clarify the fact that they are used by
the hash based family of CPUs (the 603 being an exception in that
family but is still handled by that code).
This paves the way for the new tlb_nohash.c coming via a subsequent
commit.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a local_flush_tlb_mm() call as a pre-requisite for some
SMP work for BookE processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Instead of not defining it at all, this defines the macro as
being empty, thus avoiding ifdef's in call sites when CONFIG_BUG
is not set.
Also removes an extra whitespace in the existing definition.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The block layer dropped the virtual merge feature
(b8b3e16cfe). BIO_VMERGE_BOUNDARY
definition is meaningless now (For POWER, BIO_VMERGE_BOUNDARY has been
meaningless for a long time since POWER disables the virtual merge
feature).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently there are a number of platforms that open code access to
the ppc_pci_flags global variable. However, that variable is not
present if CONFIG_PCI is not set, which can lead to a build break.
This introduces a number of accessor functions that are defined
to be empty in the case of CONFIG_PCI being disabled. The
various platform files in the kernel are updated to use these.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since "Factor out cpu joining/unjoining the GIQ"
(b4963255ad) the WARN_ON in
xics_set_cpu_giq() is being triggered during boot on JS20 because the
GIQ indicator is not available on that platform. While the warning is
harmless and the system runs normally, it's nicer to check for the
existence of the indicator before trying to manipulate it.
Implement rtas_indicator_present(), which searches the
/rtas/rtas-indicators property for the given indicator token, and use
this function in xics_set_cpu_giq().
Also use a WARN statement in xics_set_cpu_giq to get better
information on failure.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>