The current low level hash code on LPAR configurations clears
_PAGE_COHERENT (M) when either _PAGE_GUARDED (G) or _PAGE_NO_CACHE (I)
is set. This conflicts with _PAGE_SAO which has M, I and W bits sets at
once (normally invalid combo) to indicate the new SAO attribute.
This changes the code to allow that case.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Allow an application to enable Strong Access Ordering on specific pages of
memory on Power 7 hardware. Currently, power has a weaker memory model than
x86. Implementing a stronger memory model allows an emulator to more
efficiently translate x86 code into power code, resulting in faster code
execution.
On Power 7 hardware, storing 0b1110 in the WIMG bits of the hpte enables
strong access ordering mode for the memory page. This patchset allows a
user to specify which pages are thus enabled by passing a new protection
bit through mmap() and mprotect(). I have defined PROT_SAO to be 0x10.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add the CPU feature bit for the new Strong Access Ordering
facility of Power7
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch defines:
- PROT_SAO, which is passed into mmap() and mprotect() in the prot field
- VM_SAO in vma->vm_flags, and
- _PAGE_SAO, the combination of WIMG bits in the pte that enables strong
access ordering for the page.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch allows architectures to define functions to deal with
additional protections bits for mmap() and mprotect().
arch_calc_vm_prot_bits() maps additonal protection bits to vm_flags
arch_vm_get_page_prot() maps additional vm_flags to the vma's vm_page_prot
arch_validate_prot() checks for valid values of the protection bits
Note: vm_get_page_prot() is now pretty ugly, but the generated code
should be identical for architectures that don't define additional
protection bits.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The task_pt_regs() macro allows access to the pt_regs of a given task.
This macro is not currently defined for the powerpc architecture, but
we need it for some upcoming utrace additions.
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This changes the oops and backtrace code to use the new %pS
printk extension to print out symbols rather than manually
calling print_symbol.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move device_to_mask() to dma-mapping.h because we need to use it from
outside dma_64.c in a later patch.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make cell_dma_dev_setup_iommu() return a pointer to the struct iommu_table
(or NULL if no table can be found) rather than putting this pointer into
dev->archdata.dma_data (let the caller do that), and rename this function
to cell_get_iommu_table() to reflect this change.
This will allow us to get the iommu table for a device that doesn't have
the table in the archdata.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update powerpc to use the new dma_*map*_attrs() interfaces. In doing so
update struct dma_mapping_ops to accept a struct dma_attrs and propagate
these changes through to all users of the code (generic IOMMU and the
64bit DMA code, and the iseries and ps3 platform code).
The old dma_*map_*() interfaces are reimplemented as calls to the
corresponding new interfaces.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make iommu_map_sg take a struct iommu_table. It did so before commit
740c3ce667 (iommu sg merging: ppc: make
iommu respect the segment size limits).
This stops the function looking in the archdata.dma_data for the iommu
table because in the future it will be called with a device that has
no table there.
This also has the nice side effect of making iommu_map_sg() match the
other map functions.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As nr_active counter includes also spus waiting for syscalls to return
we need a seperate counter that only counts spus that are currently running
on spu side. This counter shall be used by a cpufreq governor that targets
a frequency dependent from the number of running spus.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reduce the output verbosity of ps3_system_bus_match().
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, the .ctx debug file in spu context directories is always
present.
We'd prefer to prevent users from relying on this file, so add a
"debug" mount option to spufs. The .ctx file will only be added to
the context directories when this option is present.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Populate the size member of a few context files. Leave out files that
have different semantics with read vs mmap, or contain a
variable-length hex string.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Currently, spufs never specifies the i_size for the files in context
directories, so stat() always reports 0-byte files.
This change adds allows the spufs_dir_(nosched_)contents arrays to
specify a file size. This allows stat() to report correct file sizes,
and makes SEEK_END work.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
An spu context shouldn't get an extra tick if the time slice code
couldn't find something else to run. This means contexts that are not
within spu_run (ie, SPU_SCHED_SPU_RUN is cleared) will not receive
extra ticks while we have no other contexts waiting.
Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Add a ctxt file to spufs that shows spu context information that is used
in scheduling. This info can be used for debugging spufs scheduler
issues, and to isolate between application and spufs problems as it
shows a lot of state such as priorities and dispatch counts.
This file contains internal spufs state and is subject to change at any
time, and therefore no applications should depend on it. The file is
intended for the use of spufs kernel developers.
Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Update the association of a memory section with a numa node that
occurs during hotplug add of a memory section. This adds a check in
the hot_add_scn_to_nid() routine for the
ibm,dynamic-reconfiguration-memory node in the device tree. If
present the new hot_add_drconf_scn_to_nid() routine is invoked, which
can properly parse the ibm,dynamic-reconfiguration-memory node of the
device tree and make the proper numa node associations.
This also introduces the valid_hot_add_scn() routine as a helper
function for code that is common to the hot_add_scn_to_nid() and
hot_add_drconf_scn_to_nid() routines.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This splits off several pieces of code that parse the
ibm,dynamic-reconfiguration-memory node of the device tree into separate
helper routines. This is in preparation for the next commit that will
use these helper routines. There are no functional changes in this patch.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This updates the device tree manipulation routines so that memory
add/remove of lmbs represented under the
ibm,dynamic-reconfiguration-memory node of the device tree invokes the
hotplug notifier chain.
This change is needed because of the change in the way memory is
represented under the ibm,dynamic-reconfiguration-memory node. All lmbs
are described in the ibm,dynamic-memory property instead of having a
separate node for each lmb as in previous device tree layouts. This
requires the update_node() routine to check for updates to the
ibm,dynamic-memory property and invoke the hotplug notifier chain.
This also updates the pseries hotplug notifier to be able to gather information
for lmbs represented under the ibm,dynamic-reconfiguration-memory node and
have the lmbs added/removed.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use the base address of the lmb to derive the starting page frame number
instead of trying to extract it from the drc index of the lmb. The drc
index should not be used for this as it will, and did, break.
Until this point, systems that have had memory represented in the device
tree with a node for each lmb the drc index would (luckily) closely
track the base address of the lmb. For example a lmb with a drc index
of 8000000a would have a base address of a0000000. This correlation
allowed the current code to derive the starting page frame number from
the drc inddex
Device tree layouts where lmbs are represented under the
ibm,dynamic-reconfiguration-memory node in the ibm,dynamic-memory
property do not have this correlation between the drc index and base
address of the lmb.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Allow the phandle passed to the /proc/ppc64/ofdt file to be specified
in formats other than decimal. This allows us to easily specify phandle
values in hex that would otherwise appear as negative integers.
This is an issue on systems where the value of
/proc/device-tree/ibm,dynamic-reconfiguration-memory.ibm,phandle is
fffffff9. Having to pass this to the ofdt file as a string results in
a large negative number, and simple_strtoul() does not handle negative
numbers.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since Roland's ptrace cleanup starting with commit
f65255e8d5 ("[POWERPC] Use user_regset
accessors for FP regs"), the dump_task_* functions are no longer being
used.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This merges and cleans up some of the ugly copy/to from user code
which is required for the new fpr and vsx layout in the thread_struct.
Also fixes some hard coded buffer sizes and removes a redundant
fpr_flush_to_thread.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
To allow for a single kernel image on e500 v1/v2/mc we need to fixup lwsync
at runtime. On e500v1/v2 lwsync causes an illop so we need to patch up
the code. We default to 'sync' since that is always safe and if the cpu
is capable we will replace 'sync' with 'lwsync'.
We introduce CPU_FTR_LWSYNC as a way to determine at runtime if this is
needed. This flag could be moved elsewhere since we dont really use it
for the normal CPU_FTR purpose.
Finally we only store the relative offset in the fixup section to keep it
as small as possible rather than using a full fixup_entry.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We need to use PPC_LCMPI otherwise we get compile errors like:
arch/powerpc/lib/feature-fixups-test.S: Assembler messages:
arch/powerpc/lib/feature-fixups-test.S:142: Error: Unrecognized opcode: `cmpdi'
arch/powerpc/lib/feature-fixups-test.S:149: Error: Unrecognized opcode: `cmpdi'
arch/powerpc/lib/feature-fixups-test.S:164: Error: Unrecognized opcode: `cmpdi'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we get this warning:
arch/powerpc/kernel/init_task.c:33: warning: missing braces around initializer
arch/powerpc/kernel/init_task.c:33: warning: (near initialization for 'init_task.thread.fpr[0]')
This fixes it.
Noticed by Stephen Rothwell.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently the kernel fails to build with the above config options with:
CC arch/powerpc/mm/mem.o
arch/powerpc/mm/mem.c: In function 'arch_add_memory':
arch/powerpc/mm/mem.c:130: error: implicit declaration of function 'create_section_mapping'
This explicitly includes asm/sparsemem.h in arch/powerpc/mm/mem.c and
moves the guards in include/asm-powerpc/sparsemem.h to protect the
SPARSEMEM specific portions only.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This correctly hooks the VSX dump into Roland McGrath core file
infrastructure. It adds the VSX dump information as an additional elf
note in the core file (after talking more to the tool chain/gdb guys).
This also ensures the formats are consistent between signals, ptrace
and core files.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix compile error when CONFIG_VSX is enabled.
arch/powerpc/kernel/signal_64.c: In function 'restore_sigcontext':
arch/powerpc/kernel/signal_64.c:241: error: 'i' undeclared (first use in this function)
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently when a 32 bit process is exec'd on a powerpc 64 bit host the
value in the top three bytes of the personality is clobbered. patch
adds a check in the SET_PERSONALITY macro that will carry all the
values in the top three bytes across the exec.
These three bytes currently carry flags to disable address randomisation,
limit the address space, force zeroing of an mmapped page, etc. Should an
application set any of these bits they will be maintained and honoured on
homogeneous environment but discarded and ignored on a heterogeneous
environment. So if an application requires all mmapped pages to be initialised
to zero and a wrapper is used to setup the personality and exec the target,
these flags will remain set on an all 32 or all 64 bit envrionment, but they
will be lost in the exec on a mixed 32/64 bit environment. Losing these bits
means that the same application would behave differently in different
environments. Tested on a POWER5+ machine with 64bit kernel and a mixed
64/32 bit user space.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When compiling kernel modules for ppc that include <linux/spinlock.h>,
gcc prints a warning message every time it encounters a function
declaration where the inline keyword appears after the return type.
This makes sure that the order of the inline keyword and the return
type is as gcc expects it. Additionally, the __inline__ keyword is
replaced by inline, as checkpatch expects.
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Gcc 4.3 produced this warning:
arch/powerpc/kernel/signal_64.c: In function 'restore_sigcontext':
arch/powerpc/kernel/signal_64.c:161: warning: array subscript is above array bounds
This is caused by us copying to aliases of elements of the pt_regs
structure. Make those explicit.
This adds one extra __get_user and unrolls a loop.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This removes the experimental status of kdump on PPC64. kdump is on
PPC64 now since more than one year and it has proven to be stable.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The implementation of huge_ptep_set_wrprotect() directly calls
ptep_set_wrprotect() to mark a hugepte write protected. However this
call is not appropriate on ppc64 kernels as this is a small page only
implementation. This can lead to the hash not being flushed correctly
when a mapping is being converted to COW, allowing processes to continue
using the original copy.
Currently huge_ptep_set_wrprotect() unconditionally calls
ptep_set_wrprotect(). This is fine on ppc32 kernels as this call is
generic. On 64 bit this is implemented as:
pte_update(mm, addr, ptep, _PAGE_RW, 0);
On ppc64 this last parameter is the page size and is passed directly on
to hpte_need_flush():
hpte_need_flush(mm, addr, ptep, old, huge);
And this directly affects the page size we pass to flush_hash_page():
flush_hash_page(vaddr, rpte, psize, ssize, 0);
As this changes the way the hash is calculated we will flush the wrong
pages, potentially leaving live hashes to the original page.
Move the definition of huge_ptep_set_wrprotect() to the 32/64 bit specific
headers.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On PowerPC processors with non-coherent cache architectures the DMA
subsystem calls invalidate_dcache_range() before performing a DMA read
operation. If the address and length of the DMA buffer are not aligned
to a cache-line boundary this can result in memory outside of the DMA
buffer being invalidated in the cache. If this memory has an
uncommitted store then the data will be lost and a subsequent read of
that address will result in an old value being returned from main memory.
Only when the DMA buffer starts on a cache-line boundary and is an exact
mutiple of the cache-line size can invalidate_dcache_range() be called,
otherwise flush_dcache_range() must be called. flush_dcache_range()
will first flush uncommitted writes, and then invalidate the cache.
Signed-off-by: Andrew Lewis <andrew-lewis at netspace.net.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since most bootloaders or wrappers tend to update or add some information
to the .dtb they a handled they need some working space to do that in.
By default add 1K of padding via a default setting of DTS_FLAGS.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add CONFIG_VSX config build option. Must compile with POWER4, FPU and ALTIVEC.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch extends the floating point save and restore code to use the
VSX load/stores when VSX is available. This will make FP context
save/restore marginally slower on FP only code, when VSX is available,
as it has to load/store 128bits rather than just 64bits.
Mixing FP, VMX and VSX code will get constant architected state.
The signals interface is extended to enable access to VSR 0-31
doubleword 1 after discussions with tool chain maintainers. Backward
compatibility is maintained.
The ptrace interface is also extended to allow access to VSR 0-31 full
registers.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the macros for the VSX load/store instruction as most
binutils are not going to support this for a while.
Also add VSX register save/restore macros and vsr[0-63] register definitions.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a VSX CPU feature. Also add code to detect if VSX is available
from the device tree.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The layout of the new VSR registers and how they overlap on top of the
legacy FPR and VR registers is:
VSR doubleword 0 VSR doubleword 1
----------------------------------------------------------------
VSR[0] | FPR[0] | |
----------------------------------------------------------------
VSR[1] | FPR[1] | |
----------------------------------------------------------------
| ... | |
| ... | |
----------------------------------------------------------------
VSR[30] | FPR[30] | |
----------------------------------------------------------------
VSR[31] | FPR[31] | |
----------------------------------------------------------------
VSR[32] | VR[0] |
----------------------------------------------------------------
VSR[33] | VR[1] |
----------------------------------------------------------------
| ... |
| ... |
----------------------------------------------------------------
VSR[62] | VR[30] |
----------------------------------------------------------------
VSR[63] | VR[31] |
----------------------------------------------------------------
VSX has 64 128bit registers. The first 32 regs overlap with the FP
registers and hence extend them with and additional 64 bits. The
second 32 regs overlap with the VMX registers.
This commit introduces the thread_struct changes required to reflect
this register layout. Ptrace and signals code is updated so that the
floating point registers are correctly accessed from the thread_struct
when CONFIG_VSX is enabled.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make load_up_fpu and load_up_altivec callable so they can be reused by
the VSX code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move the altivec_unavailable code, to make room at 0xf40 where the
vsx_unavailable exception will be.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We are going to change where the floating point registers are stored
in the thread_struct, so in preparation add some macros to access the
floating point registers. Update all code to use these new macros.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
If we set the SPE MSR bit in save_user_regs we can blow away the VEC
bit. This doesn't matter in reality as they are in fact the same bit
but looks bad.
Also, when we add VSX in a later patch, we need to be able to set two
separate MSR bits here.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we set the start of the .text section to be 4Mb for pSeries.
In situations where the zImage is > 8Mb we'll fail to boot (due to
overlapping with OF). Move .text in a zImage from 4MB to 64MB
(well past OF).
We still will not be able to load large zImage unless we also move OF,
to that end, add a note to the zImage ELF to move OF to 32Mb. If this
is the very first kernel booted then we'll need to move OF manually by
setting real-base.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use an alternative feature section in _switch. There are three cases
handled here, either we don't have an SLB, in which case we jump over the
entire code section, or if we do we either do or don't have 1TB segments.
Boot tested on Power3, Power5 and Power5+.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>