The zcore code switches to real addressing mode when creating a kernel dump.
This is not possible, if it is built as a kernel module. With this patch
zcore (zfcpdump) can't be built as a kernel module any more.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The cksm function in system.h is duplicate to csum_partial in checksum.h.
Remove cksm and use csum_partial instead.
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The lowcore.h header has quite a lot of whitespace damage and a rather
wild collection of entries. Remove all that whitespace and tidy up the
order of the lowcore fields.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We need to use this value in the checkpoint/restart code and would like to
have a constant instead of a magic '3'.
Cc: linux-s390@vger.kernel.org
Signed-off-by: Dan Smith <danms@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Avoid the detour over the PLT if the branch target of a function call
in a module is in the range of the bras (16-bit) or brasl (32-bit)
instruction. The PLT is still generated but it is unused.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
strlcpy() does already NUL-terminate the destination string.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Disable switch_amode by default because pagetable walk on pre z9
hardware has negative performance impact.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The clock sync mode flag CLOCK_SYNC_STP is not cleared when stp
is set offline. In this case the get_sync_clock() function returns
-EACCESS and the dasd driver will block all i/o until stp is enabled
again. In addition get_sync_clock can return -EACCESS if the clock is
not in sync instead of -EAGAIN.
Rework the stp/etr online handling to fix these problems.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move all EXPORT_SYMBOLs to their corresponding definitions.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Everybody enables it so there is no point for an extra config option.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split machine check handler code and move it to cio and kernel code
where it belongs to. No functional change.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently we use the cpuid (via STIDP instruction) to recognize LPAR,
z/VM and KVM.
The architecture states, that bit 0-7 of STIDP returns all zero, and
if STIDP is executed in a virtual machine, the VM operating system
will replace bits 0-7 with FF.
KVM should not use FE to distinguish z/VM from KVM for interested
guests. The proper way to detect the hypervisor is the STSI (Store
System Information) instruction, which return information about the
hypervisors via function code 3, selector1=2, selector2=2.
This patch changes the detection routine of Linux to use STSI instead
of STIDP. This detection is earlier than bootmem, we have to use a
static buffer. Since STSI expects a 4kb block (4kb aligned) this
patch also changes the init.data alignment for s390. As this section
will be freed during boot, this should be no problem.
Patch is tested with LPAR, z/VM, KVM on LPAR, and KVM under z/VM.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With lockdep we got the following trace after a panic:
Badness at /home/autobuild/BUILD/linux-2.6.28-20090204/kernel/lockdep.c:2878
[...]
Call Trace:
[<0000000000176334>] lock_acquire+0x54/0xbc
[<000000000050b4fe>] __atomic_notifier_call_chain+0x6e/0xdc
[<000000000050b59c>] atomic_notifier_call_chain+0x30/0x44
[<0000000000504274>] panic+0xd0/0x1e8
[...]
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<0000000000170e62>] check_flags+0xae/0x15c
possible reason: unannotated irqs-off.
lockdep is right. We missed a trace_hardirq_off in our smp_send_stop
function and smp_send_stop is called before the panic call chain.
Reported-by: Mijo <Safradin mijo@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Initialize per thread timer values instead of just copying them from
the parent. That way it is easily possible to tell how much time a
thread spent in user/system context.
Doesn't fix a bug, this is just for debugging purposes.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix all the whitespace damage in process.c, especially copy_thread().
Next patch will add code to copy_thread() which needs to 'fixed' first.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
All in sysinfo.c is core kernel code and not driver code. So move it
to arch/s390/kernel. Also includes some small cleanups.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provide new shutdown action "dump_reipl" for automatic ipl after dump.
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
With packed stack the backchain is at a different location.
Just use __SF_BACKCHAIN as an offset to store the backchain.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Standby memory detected with the sclp interface gets always registered
with add_memory calls without considering the limitationt that the
"mem=" kernel paramater implies.
So fix this and only register standby memory that is below the specified
limit.
This fixes zfcpdump since it uses "mem=32M". In case there is appr.
2GB standby memory present all of usable memory would be used for the
struct pages needed for standby memory.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add wrapper functions for the following compat system calls:
* readahead
* sendfile64
* tkill
* tgkill
* keyctl
This ensures that the high order bits of the parameter registers are correctly
sign extended.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Precreate stop_machine threads in case the machine supports ETR/STP.
Otherwise we might deadlock if a time sync operation gets scheduled
and the creation of stop_machine threads would cause disk I/O.
This is just the minimal fix.
The real fix would be to only precreate stop_machine threads if
ETR/STP is actually used. But that would be a rather large and
complicated patch.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
On (initial) cpu hotplug the lowcore values for user_timer and
system_timer don't get initialized like they would get on each
process schedule.
On initial start of secondary cpus this leads to the situation
where per thread user/system_timer values are larger than the
corresponding contents of the lowcore. When later calculating
time spent in user/system context the result can be negative.
So for cpu hotplug we should manually initialize lowcore values.
Fixes this bug:
Kernel BUG at 000ec080 [verbose debug info unavailable]
fixpoint divide exception: 0009 [#1] PREEMPT SMP
Modules linked in:
CPU: 10 Not tainted 2.6.28 #4
Process sysctl (pid: 975, task: 3fa752e0, ksp: 3fbebca0)
Krnl PSW : 070c1000 800ec080 (show_stat+0x390/0x5fc)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
Krnl GPRS: 7fffffff fefc7ce5 3faec080 003879ae
00000001 01388000 7fffffff 01388000
00000000 00000000 0049ad50 3fbebcf8
01388000 002f51a8 800ec1fe 3fbebcf8
Krnl Code: 800ec076: 9001b188 stm %r0,%r1,392(%r11)
800ec07a: 9801b0c0 lm %r0,%r1,192(%r11)
800ec07e: 1d05 dr %r0,%r5
>800ec080: 9001b0c0 stm %r0,%r1,192(%r11)
800ec084: 5860b0c4 l %r6,196(%r11)
800ec088: 1806 lr %r0,%r6
800ec08a: 8c800001 srdl %r8,1
800ec08e: 1d87 dr %r8,%r7
Call Trace:
([<00000000000ec1ee>] show_stat+0x4fe/0x5fc)
[<00000000000c13e8>] seq_read+0xc4/0x3ac
[<00000000000e4796>] proc_reg_read+0x6e/0x9c
[<00000000000a6a44>] vfs_read+0x78/0x100
[<00000000000a6ba8>] sys_read+0x40/0x80
[<00000000000234a8>] sysc_do_restart+0x1a/0x1e
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
When 31 bit user space programs call sigaltstack on a 64 bit Linux
OS, the system call returns -1 with errno=EFAULT. The 31 bit pointer passed
to the system call is extended to 64 bit, but the high order bits are not
set to zero. The kernel detects the invalid user space pointer and
returns -EFAULT. To solve the problem, sys32_sigaltstack_wrapper()
instead of sys32_sigaltstack() has to be called. The wrapper function sets
the high order bits to zero.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Remove __attribute__((weak)) from common code sys_pipe implemantation.
IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have own implemantations
with the same name. Just rename them.
For sys_pipe2 there is no architecture specific implementation.
Cc: Richard Henderson <rth@twiddle.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
!CONFIG_SMP:
arch/s390/kernel/vdso.c: In function 'vdso_init':
arch/s390/kernel/vdso.c:325: error: incompatible type for argument 2 of 'vdso_alloc_per_cpu'
Also move the code out of the BUG_ON statement since it won't be
executed on !CONFIG_BUG. And that would be a bug.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The system call isn't wired up on s390. Just delete the dead code.
Also we use the common code sys_ptrace system call, so the sys_ptrace
declaration is pointless is well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
trivial: chack -> check typo fix in main Makefile
trivial: Add a space (and a comma) to a printk in 8250 driver
trivial: Fix misspelling of "firmware" in docs for ncr53c8xx/sym53c8xx
trivial: Fix misspelling of "firmware" in powerpc Makefile
trivial: Fix misspelling of "firmware" in usb.c
trivial: Fix misspelling of "firmware" in qla1280.c
trivial: Fix misspelling of "firmware" in a100u2w.c
trivial: Fix misspelling of "firmware" in megaraid.c
trivial: Fix misspelling of "firmware" in ql4_mbx.c
trivial: Fix misspelling of "firmware" in acpi_memhotplug.c
trivial: Fix misspelling of "firmware" in ipw2100.c
trivial: Fix misspelling of "firmware" in atmel.c
trivial: Fix misspelled firmware in Kconfig
trivial: fix an -> a typos in documentation and comments
trivial: fix then -> than typos in comments and documentation
trivial: update Jesper Juhl CREDITS entry with new email
trivial: fix singal -> signal typo
trivial: Fix incorrect use of "loose" in event.c
trivial: printk: fix indentation of new_text_line declaration
trivial: rtc-stk17ta8: fix sparse warning
...
Add kprobe_insn_mutex for protecting kprobe_insn_pages hlist, and remove
kprobe_mutex from architecture dependent code.
This allows us to call arch_remove_kprobe() (and free_insn_slot) while
holding kprobe_mutex.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits)
x86: setup_per_cpu_areas() cleanup
cpumask: fix compile error when CONFIG_NR_CPUS is not defined
cpumask: use alloc_cpumask_var_node where appropriate
cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
x86: use cpumask_var_t in acpi/boot.c
x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
sched: put back some stack hog changes that were undone in kernel/sched.c
x86: enable cpus display of kernel_max and offlined cpus
ia64: cpumask fix for is_affinity_mask_valid()
cpumask: convert RCU implementations, fix
xtensa: define __fls
mn10300: define __fls
m32r: define __fls
h8300: define __fls
frv: define __fls
cris: define __fls
cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS
cpumask: zero extra bits in alloc_cpumask_var_node
cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/
cpumask: convert mm/
...
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
x86: export vector_used_by_percpu_irq
x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
sched: nominate preferred wakeup cpu, fix
x86: fix lguest used_vectors breakage, -v2
x86: fix warning in arch/x86/kernel/io_apic.c
sched: fix warning in kernel/sched.c
sched: move test_sd_parent() to an SMP section of sched.h
sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
sched: activate active load balancing in new idle cpus
sched: bias task wakeups to preferred semi-idle packages
sched: nominate preferred wakeup cpu
sched: favour lower logical cpu number for sched_mc balance
sched: framework for sched_mc/smt_power_savings=N
sched: convert BALANCE_FOR_xx_POWER to inline functions
x86: use possible_cpus=NUM to extend the possible cpus allowed
x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
x86: update io_apic.c to the new cpumask code
x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
x86: xen: use smp_call_function_many()
x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
...
Fixed up trivial conflict in kernel/time/tick-sched.c manually
The extract cpu time instruction (ectg) instruction allows the user
process to get the current thread cputime without calling into the
kernel. The code that uses the instruction needs to switch to the
access registers mode to get access to the per-cpu info page that
contains the two base values that are needed to calculate the current
cputime from the CPU timer with the ectg instruction.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Distinguish the cputime of the idle process where idle is actually using
cpu cycles from the cputime where idle is sleeping on an enabled wait psw.
The former is accounted as system time, the later as idle time.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Increase the precision of the idle time calculation that is exported
to user space via /sys/devices/system/cpu/cpu<x>/idle_time_us
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The unit of the cputime accouting values that are stored per process is
currently a microsecond. The CPU timer has a maximum granularity of
2**-12 microseconds. There is no benefit in storing the per process values
in the lesser precision and there is the disadvantage that the backend
has to do the rounding to microseconds. The better solution is to use
the maximum granularity of the CPU timer as cputime unit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The cpu time spent by the idle process actually doing something is
currently accounted as idle time. This is plain wrong, the architectures
that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
time spent doing nothing and the time spent by idle doing work. The first
is accounted with account_idle_time and the second with account_system_time.
The architectures that use the account_xxx_time interface directly and not
the account_xxx_ticks interface now need to do the check for the idle
process in their arch code. In particular to improve the system vs true
idle time accounting the arch code needs to measure the true idle time
instead of just testing for the idle process.
To improve the tick based accounting as well we would need an architecture
primitive that can tell us if the pt_regs of the interrupted context
points to the magic instruction that halts the cpu.
In addition idle time is no more added to the stime of the idle process.
This field now contains the system time of the idle process as it should
be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
every tick that occurs while idle is running will be accounted as idle
time.
This patch contains the necessary common code changes to be able to
distinguish idle system time and true idle time. The architectures with
support for VIRT_CPU_ACCOUNTING need some changes to exploit this.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The utimescaled / stimescaled fields in the task structure and the
global cpustat should be set on all architectures. On s390 the calls
to account_user_time_scaled and account_system_time_scaled never have
been added. In addition system time that is accounted as guest time
to the user time of a process is accounted to the scaled system time
instead of the scaled user time.
To fix the bugs and to prevent future forgetfulness this patch merges
account_system_time_scaled into account_system_time and
account_user_time_scaled into account_user_time.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Michael Neuling <mikey@neuling.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>