Several more Intel CPUs are now capable using the p4-clockmod cpufreq
driver. As it is of limited use most of the time, print a big bold warning
if a better cpufreq driver might be available.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
Enable ondemand governor and acpi-cpufreq to use IA32_APERF and IA32_MPERF MSR
to get active frequency feedback for the last sampling interval. This will
make ondemand take right frequency decisions when hardware coordination of
frequency is going on.
Without APERF/MPERF, ondemand can take wrong decision at times due
to underlying hardware coordination or TM2.
Example:
* CPU 0 and CPU 1 are hardware cooridnated.
* CPU 1 running at highest frequency.
* CPU 0 was running at highest freq. Now ondemand reduces it to
some intermediate frequency based on utilization.
* Due to underlying hardware coordination with other CPU 1, CPU 0 continues to
run at highest frequency (as long as other CPU is at highest).
* When ondemand samples CPU 0 again next time, without actual frequency
feedback from APERF/MPERF, it will think that previous frequency change
was successful and can go to wrong target frequency. This is because it
thinks that utilization it has got this sampling interval is when running at
intermediate frequency, rather than actual highest frequency.
More information about IA32_APERF IA32_MPERF MSR:
Refer to IA-32 Intel® Architecture Software Developer's Manual at
http://developer.intel.com
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (4750): AGC command1/2 is board specific
V4L/DVB (4748): Fixed oops for Nova-T USB2
V4L/DVB (4746): HM12 is YUV 4:2:0, not YUV 4:1:1
V4L/DVB (4744): The Samsung TCPN2121P30A does not have a tda9887
V4L/DVB (4743): Fix oops in VIDIOC_G_PARM
V4L/DVB (4742): Drivers/media/video: handle sysfs errors
V4L/DVB (4741): {ov511,stv680}: handle sysfs errors
V4L/DVB (4740): Fixed an if-block to avoid floating with debug-messages
V4L/DVB (4739): SECAM support for saa7113 into saa7115
V4L/DVB (4738): Bt8xx/dvb-bt8xx.c: check kmalloc() return value.
V4L/DVB (4734): Tda826x: fix frontend selection for dvb_attach
V4L/DVB (4733): Tda10086: fix frontend selection for dvb_attach
V4L/DVB (4732): Fix spelling error in Kconfig help text for DVB_CORE_ATTACH
V4L/DVB (4731a): Kconfig: restore pvrusb2 menu items
V4L/DVB (4729): Fix VIDIOC_G_FMT for NTSC in cx25840.
V4L/DVB (4727): Support status readout for saa713x based FM radio
V4L/DVB (4725): Fix vivi compile on parisc
V4L/DVB (4692): Add WinTV-HVR3000 DVB-T support
Unify the following functions:
acpi_ec_poll_read()
acpi_ec_poll_write()
acpi_ec_poll_query()
acpi_ec_intr_read()
acpi_ec_intr_write()
acpi_ec_intr_query()
into:
acpi_ec_poll_transaction()
acpi_ec_intr_transaction()
These new functions take as arguments an ACPI EC command, a few bytes
to write to the EC data register and a buffer for a few bytes to read
from the EC data register. The old _read(), _write(), _query() are
just special cases of these functions.
Then unified the code in acpi_ec_poll_transaction() and
acpi_ec_intr_transaction() a little more. Both functions are now just
wrappers around the new acpi_ec_transaction_unlocked() function. The
latter contains the EC access logic, the two original
function now just do their special way of locking and call the the
new function for the actual work.
This saves a lot of very similar code. The primary reason for doing
this, however, is that my driver for MSI 270 laptops needs to issue
some non-standard EC commands in a safe way. Due to this I added a new
exported function similar to ec_write()/ec_write() which is called
ec_transaction() and is essentially just a wrapper around
acpi_ec_{poll,intr}_transaction().
Signed-off-by: Lennart Poettering <mzxreary@0pointer.de>
Acked-by: Luming Yu <luming.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Intel processors starting with the Core Duo support
support processor native C-state using the MWAIT instruction.
Refer: Intel Architecture Software Developer's Manual
http://www.intel.com/design/Pentium4/manuals/253668.htm
Platform firmware exports the support for Native C-state to OS using
ACPI _PDC and _CST methods.
Refer: Intel Processor Vendor-Specific ACPI: Interface Specification
http://www.intel.com/technology/iapc/acpi/downloads/302223.htm
With Processor Native C-state, we use 'MWAIT' instruction on the processor
to enter different C-states (C1, C2, C3). We won't use the special IO
ports to enter C-state and no SMM mode etc required to enter C-state.
Overall this will mean better C-state support.
One major advantage of using MWAIT for all C-states is, with this and
"treat interrupt as break event" feature of MWAIT, we can now get accurate
timing for the time spent in C1, C2, .. states.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Apparently whoever converted voyager never actually checked that the
patch would compile ...
Remove as much of the pt_regs references as possible and move the
remaining ones into line with what's in x86 generic.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The old style (attribute on each structure entry) never really worked.
Move it to an attribute per structure
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] block layer: ioprio_best function fix
[PATCH] ide-cd: fix breakage with internally queued commands
[PATCH] block layer: elv_iosched_show should get elv_list_lock
[PATCH] splice: fix pipe_to_file() ->prepare_write() error path
[PATCH] block layer: elevator_find function cleanup
[PATCH] elevator: elevator_type member not used
We still need to maintain a private PC style command, since it
isn't completely unified with REQ_TYPE_BLOCK_PC yet.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
elevator_type field in elevator_type structure is useless:
it isn't used anywhere in kernel sources.
Signed-off-by: Vasily Tarasov <vtaras@openvz.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb. The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when an IPSec policy rule doesn't specify a security
context, it is assumed to be "unlabeled" by SELinux, and so
the IPSec policy rule fails to match to a flow that it would
otherwise match to, unless one has explicitly added an SELinux
policy rule allowing the flow to "polmatch" to the "unlabeled"
IPSec policy rules. In the absence of such an explicitly added
SELinux policy rule, the IPSec policy rule fails to match and
so the packet(s) flow in clear text without the otherwise applicable
xfrm(s) applied.
The above SELinux behavior violates the SELinux security notion of
"deny by default" which should actually translate to "encrypt by
default" in the above case.
This was first reported by Evgeniy Polyakov and the way James Morris
was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.
With this patch applied, SELinux "polmatching" of flows Vs. IPSec
policy rules will only come into play when there's a explicit context
specified for the IPSec policy rule (which also means there's corresponding
SELinux policy allowing appropriate domains/flows to polmatch to this context).
Secondly, when a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return errors other than access denied,
such as -EINVAL. We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.
The solution for this is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.
Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).
This patch: Fix the selinux side of things.
This makes sure SELinux polmatching of flow contexts to IPSec policy
rules comes into play only when an explicit context is associated
with the IPSec policy rule.
Also, this no longer defaults the context of a socket policy to
the context of the socket since the "no explicit context" case
is now handled properly.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
When a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return an access denied permission
(or other error). We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.
The way I was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.
The first SYNACK would be blocked, because of an uncached lookup via
flow_cache_lookup(), which would fail to resolve an xfrm policy because
the SELinux policy is checked at that point via the resolver.
However, retransmitted SYNACKs would then find a cached flow entry when
calling into flow_cache_lookup() with a null xfrm policy, which is
interpreted by xfrm_lookup() as the packet not having any associated
policy and similarly to the first case, allowing it to pass without
transformation.
The solution presented here is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.
Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).
Signed-off-by: James Morris <jmorris@namei.org>
Testing revealed a problem with the NetLabel cache where a cached entry could
be freed while in use by the LSM layer causing an oops and other problems.
This patch fixes that problem by introducing a reference counter to the cache
entry so that it is only freed when it is no longer in use.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Pass NULL not 0 for pointer value.
[MIPS] IP27: Make declaration of setup_replication_mask a proper prototype.
[MIPS] BigSur: More useful defconfig.
[MIPS] Cleanup definitions of speed_t and tcflag_t.
[MIPS] Fix compilation warnings in arch/mips/sibyte/bcm1480/smp.c
[MIPS] Optimize and cleanup get_saved_sp, set_saved_sp
[MIPS] <asm/irq.h> does not need pt_regs anymore.
[MIPS] Workaround for bug in gcc -EB / -EL options.
[MIPS] Fix timer setup for Jazz
If CONFIG_BUILD_ELF64 was not selected and gcc had -msym32 option
(i.e. 4.0 or newer), there is no point to use %highest, %higher for
kernel symbols.
This patch also fixes 64-bit SMTC version of get_saved_sp() which is
broken but harmless since there is no such CPUs for now.
A bonus is set_saved_sp() and SMP version of get_saved_sp() are more
readable now.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The attached patch destroys all the dentries attached to a superblock in one go
by:
(1) Destroying the tree rooted at s_root.
(2) Destroying every entry in the anon list, one at a time.
(3) Each entry in the anon list has its subtree consumed from the leaves
inwards.
This reduces the amount of work generic_shutdown_super() does, and avoids
iterating through the dentry_unused list.
Note that locking is almost entirely absent in the shrink_dcache_for_umount*()
functions added by this patch. This is because:
(1) at the point the filesystem calls generic_shutdown_super(), it is not
permitted to further touch the superblock's set of dentries, and nor may
it remove aliases from inodes;
(2) the dcache memory shrinker now skips dentries that are being unmounted;
and
(3) the superblock no longer has any external references through which the VFS
can reach it.
Given these points, the only locking we need to do is when we remove dentries
from the unused list and the name hashes, which we do a directory's worth at a
time.
We also don't need to guard against reference counts going to zero unexpectedly
and removing bits of the tree we're working on as nothing else can call dput().
A cut down version of dentry_iput() has been folded into
shrink_dcache_for_umount_subtree() function. Apart from not needing to unlock
things, it also doesn't need to check for inotify watches.
In this version of the patch, the complaint about a dentry still being in use
has been expanded from a single BUG_ON() and now gives much more information.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: NeilBrown <neilb@suse.de>
Acked-by: Ian Kent <raven@themaw.net>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The nbd header uses __be32 and such types but doesn't actually include the
header that defines these things (linux/types.h); so let's include it.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Place kernel-doc function comment header immediately before the function that
is being documented.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There's nothing arch-specific about check_signature(), so move it to
<linux/io.h>. Use a cross between the Alpha and i386 implementations as
the generic one.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lib/bitmap.c:bitmap_parse() is a library function that received as input a
user buffer. This seemed to have originated from the way the write_proc
function of the /proc filesystem operates.
This has been reworked to not use kmalloc and eliminates a lot of
get_user() overhead by performing one access_ok before using __get_user().
We need to test if we are in kernel or user space (is_user) and access the
buffer differently. We cannot use __get_user() to access kernel addresses
in all cases, for example in architectures with separate address space for
kernel and user.
This function will be useful for other uses as well; for example, taking
input for /sysfs instead of /proc, so it was changed to accept kernel
buffers. We have this use for the Linux UWB project, as part as the
upcoming bandwidth allocator code.
Only a few routines used this function and they were changed too.
Signed-off-by: Reinette Chatre <reinette.chatre@linux.intel.com>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of HDIO IOCTLs are not yet handled and a few others are marked
as using a pointer rather than an unsigned long. The formers include:
HDIO_GET_WCACHE, HDIO_GET_ACOUSTIC, HDIO_GET_ADDRESS and
HDIO_GET_BUSSTATE. The latters are: HDIO_SET_MULTCOUNT,
HDIO_SET_UNMASKINTR, HDIO_SET_KEEPSETTINGS, HDIO_SET_32BIT,
HDIO_SET_NOWERR, HDIO_SET_DMA, HDIO_SET_PIO_MODE and HDIO_SET_NICE.
Additionally 0x330 used to be HDIO_GETGEO_BIG and may be issued by 32-bit
`hdparm' run on a 64-bit kernel making Linux complain loudly.
This is a fix for these issues.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This likely profiling is pretty fun. I found a few possible problems
in sched.c.
This patch may be not measurable, but when I did measure long ago,
nooping (un)likely cost a couple of % on scheduler heavy benchmarks, so
it all adds up.
Tweak some branch hints:
- the 2nd 64 bits in the bitmask is likely to be populated, because it
contains the first 28 bits (nearly 3/4) of the normal priorities.
(ratio of 669669:691 ~= 1000:1).
- it isn't unlikely that context switching switches to another process. it
might be very rapidly switching to and from the idle process (ratio of
475815:419004 and 471330:423544). Let the branch predictor decide.
- preempt_enable seems to be very often called in a nested preempt_disable
or with interrupts disabled (ratio of 3567760:87965 ~= 40:1)
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Daniel Walker <dwalker@mvista.com>
Cc: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Module taint flags listing in Oops/panic has a couple of issues:
* taint_flags() doesn't null-terminate the buffer after printing the flags
* per-module taints are only set if the kernel is not already tainted
(with that particular flag) => only the first offending module gets its
taint info correctly updated
Some additional changes:
* 'license_gplok' is no longer needed - equivalent to !(taints &
TAINT_PROPRIETARY_MODULE) - so we can drop it from struct module *
exporting module taint info via /proc/module:
pwc 88576 0 - Live 0xf8c32000
evilmod 6784 1 pwc, Live 0xf8bbf000 (PF)
Signed-off-by: Florin Malita <fmalita@gmail.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a follow-up patch based on the review for perfmon2. This patch
adds the carta_random32() library routine + carta_random32.h header file.
This is fast, simple, and efficient pseudo number generator algorithm. We
use it in perfmon2 to randomize the sampling periods. In this context, we
do not need any fancy randomizer.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Cc: David Mosberger <david.mosberger@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Implement the epoll_pwait system call, that extend the event wait mechanism
with the same logic ppoll and pselect do. The definition of epoll_pwait
is:
int epoll_pwait(int epfd, struct epoll_event *events, int maxevents,
int timeout, const sigset_t *sigmask, size_t sigsetsize);
The difference between the vanilla epoll_wait and epoll_pwait is that the
latter allows the caller to specify a signal mask to be set while waiting
for events. Hence epoll_pwait will wait until either one monitored event,
or an unmasked signal happen. If sigmask is NULL, the epoll_pwait system
call will act exactly like epoll_wait. For the POSIX definition of
pselect, information is available here:
http://www.opengroup.org/onlinepubs/009695399/functions/select.html
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move the lock debug checks below the page reserved checks. Also, having
debug_check_no_locks_freed in kernel_map_pages is wrong.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
move '_hi' bits of block numbers in the larger part of the
block group descriptor structure
Signed-off-by: Alexandre Ratchov <alexandre.ratchov@bull.net>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Similar to ext4, change blocks in JBD2 from sector_t to unsigned long long.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change ext4 in-kernel block type (ext4_fsblk_t) from sector_t to unsigned
long long. Remove ext4 block type string micro E3FSBLK, replaced with "%llu"
[akpm@osdl.org: build fix]
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As we are planning to support 48-bit block numbers for ext4, we need to
support 48-bit block numbers for extended attributes. In the short term, we
can do this by reuse (on-disk) 16-bit padding (linux2.i_pad1 currently used
only by "hurd") as high order bits for xattr. This patch basically does that.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
JBD layer in-kernel block varibles type fixes to support >32 bit block number
and convert to sector_t type.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>