Commit graph

147 commits

Author SHA1 Message Date
Akinobu Mita f9b4192923 [PATCH] bitops: hweight() speedup
<linux@horizon.com> wrote:

This is an extremely well-known technique.  You can see a similar version that
uses a multiply for the last few steps at
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel which
refers to "Software Optimization Guide for AMD Athlon 64 and Opteron
Processors"
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF

It's section 8.6, "Efficient Implementation of Population-Count Function in
32-bit Mode", pages 179-180.

It uses the name that I am more familiar with, "popcount" (population count),
although "Hamming weight" also makes sense.

Anyway, the proof of correctness proceeds as follows:

	b = a - ((a >> 1) & 0x55555555);
	c = (b & 0x33333333) + ((b >> 2) & 0x33333333);
	d = (c + (c >> 4)) & 0x0f0f0f0f;
#if SLOW_MULTIPLY
	e = d + (d >> 8);
	f = e + (e >> 16);
	return f & 63;
#else
	/* Useful if multiply takes at most 4 cycles */
	return (d * 0x01010101) >> 24;
#endif

The input value a can be thought of as 32 1-bit fields each holding their own
hamming weight.  Now look at it as 16 2-bit fields.  Each 2-bit field a1..a0
has the value 2*a1 + a0.  This can be converted into the hamming weight of the
2-bit field a1+a0 by subtracting a1.

That's what the (a >> 1) & mask subtraction does.  Since there can be no
borrows, you can just do it all at once.

Enumerating the 4 possible cases:

0b00 = 0  ->  0 - 0 = 0
0b01 = 1  ->  1 - 0 = 1
0b10 = 2  ->  2 - 1 = 1
0b11 = 3  ->  3 - 1 = 2

The next step consists of breaking up b (made of 16 2-bit fields) into
even and odd halves and adding them into 4-bit fields.  Since the largest
possible sum is 2+2 = 4, which will not fit into a 2-bit field, the 2-bit
fields have to be masked before they are added.

After this point, the masking can be delayed.  Each 4-bit field holds a
population count from 0..4, taking at most 3 bits.  These numbers can be added
without overflowing a 4-bit field, so we can compute c + (c >> 4), and only
then mask off the unwanted bits.

This produces d, a number of 4 8-bit fields, each in the range 0..8.  From
this point, we can shift and add d multiple times without overflowing an 8-bit
field, and only do a final mask at the end.

The number to mask with has to be at least 63 (so that 32 won't be truncated),
but can also be 127 or 255.  The x86 has a special encoding for signed
immediate byte values -128..127, so the value of 255 is slower.  On other
processors, a special "sign extend byte" instruction might be faster.

On a processor with fast integer multiplies (Athlon but not P4), you can
reduce the final few serially dependent instructions to a single integer
multiply.  Consider d to be 4 8-bit values d3, d2, d1 and d0, each in the
range 0..8.  The multiply forms the partial products:

	           d3 d2 d1 d0
	        d3 d2 d1 d0
	     d3 d2 d1 d0
	+ d3 d2 d1 d0
	----------------------
	           e3 e2 e1 e0

Where e3 = d3 + d2 + d1 + d0.   e2, e1 and e0 obviously cannot generate
any carries.
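
As a rough illustration of the two variants discussed above, here is a minimal
user-space sketch.  It is not the kernel's actual code: the names
popcount32_shift(), popcount32_mul() and popcount32_selfcheck() are made up for
this example, and the self-check simply compares both variants with a naive
bit-by-bit count.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add tail, corresponding to the SLOW_MULTIPLY branch. */
static unsigned int popcount32_shift(uint32_t a)
{
    uint32_t b = a - ((a >> 1) & 0x55555555u);                 /* 16 fields, each 0..2 */
    uint32_t c = (b & 0x33333333u) + ((b >> 2) & 0x33333333u); /* 8 fields, each 0..4 */
    uint32_t d = (c + (c >> 4)) & 0x0f0f0f0fu;                 /* 4 fields, each 0..8 */
    uint32_t e = d + (d >> 8);
    uint32_t f = e + (e >> 16);

    return f & 63;
}

/* Multiply tail: the top byte of the product holds d3 + d2 + d1 + d0. */
static unsigned int popcount32_mul(uint32_t a)
{
    uint32_t b = a - ((a >> 1) & 0x55555555u);
    uint32_t c = (b & 0x33333333u) + ((b >> 2) & 0x33333333u);
    uint32_t d = (c + (c >> 4)) & 0x0f0f0f0fu;

    return (uint32_t)(d * 0x01010101u) >> 24;
}

/* Hypothetical helper: compare both variants with a naive count. */
static void popcount32_selfcheck(uint32_t a)
{
    unsigned int naive = 0;
    uint32_t t;

    for (t = a; t != 0; t >>= 1)
        naive += t & 1u;
    assert(popcount32_shift(a) == naive);
    assert(popcount32_mul(a) == naive);
}
```

Which tail is faster depends on the multiply latency of the target CPU, as
noted above; the sketch only demonstrates that both give the same result.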

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:59:30 -08:00
Akinobu Mita 37d54111c1 [PATCH] bitops: hweight() related cleanup
By defining generic hweight*() routines

- hweight64() will be defined on all architectures
- hweight_long() will use architecture optimized hweight32() or hweight64()

For these reasons, I found two possible cleanups.

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:15 -08:00
Akinobu Mita 930ae745f5 [PATCH] bitops: generic ext2_{set,clear,test,find_first_zero,find_next_zero}_bit()
This patch introduces the C-language equivalents of the functions below:

int ext2_set_bit(int nr, volatile unsigned long *addr);
int ext2_clear_bit(int nr, volatile unsigned long *addr);
int ext2_test_bit(int nr, const volatile unsigned long *addr);
unsigned long ext2_find_first_zero_bit(const unsigned long *addr,
                                       unsigned long size);
unsigned long ext2_find_next_zero_bit(const unsigned long *addr,
                                      unsigned long size);

In include/asm-generic/bitops/ext2-non-atomic.h

This code is largely copied from:

include/asm-powerpc/bitops.h
include/asm-parisc/bitops.h
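
For orientation, the ext2 bit operations number bits in little-endian order, so
bit nr lives in byte nr / 8 at position nr % 8 regardless of CPU endianness.
The sketch below shows that idea with byte-wise addressing.  It is an
illustration only: the le_*() names and the unsigned char interface are
assumptions of this example, not the functions added by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Test bit nr of a little-endian bitmap. */
static int le_test_bit(size_t nr, const unsigned char *addr)
{
    assert(addr != NULL);
    return (addr[nr >> 3] >> (nr & 7)) & 1;
}

/* Set bit nr and return its previous value (non-atomic). */
static int le_test_and_set_bit(size_t nr, unsigned char *addr)
{
    int old = le_test_bit(nr, addr);

    addr[nr >> 3] |= (unsigned char)(1u << (nr & 7));
    return old;
}

/* Clear bit nr and return its previous value (non-atomic). */
static int le_test_and_clear_bit(size_t nr, unsigned char *addr)
{
    int old = le_test_bit(nr, addr);

    addr[nr >> 3] &= (unsigned char)~(1u << (nr & 7));
    return old;
}

/* Return the index of the first zero bit, or size if every bit is set. */
static size_t le_find_first_zero_bit(const unsigned char *addr, size_t size)
{
    size_t nr;

    for (nr = 0; nr < size; nr++)
        if (!le_test_bit(nr, addr))
            return nr;
    return size;
}
```

A real implementation would scan a word at a time; the bit-by-bit loop is kept
here only for clarity.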

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:11 -08:00
Akinobu Mita 3b9ed1a5d2 [PATCH] bitops: generic hweight{64,32,16,8}()
This patch introduces the C-language equivalents of the functions below:

unsigned int hweight32(unsigned int w);
unsigned int hweight16(unsigned int w);
unsigned int hweight8(unsigned int w);
unsigned long hweight64(__u64 w);

In include/asm-generic/bitops/hweight.h

This code is largely copied from: include/linux/bitops.h

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:11 -08:00
Akinobu Mita c7f612cdf0 [PATCH] bitops: generic find_{next,first}{,_zero}_bit()
This patch introduces the C-language equivalents of the functions below:

unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
                            unsigned long offset);
unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
                                 unsigned long offset);
unsigned long find_first_zero_bit(const unsigned long *addr,
                                  unsigned long size);
unsigned long find_first_bit(const unsigned long *addr, unsigned long size);

In include/asm-generic/bitops/find.h

This code is largely copied from: arch/powerpc/lib/bitops.c
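
The semantics of these four functions can be sketched in portable C as
follows.  This is a simplified illustration, not the code from find.h: the
sketch_*() names, the shared sketch_scan() helper and the loop used to locate
the lowest set bit are all assumptions of this example.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Index of the lowest set bit in a nonzero word. */
static unsigned long lowest_set_bit(unsigned long word)
{
    unsigned long n = 0;

    assert(word != 0);
    while (!(word & 1UL)) {
        word >>= 1;
        n++;
    }
    return n;
}

/* Scan for a set bit (invert == 0) or a zero bit (invert == ~0UL). */
static unsigned long sketch_scan(const unsigned long *addr, unsigned long size,
                                 unsigned long offset, unsigned long invert)
{
    while (offset < size) {
        unsigned long idx = offset / BITS_PER_LONG;
        unsigned long word = addr[idx] ^ invert;

        /* Ignore bits below offset within this word. */
        word &= ~0UL << (offset % BITS_PER_LONG);
        if (word) {
            unsigned long bit = idx * BITS_PER_LONG + lowest_set_bit(word);

            return bit < size ? bit : size;
        }
        /* Nothing here: continue at the start of the next word. */
        offset = (idx + 1) * BITS_PER_LONG;
    }
    return size;
}

static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size, unsigned long offset)
{
    return sketch_scan(addr, size, offset, 0UL);
}

static unsigned long sketch_find_next_zero_bit(const unsigned long *addr,
                                               unsigned long size, unsigned long offset)
{
    return sketch_scan(addr, size, offset, ~0UL);
}

static unsigned long sketch_find_first_bit(const unsigned long *addr, unsigned long size)
{
    return sketch_scan(addr, size, 0, 0UL);
}

static unsigned long sketch_find_first_zero_bit(const unsigned long *addr, unsigned long size)
{
    return sketch_scan(addr, size, 0, ~0UL);
}
```

In this sketch every function returns size when no matching bit is found.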

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:11 -08:00
Andi Kleen 6a0f03e0d3 [PATCH] x86_64: Don't enable CONFIG_UNWIND_INFO by default for DEBUG_KERNEL
DEBUG_KERNEL is often enabled just for sysrq, but this doesn't
mean the user wants more heavyweight debugging information.

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:39 -08:00
Linus Torvalds 1e8c573933 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
  BUG_ON() Conversion in drivers/video/
  BUG_ON() Conversion in drivers/parisc/
  BUG_ON() Conversion in drivers/block/
  BUG_ON() Conversion in sound/sparc/cs4231.c
  BUG_ON() Conversion in drivers/s390/block/dasd.c
  BUG_ON() Conversion in lib/swiotlb.c
  BUG_ON() Conversion in kernel/cpu.c
  BUG_ON() Conversion in ipc/msg.c
  BUG_ON() Conversion in block/elevator.c
  BUG_ON() Conversion in fs/coda/
  BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
  BUG_ON() Conversion in input/serio/hil_mlc.c
  BUG_ON() Conversion in md/dm-hw-handler.c
  BUG_ON() Conversion in md/bitmap.c
  The comment describing how MS_ASYNC works in msync.c is confusing
  rcu: undeclared variable used in documentation
  fix typos "wich" -> "which"
  typo patch for fs/ufs/super.c
  Fix simple typos
  tabify drivers/char/Makefile
  ...
2006-03-25 08:41:09 -08:00
Andrew Morton 96a9b4d31e [PATCH] cpumask: uninline any_online_cpu()
text    data     bss     dec     hex filename
before: 3605597 1363528  363328 5332453  515de5 vmlinux
after:  3605295 1363612  363200 5332107  515c8b vmlinux

218 bytes saved.

Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.

This function seems inefficient.  Can't we simply AND the two masks, then use
find_first_bit()?
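
The suggestion in the question above could look roughly like this for masks
that fit in a single word.  It is only a sketch of the idea, not a proposed
patch: sketch_any_online_cpu() and its nr_cpus parameter are hypothetical, and
returning nr_cpus when no CPU matches is a convention chosen for this example.

```c
#include <assert.h>
#include <limits.h>

/* Lowest CPU number set in both masks, or nr_cpus if there is none. */
static unsigned int sketch_any_online_cpu(unsigned long mask,
                                          unsigned long online,
                                          unsigned int nr_cpus)
{
    unsigned long both = mask & online;   /* AND the two masks once */
    unsigned int cpu;

    assert(nr_cpus <= sizeof(unsigned long) * CHAR_BIT);
    /* Stand-in for find_first_bit() on the combined mask. */
    for (cpu = 0; cpu < nr_cpus; cpu++)
        if (both & (1UL << cpu))
            return cpu;
    return nr_cpus;
}
```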

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Andrew Morton 8630282070 [PATCH] cpumask: uninline highest_possible_processor_id()
Shrinks the only caller (net/bridge/netfilter/ebtables.c) by 174 bytes.

Also, optimise highest_possible_processor_id() out of existence on
CONFIG_SMP=n.

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Andrew Morton 3d18bd74a2 [PATCH] cpumask: uninline next_cpu()
text    data     bss     dec     hex filename
before: 3488027 1322496  360128 5170651  4ee5db vmlinux
after:  3485112 1322480  359968 5167560  4ed9c8 vmlinux

2931 bytes saved

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Andrew Morton ccb46000f4 [PATCH] cpumask: uninline first_cpu()
text    data     bss     dec     hex filename
before: 3490577 1322408  360000 5172985  4eeef9 vmlinux
after:  3488027 1322496  360128 5170651  4ee5db vmlinux

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Jonathan Corbet daff89f324 [PATCH] radix-tree documentation cleanups
Documentation changes to help radix tree users avoid overrunning the tags
array.  RADIX_TREE_TAGS moves to linux/radix-tree.h and is now known as
RADIX_TREE_MAX_TAGS (Nick Piggin's idea).  Tag parameters are changed to
unsigned, and some comments are updated.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Andrew Morton 4a2f0acf0f [PATCH] kconfig: clarify memory debug options
The Kconfig text for CONFIG_DEBUG_SLAB and CONFIG_DEBUG_PAGEALLOC have always
seemed a bit confusing.  Change them to:

CONFIG_DEBUG_SLAB: "Debug slab memory allocations"
CONFIG_DEBUG_PAGEALLOC: "Debug page memory allocations"

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Al Viro 871751e25d [PATCH] slab: implement /proc/slab_allocators
Implement /proc/slab_allocators.   It produces output like:

idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4

Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew Morton <akpm@osdl.org>

Update for slab-remove-cachep-spinlock.patch

Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
Eric Sesterhenn 3481454589 BUG_ON() Conversion in lib/swiotlb.c
This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be better optimized away.
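
The shape of the conversion can be sketched in user space as follows.  The
SKETCH_* macros are stand-ins written for this example (using abort() and the
GCC/Clang __builtin_expect builtin), not the kernel's definitions, and the
checked_index_*() functions are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* User-space stand-ins for unlikely(), BUG() and BUG_ON(). */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)
#define SKETCH_BUG() \
    do { \
        fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
        abort(); \
    } while (0)
#define SKETCH_BUG_ON(cond) \
    do { \
        if (sketch_unlikely(cond)) \
            SKETCH_BUG(); \
    } while (0)

/* Before: open-coded test followed by BUG(). */
static int checked_index_old(int idx, int len)
{
    if (idx < 0 || idx >= len)
        SKETCH_BUG();
    return idx;
}

/* After: the same check as a single BUG_ON() with the branch hint built in. */
static int checked_index_new(int idx, int len)
{
    SKETCH_BUG_ON(idx < 0 || idx >= len);
    return idx;
}
```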

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24 18:47:11 +01:00
Jan Beulich 604bf5a216 [PATCH] CONFIG_UNWIND_INFO
As a foundation for reliable stack unwinding, this adds a config option
(available to all architectures except IA64 and those where the module
loader might have problems with the resulting relocations) to enable the
generation of frame unwind information.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00
Paul Jackson 3cf64b933c [PATCH] bitmap: region restructuring
Restructure the bitmap_*_region() operations, to avoid code duplication.

Also reduces binary text size by about 100 bytes (ia64 arch).  The original
Bottomley bitmap_*_region patch added about 1000 bytes of compiled kernel text
(ia64).  The Mundt multiword extension added another 600 bytes, and this
restructuring patch gets back about 100 bytes.

But the real motivation was the reduced amount of duplicated code.

Tested by Paul Mundt using <= BITS_PER_LONG as well as power of
2 aligned multiword spanning allocations.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:20 -08:00
Paul Mundt 74373c6acc [PATCH] bitmap: region multiword spanning support
Add support to the lib/bitmap.c bitmap_*_region() routines for bitmap regions
larger than one word (nbits > BITS_PER_LONG).  This removes a BUG_ON() in
lib/bitmap.c.
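
To make the multiword case concrete, here is a deliberately simple sketch of
finding and claiming a naturally aligned run of 1 << order free bits that may
span several words.  It works bit by bit for clarity and is not the lib/bitmap.c
implementation; sketch_find_free_region() and the -1 failure value are
assumptions of this example.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

static int region_test(const unsigned long *map, unsigned long pos)
{
    return (int)((map[pos / BITS_PER_LONG] >> (pos % BITS_PER_LONG)) & 1UL);
}

static void region_set(unsigned long *map, unsigned long pos)
{
    map[pos / BITS_PER_LONG] |= 1UL << (pos % BITS_PER_LONG);
}

/*
 * Find a free, naturally aligned run of (1 << order) bits among the first
 * 'bits' bits, mark it allocated and return its start, or -1 if none exists.
 */
static long sketch_find_free_region(unsigned long *map, unsigned long bits,
                                    unsigned int order)
{
    unsigned long len, start, i;

    assert(order < BITS_PER_LONG - 1);
    len = 1UL << order;
    for (start = 0; start + len <= bits; start += len) {
        for (i = 0; i < len; i++)
            if (region_test(map, start + i))
                break;
        if (i < len)
            continue;           /* something in this run is busy */
        for (i = 0; i < len; i++)
            region_set(map, start + i);
        return (long)start;
    }
    return -1;
}
```

Because the run length may exceed BITS_PER_LONG, a region can cover several
consecutive words, which is the case the removed BUG_ON() used to reject.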

I have an updated store queue API for SH that is currently using this with
relative success, and at first glance, it seems like this could be useful for
x86 (arch/i386/kernel/pci-dma.c) as well.  Particularly for anything using
dma_declare_coherent_memory() on large areas and that attempts to allocate
large buffers from that space.

Paul Jackson also did some cleanup to this patch.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:20 -08:00
Paul Jackson 87e2480258 [PATCH] bitmap: region cleanup
Paul Mundt <lethal@linux-sh.org> says:

This patch set implements a number of patches to clean up and restructure the
bitmap region code, in addition to extending the interface to support
multiword spanning allocations.

The current implementation (before this patch set) is limited by only being
able to allocate pages <= BITS_PER_LONG, as noted by the strategically
positioned BUG_ON() at lib/bitmap.c:752:

	/* We don't do regions of pages > BITS_PER_LONG.  The
	 * algorithm would be a simple look for multiple zeros in the
	 * array, but there's no driver today that needs this.  If you
	 * trip this BUG(), you get to code it... */
	BUG_ON(pages > BITS_PER_LONG);

As I seem to have been the first person to trigger this, the result ends up
being the following patch set with the help of Paul Jackson.

The final patch in the series eliminates quite a bit of code duplication, so
the bitmap code size ends up being smaller than the current implementation as
an added bonus.

After these are applied, it should already be possible to do multiword
allocations with dma_alloc_coherent() out of ranges established by
dma_declare_coherent_memory() on x86 without having to change any of the code,
and the SH store queue API will follow up on this as the other user that needs
support for this.

This patch:

Some code cleanup on the lib/bitmap.c bitmap_*_region() routines:

 * spacing
 * variable names
 * comments

Has no change to code function.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:20 -08:00
Arjan van de Ven 97d1f15b7e [PATCH] sem2mutex: kernel/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:10 -08:00
Linus Torvalds 2e6e33bab6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (78 commits)
  [PATCH] powerpc: Add FSL SEC node to documentation
  [PATCH] macintosh: tidy-up driver_register() return values
  [PATCH] powerpc: tidy-up of_register_driver()/driver_register() return values
  [PATCH] powerpc: via-pmu warning fix
  [PATCH] macintosh: cleanup the use of i2c headers
  [PATCH] powerpc: dont allow old RTC to be selected
  [PATCH] powerpc: make powerbook_sleep_grackle static
  [PATCH] powerpc: Fix warning in add_memory
  [PATCH] powerpc: update mailing list addresses
  [PATCH] powerpc: Remove calculation of io hole
  [PATCH] powerpc: iseries: Add bootargs to /chosen
  [PATCH] powerpc: iseries: Add /system-id, /model and /compatible
  [PATCH] powerpc: Add strne2a() to convert a string from EBCDIC to ASCII
  [PATCH] powerpc: iseries: Make more stuff static in platforms/iseries/mf.c
  [PATCH] powerpc: iseries: Remove pointless iSeries_(restart|power_off|halt)
  [PATCH] powerpc: iseries: mf related cleanups
  [PATCH] powerpc: Replace platform_is_lpar() with a firmware feature
  [PATCH] powerpc: trivial: Cleanup whitespace in cputable.h
  [PATCH] powerpc: Remove unused iommu_off logic from pSeries_init_early()
  [PATCH] powerpc: Unconfuse htab_bolt_mapping() callers
  ...
2006-03-22 22:20:46 -08:00
Andrew Morton f4a641d66c [PATCH] multiple exports of strpbrk
Sam's tree includes a new check, which found that we're exporting strpbrk()
multiple times.

It seems that the convention is that this is exported from the arch files, so
remove the lib/string.c export.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:53:56 -08:00
Jun'ichi Nomura 7423172a50 [PATCH] kobject_add_dir
Adding kobject_add_dir() function which creates a subdirectory
for a given kobject.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:59 -08:00
Greg Kroah-Hartman dcd0da0021 [PATCH] Kobject: provide better warning messages when people do stupid things
Now that kobject_add() is used more than kobject_register() the kernel
wasn't always letting people know that they were doing something wrong.
This change fixes this.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:59 -08:00
Eric Dumazet 8b5536bbee [PATCH] kref: avoid an atomic operation in kref_put()
Avoid an atomic operation in kref_put() when the last reference is
dropped. On most platforms, atomic_read() is a plain read of the counter
and involves no atomic at all.
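
The idea can be sketched with C11 atomics as follows.  This is a user-space
approximation under stated assumptions, not the kernel code: struct sketch_kref,
sketch_kref_put() and the counting release callback are invented for this
example, and a relaxed load stands in for atomic_read().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_kref {
    atomic_int refcount;
};

/* Demo release callback: just counts how often it was invoked. */
static int sketch_released;

static void sketch_count_release(struct sketch_kref *kref)
{
    (void)kref;
    sketch_released++;
}

/* Drop one reference; returns 1 if the object was released, else 0. */
static int sketch_kref_put(struct sketch_kref *kref,
                           void (*release)(struct sketch_kref *))
{
    assert(release != NULL);
    /*
     * If the count is already 1, the caller holds the last reference, so a
     * plain read is enough and the atomic decrement can be skipped.
     */
    if (atomic_load_explicit(&kref->refcount, memory_order_relaxed) == 1 ||
        atomic_fetch_sub(&kref->refcount, 1) == 1) {
        release(kref);
        return 1;
    }
    return 0;
}
```

Note that in the fast path the counter is left at 1; that is harmless in this
sketch because the object is being released anyway.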

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:57 -08:00
Jun'ichi Nomura 51107301b6 [PATCH] kobject: fix build error if CONFIG_SYSFS=n
Moving uevent_seqnum and uevent_helper to kobject_uevent.c
because they are used even if CONFIG_SYSFS=n
while kernel/ksysfs.c is built only if CONFIG_SYSFS=y.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:57 -08:00
Paul Mackerras a00428f5b1 Merge ../powerpc-merge
2006-02-24 14:05:47 +11:00
Greg Kroah-Hartman fa675765af Revert mount/umount uevent removal
This change reverts the 033b96fd30 commit
from Kay Sievers that removed the mount/umount uevents from the kernel.
Some older versions of HAL still depend on these events to detect when a
new device has been mounted.  These events are not correctly emitted,
and are broken by design, and so, should not be relied upon by any
future program.  Instead, the /proc/mounts file should be polled to
properly detect this kind of event.

A feature-removal-schedule.txt entry has been added, noting when this
interface will be removed from the kernel.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-02-22 09:39:02 -08:00
Al Viro ad6b97fc92 [PATCH] iomap_copy fallout (m68k)
added __raw_writel(), sanitized include order in iomap_copy.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-18 16:30:40 -05:00
NeilBrown 90f9dd8f72 [PATCH] Fix over-zealous tag clearing in radix_tree_delete
If a tag is set for a node being deleted from a radix_tree, then that
tag gets cleared from the parent of the node, even if it is set for some
siblings of the node being deleted.

This patch changes the logic to include a test for any_tag_set similar
to the logic a little further down.  Care is taken to ensure that
'nr_cleared_tags' remains equal to the number of entries in the 'tags'
array which are set to '0' (which means that this tag is not set in the
tree below pathp->node, and should be cleared at pathp->node and
possibly above).

[ Nick says: "Linus FYI, I was able to modify the radix tree test
  harness to catch the bug and can no longer trigger it after the fix.
  Resulting code passes all other harness tests as well of course." ]

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-16 08:45:50 -08:00
Jon Mason 2ef9481e66 [PATCH] powerpc: trivial: modify comments to refer to new location of files
This patch removes all self references and fixes references to files
in the now defunct arch/ppc64 tree.  I think this accomplishes
everything wanted, though there might be a few references I missed.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-10 16:53:51 +11:00
Linus Torvalds 92118c739d Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
2006-02-07 16:29:55 -08:00
Ingo Molnar e0a6029634 [PATCH] Fix spinlock debugging delays to not time out too early
The spinlock-debug wait-loop was using loops_per_jiffy to detect too long
spinlock waits - but on fast CPUs this led to a way too fast timeout and false
messages.

The fix is to include a __delay(1) call in the loop, to correctly approximate
the intended delay timeout of 1 second.  The code assumes that every
architecture implements __delay(1) to last around 1/(loops_per_jiffy*HZ)
seconds.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07 16:12:33 -08:00
Benjamin Herrenschmidt d87499ed1a [PATCH] Fix uevent buffer overflow in input layer
The buffer used for kobject uevent is too small for some of the events generated
by the input layer. Bump it to 2k.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-02-06 12:17:18 -08:00
Chuck Ebbert b365b3daf2 [PATCH] kobject: don't oops on null kobject.name
kobject_get_path() will oops if one of the component names is
NULL.  Fix that by returning NULL instead of oopsing.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-02-06 12:17:17 -08:00
Greg Kroah-Hartman c171fef5c8 [PATCH] kobject_add() must have a valid name in order to succeed.
So we might as well check to verify this, and let the user know that
something is wrong if they didn't do it correctly, instead of oopsing
later on in kobject_get_name() or somewhere else.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-02-06 12:17:17 -08:00
Linus Torvalds d6c8f6aaa1 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2006-02-03 08:33:06 -08:00
Peter Williams f0c00257d6 [PATCH] lib: Fix bug in int_sqrt() for 64 bit longs
The implementation of int_sqrt() assumes that longs have 32 bits.  On
systems that have 64 bit longs this will result in gross errors when the
argument to the function is greater than 2^32 - 1.  I doubt
whether any such use is currently made of int_sqrt() but the attached patch
fixes the problem anyway.
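
A width-independent integer square root can be sketched as follows.  The
assumption here is that the essential point is deriving the starting power of
four from the operand width instead of hard-coding the 32-bit value; the sketch
uses uint64_t so that it behaves the same on every platform, and
sketch_int_sqrt() is a name made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of the square root of x, computed digit by digit in base 4. */
static uint64_t sketch_int_sqrt(uint64_t x)
{
    uint64_t op = x;
    uint64_t res = 0;
    /*
     * Highest power of four that fits in the type.  Starting from a
     * constant such as 1 << 30 would only be right for 32-bit operands.
     */
    uint64_t one = UINT64_C(1) << 62;

    while (one > op)
        one >>= 2;

    while (one != 0) {
        if (op >= res + one) {
            op -= res + one;
            res += 2 * one;
        }
        res >>= 1;
        one >>= 2;
    }
    /* The root of a 64-bit value always fits in 32 bits. */
    assert(res <= UINT32_MAX);
    return res;
}
```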

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-03 08:32:08 -08:00
Pablo Neira Ayuso 3f330317ab [TEXTSEARCH]: Fix broken good shift array calculation in Boyer-Moore
The current logic does not correctly calculate the good shift array.
Let x be the pattern that is being searched. Let y be the block of data.
The good shift array aligns the segment:

x[i+1 ... m-1] = y[i+j+1 ... j+m-1]

with its rightmost occurrence in x that fulfils x[i] neq y[i+j].

In the previous version, the good shift array for the pattern ANPANMAN is:
[1, 8, 3, 8, 8, 8, 8, 8]
and should be:
[1, 8, 3, 6, 6, 6, 6, 6]

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 17:15:41 -08:00
Bryan O'Sullivan c27a0d75b3 [PATCH] Introduce __iowrite32_copy
This arch-independent routine copies data to a memory-mapped I/O region,
using 32-bit accesses.  The naming is double-underscored to make it clear
that it does not guarantee write ordering, nor does it perform a memory
barrier afterwards; the kernel doc also explicitly states this.  This style
of access is required by some devices.
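
In user-space terms, the routine described above amounts to something like the
sketch below.  The names are stand-ins for this example: sketch_raw_writel()
plays the role of a barrier-free 32-bit MMIO store, a volatile pointer plays
the role of the I/O mapping, and count is assumed to be a number of 32-bit
words rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a raw 32-bit MMIO store: no ordering guarantee, no barrier. */
static inline void sketch_raw_writel(uint32_t value, volatile uint32_t *addr)
{
    *addr = value;
}

/* Copy 'count' 32-bit words from 'from' to the MMIO-like region 'to'. */
static void sketch_iowrite32_copy(volatile void *to, const void *from,
                                  size_t count)
{
    volatile uint32_t *dst = to;
    const uint32_t *src = from;
    const uint32_t *end = src + count;

    assert(to != NULL && from != NULL);
    while (src < end)
        sketch_raw_writel(*src++, dst++);
}
```

A caller that needs the writes to be ordered or flushed would have to issue
its own barrier after the copy.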

This change also introduces include/linux/io.h, at Andrew's suggestion.  It
only has one occupant at the moment, but is a logical destination for
oft-replicated contents of include/asm-*/{io,iomap}.h to migrate to.

Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:13 -08:00
Ingo Molnar a9df3d0f31 [PATCH] When CONFIG_CC_OPTIMIZE_FOR_SIZE, allow gcc4 to control inlining
If optimizing for size (CONFIG_CC_OPTIMIZE_FOR_SIZE), allow gcc4 compilers
to decide what to inline and what not - instead of the kernel forcing gcc
to inline all the time.  This requires several places that must be
inlined to be marked as such; previous patches in this series do that.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14 18:27:16 -08:00
Muli Ben-Yehuda 17a941d854 [PATCH] x86_64: Use function pointers to call DMA mapping functions
AK: I hacked Muli's original patch a lot and there were a lot
of changes - all bugs are probably to blame on me now.
There were also some changes in the fall back behaviour
for swiotlb - in particular it doesn't try to use GFP_DMA
now anymore. Also all DMA mapping operations use the
same core dma_alloc_coherent code with proper fallbacks now.
And various other changes and cleanups.

Known problems: iommu=force swiotlb=force together breaks
                needs more testing.

This patch cleans up x86_64's DMA mapping dispatching code. Right now
we have three possible IOMMU types: AGP GART, swiotlb and nommu, and
in the future we will also have Xen's x86_64 swiotlb and other HW
IOMMUs for x86_64. In order to support all of them cleanly, this
patch:

- introduces a struct dma_mapping_ops with function pointers for each
  of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb
  (software IOMMU) and nommu (no IOMMU).

- gets rid of:

  if (swiotlb)
      return swiotlb_xxx();

- PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set
This makes swiotlb faster by avoiding double copying in some cases.
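
The dispatch pattern described in the list above can be sketched like this.
Everything in it is hypothetical and heavily trimmed: the struct has a single
operation, the "swiotlb" variant merely copies into a static bounce buffer, and
none of the sketch_* names come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, one-entry version of an ops table. */
struct sketch_dma_ops {
    const char *name;
    uintptr_t (*map_single)(void *ptr, size_t size);
};

/* No IOMMU: the "bus address" is just the buffer address. */
static uintptr_t sketch_nommu_map_single(void *ptr, size_t size)
{
    (void)size;
    return (uintptr_t)ptr;
}

/* Software bounce buffer: copy the data and hand out the bounce address. */
static unsigned char sketch_bounce[64];

static uintptr_t sketch_swiotlb_map_single(void *ptr, size_t size)
{
    assert(size <= sizeof(sketch_bounce));
    memcpy(sketch_bounce, ptr, size);
    return (uintptr_t)sketch_bounce;
}

static const struct sketch_dma_ops sketch_nommu_ops = {
    "nommu", sketch_nommu_map_single
};
static const struct sketch_dma_ops sketch_swiotlb_ops = {
    "swiotlb", sketch_swiotlb_map_single
};

/* Selected once at setup time instead of testing a flag on every call. */
static const struct sketch_dma_ops *sketch_active_ops = &sketch_nommu_ops;

static uintptr_t sketch_dma_map_single(void *ptr, size_t size)
{
    return sketch_active_ops->map_single(ptr, size);
}
```

Callers go through sketch_dma_map_single() only, so adding another backend
means adding another ops table rather than another if () branch.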

Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org>
Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 19:04:55 -08:00
Adrian Bunk f346f4b373 [PATCH] let MAGIC_SYSRQ no longer depend on DEBUG_KERNEL
I know several people using MAGIC_SYSRQ not for kernel debugging but for
trying to do a halfway normal shutdown in case of problems.

Since there's no technical reason why MAGIC_SYSRQ would have to depend on
DEBUG_KERNEL, I'm therefore suggesting to drop this dependency.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:02:02 -08:00
Adrian Bunk 87c2ce3b93 [PATCH] lib/zlib*: cleanups
This patch contains the following possible cleanups:
- #if 0 the following unused functions:
  - zlib_deflate/deflate.c: zlib_deflateSetDictionary
  - zlib_deflate/deflate.c: zlib_deflateParams
  - zlib_deflate/deflate.c: zlib_deflateCopy
  - zlib_inflate/infblock.c: zlib_inflate_set_dictionary
  - zlib_inflate/infblock.c: zlib_inflate_blocks_sync_point
  - zlib_inflate/inflate_sync.c: zlib_inflateSync
  - zlib_inflate/inflate_sync.c: zlib_inflateSyncPoint
- remove the following unneeded EXPORT_SYMBOL's:
  - zlib_deflate/deflate_syms.c: zlib_deflateCopy
  - zlib_deflate/deflate_syms.c: zlib_deflateParams
  - zlib_inflate/inflate_syms.c: zlib_inflateSync
  - zlib_inflate/inflate_syms.c: zlib_inflateSyncPoint

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:57 -08:00
Dave Jones 51989b9ffe [PATCH] printk levels for spinlock debug
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:24 -08:00
Ingo Molnar 408894ee4d [PATCH] mutex subsystem, debugging code
mutex implementation - add debugging code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
2006-01-09 15:59:20 -08:00
Nick Piggin a57004e1af [PATCH] atomic: dec_and_lock use atomic primitives
Convert atomic_dec_and_lock to use new atomic primitives.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:48 -08:00
Paul Jackson 96b7f34143 [PATCH] cpuset: better bitmap remap defaults
Fix the default behaviour for the remap operators in bitmap, cpumask and
nodemask.

As previously submitted, the pair of masks <A, B> defined a map of the
positions of the set bits in A to the corresponding bits in B.  This is still
true.

The issue is how to map the other positions, corresponding to the unset (0)
bits in A.  As previously submitted, they were all mapped to the first set bit
position in B, a constant map.

When I tried to code per-vma mempolicy rebinding using these remap operators,
I realized this was wrong.

This patch changes the default to map all the unset bit positions in A to the
same positions in B, the identity map.

For example, if A has bits 4-7 set, and B has bits 9-12 set, then the map
defined by the pair <A, B> maps each bit position in the first 32 bits as
follows:

	0 ==> 0
	  ...
	3 ==> 3
	4 ==> 9
	  ...
	7 ==> 12
	8 ==> 8
	9 ==> 9
	  ...
	31 ==> 31

This now corresponds to the typical behaviour desired when migrating pages and
policies from one cpuset to another.

The pages on nodes within the original cpuset, and the references in memory
policies to nodes within the original cpuset, are migrated to the
corresponding cpuset-relative nodes in the destination cpuset.  Other pages
and node references are left untouched.
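
The mapping in the example above can be reproduced with a small sketch for
32-bit masks.  It is an illustration rather than the lib/bitmap.c code:
sketch_bitremap() and sketch_weight() are hypothetical names, and wrapping
around when B has fewer set bits than A is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in w. */
static unsigned int sketch_weight(uint32_t w)
{
    unsigned int n = 0;

    for (; w != 0; w >>= 1)
        n += w & 1u;
    return n;
}

/* Map bit position oldbit through the pair <old_map, new_map>. */
static unsigned int sketch_bitremap(unsigned int oldbit, uint32_t old_map,
                                    uint32_t new_map)
{
    unsigned int w = sketch_weight(new_map);
    unsigned int n, bit;

    assert(oldbit < 32);
    /* Unset in old_map, or nothing to map onto: identity map. */
    if (w == 0 || !(old_map & (UINT32_C(1) << oldbit)))
        return oldbit;

    /* n = ordinal of oldbit among the set bits of old_map. */
    n = sketch_weight(old_map & ((UINT32_C(1) << oldbit) - 1)) % w;

    /* Return the position of the n-th (0-based) set bit of new_map. */
    for (bit = 0; bit < 32; bit++) {
        if ((new_map >> bit) & 1u) {
            if (n == 0)
                return bit;
            n--;
        }
    }
    return oldbit;  /* not reached when w > 0 */
}
```

With old_map = 0xf0 (bits 4-7) and new_map = 0x1e00 (bits 9-12), positions
4..7 map to 9..12 and every other position maps to itself.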

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:42 -08:00
Ingo Molnar 50dd26ba09 [PATCH] DEBUG_SLAB depends on SLAB
Make DEBUG_SLAB depend on SLAB.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:41 -08:00
Nick Piggin a5f51c9667 [PATCH] radix-tree: reduce tree height upon partial truncation
Shrink the height of a radix tree when it is partially truncated - we only do
shrinkage of full truncation at present.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:41 -08:00