Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method drivers currently use to synchronize multicast lists is not
very pretty:
- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list
This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).
These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.
For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.
When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.
Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes an obsolete method scsi_device_cancel which isn't being used
anywhere in the kernel.
Signed-off-by: Priyanka Gupta <priyankag@google.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p. This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (122 commits)
sunrpc: drop BKL around wrap and unwrap
NFSv4: Make sure unlock is really an unlock when cancelling a lock
NLM: fix source address of callback to client
SUNRPC client: add interface for binding to a local address
SUNRPC server: record the destination address of a request
SUNRPC: cleanup transport creation argument passing
NFSv4: Make the NFS state model work with the nosharedcache mount option
NFS: Error when mounting the same filesystem with different options
NFS: Add the mount option "nosharecache"
NFS: Add support for mounting NFSv4 file systems with string options
NFS: Add final pieces to support in-kernel mount option parsing
NFS: Introduce generic mount client API
NFS: Add enums and match tables for mount option parsing
NFS: Improve debugging output in NFS in-kernel mount client
NFS: Clean up in-kernel NFS mount
NFS: Remake nfsroot_mount as a permanent part of NFS client
SUNRPC: Add a convenient default for the hostname when calling rpc_create()
SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name
SUNRPC: Rename rpcb_getport_external routine
SUNRPC: Allow rpcbind requests to be interrupted by a signal.
...
Add the needed constants and defines to activate this for the IA64
platform.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits)
ioatdma: add the unisys "i/oat" pci vendor/device id
ARM: Add drivers/dma to arch/arm/Kconfig
iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver
iop13xx: surface the iop13xx adma units to the iop-adma driver
dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
md: remove raid5 compute_block and compute_parity5
md: handle_stripe5 - request io processing in raid5_run_ops
md: handle_stripe5 - add request/completion logic for async expand ops
md: handle_stripe5 - add request/completion logic for async read ops
md: handle_stripe5 - add request/completion logic for async check ops
md: handle_stripe5 - add request/completion logic for async compute ops
md: handle_stripe5 - add request/completion logic for async write ops
md: common infrastructure for running operations with raid5_run_ops
md: raid5_run_ops - run stripe operations outside sh->lock
raid5: replace custom debug PRINTKs with standard pr_debug
raid5: refactor handle_stripe5 and handle_stripe6 (v3)
async_tx: add the async_tx api
xor: make 'xor_blocks' a library routine for use with async_tx
dmaengine: make clients responsible for managing channels
dmaengine: refactor dmaengine around dma_async_tx_descriptor
...
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Workaround for a sparse warning in include/asm-mips/mach-tx4927/ioremap.h
[MIPS] Make show_code static and add __user tag
[MIPS] Workaround for a sparse warning in include/asm-mips/compat.h
[MIPS] Add some __user tags
[MIPS] math-emu minor cleanup
[MIPS] Kill CONFIG_TX4927BUG_WORKAROUND
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_FB_XPERT98
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1000_SRC_CLK
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1000_USE32K
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1XXX_PSC_SPI
[CHAR] Delete leftovers of old Alchemy UART driver
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
[PATCH] sched: small topology.h cleanup
[PATCH] sched: fix show_task()/show_tasks() output
[PATCH] sched: remove stale version info from kernel/sched_debug.c
[PATCH] sched: allow larger granularity
[PATCH] sched: fix prio_to_wmult[] for nice 1
[ I re-did the commits to get rid of some bogus merge commit that
Ingo had. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
trivial cleanup: LOCAL_DISTANCE and REMOTE_DISTANCE are only used in
topology.h and inside an #ifndef section - limit their existence to
that #ifndef.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cast to a __user pointer via "unsigned long" to get rid of this warning:
include2/asm/compat.h:135:10: warning: cast adds address space to expression (<asn:1>)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: John Magolan <john.magolan@unisys.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Adds the platform device definitions and the architecture specific support
routines (i.e. register initialization and descriptor formats) for the
iop-adma driver.
Changelog:
* add support for > 1k zero sum buffer sizes
* added dma/aau platform devices to iq80321 and iq80332 setup
* fixed the calculation in iop_desc_is_aligned
* support xor buffer sizes larger than 16MB
* fix places where software descriptors are assumed to be contiguous, only
hardware descriptors are contiguous for up to a PAGE_SIZE buffer size
* convert to async_tx
* add interrupt support
* add platform devices for 80219 boards
* do not call platform register macros in driver code
* remove switch() statements for compatible register offsets/layouts
* change over to bitmap based capabilities
* remove unnecessary ARM assembly statement
* checkpatch.pl fixes
* gpl v2 only correction
* phys move to dma_async_tx_descriptor
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Adds the platform device definitions and the architecture specific
support routines (i.e. register initialization and descriptor formats) for the
iop-adma driver.
Changelog:
* added 'descriptor pool size' to the platform data
* add base support for buffer sizes larger than 16MB (hw max)
* build error fix from Kirill A. Shutemov
* rebase for async_tx changes
* add interrupt support
* do not call platform register macros in driver code
* remove unnecessary ARM assembly statement
* checkpatch.pl fixes
* gpl v2 only correction
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The Intel(R) IOP series of i/o processors integrate an Xscale core with
raid acceleration engines. The capabilities per platform are:
iop219:
(2) copy engines
iop321:
(2) copy engines
(1) xor and block fill engine
iop33x:
(2) copy and crc32c engines
(1) xor, xor zero sum, pq, pq zero sum, and block fill engine
iop34x (iop13xx):
(2) copy, crc32c, xor, xor zero sum, and block fill engines
(1) copy, crc32c, xor, xor zero sum, pq, pq zero sum, and block fill engine
The driver supports the features of the async_tx api:
* asynchronous notification of operation completion
* implicit (interupt triggered) handling of inter-channel transaction
dependencies
The driver adapts to the platform it is running by two methods.
1/ #include <asm/arch/adma.h> which defines the hardware specific
iop_chan_* and iop_desc_* routines as a series of static inline
functions
2/ The private platform data attached to the platform_device defines the
capabilities of the channels
20070626: Callbacks are run in a tasklet. Given the recent discussion on
LKML about killing tasklets in favor of workqueues I did a quick conversion
of the driver. Raid5 resync performance dropped from 50MB/s to 30MB/s, so
the tasklet implementation remains until a generic softirq interface is
available.
Changelog:
* fixed a slot allocation bug in do_iop13xx_adma_xor that caused too few
slots to be requested eventually leading to data corruption
* enabled the slot allocation routine to attempt to free slots before
returning -ENOMEM
* switched the cleanup routine to solely use the software chain and the
status register to determine if a descriptor is complete. This is
necessary to support other IOP engines that do not have status writeback
capability
* make the driver iop generic
* modified the allocation routines to understand allocating a group of
slots for a single operation
* added a null xor initialization operation for the xor only channel on
iop3xx
* support xor operations on buffers larger than the hardware maximum
* split the do_* routines into separate prep, src/dest set, submit stages
* added async_tx support (dependent operations initiation at cleanup time)
* simplified group handling
* added interrupt support (callbacks via tasklets)
* brought the pending depth inline with ioat (i.e. 4 descriptors)
* drop dma mapping methods, suggested by Chris Leech
* don't use inline in C files, Adrian Bunk
* remove static tasklet declarations
* make iop_adma_alloc_slots easier to read and remove chances for a
corrupted descriptor chain
* fix locking bug in iop_adma_alloc_chan_resources, Benjamin Herrenschmidt
* convert capabilities over to dma_cap_mask_t
* fixup sparse warnings
* add descriptor flush before iop_chan_enable
* checkpatch.pl fixes
* gpl v2 only correction
* move set_src, set_dest, submit to async_tx methods
* move group_list and phys to async_tx
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When a read bio is attached to the stripe and the corresponding block is
marked R5_UPTODATE, then a read (biofill) operation is scheduled to copy
the data from the stripe cache to the bio buffer. handle_stripe flags the
blocks to be operated on with the R5_Wantfill flag. If new read requests
arrive while raid5_run_ops is running they will not be handled until
handle_stripe is scheduled to run again.
Changelog:
* cleanup to_read and to_fill accounting
* do not fail reads that have reached the cache
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
handle_stripe will compute a block when a backing disk has failed, or when
it determines it can save a disk read by computing the block from all the
other up-to-date blocks.
Previously a block would be computed under the lock and subsequent logic in
handle_stripe could use the newly up-to-date block. With the raid5_run_ops
implementation the compute operation is carried out a later time outside
the lock. To preserve the old functionality we take advantage of the
dependency chain feature of async_tx to flag the block as R5_Wantcompute
and then let other parts of handle_stripe operate on the block as if it
were up-to-date. raid5_run_ops guarantees that the block will be ready
before it is used in another operation.
However, this only works in cases where the compute and the dependent
operation are scheduled at the same time. If a previous call to
handle_stripe sets the R5_Wantcompute flag there is no facility to pass the
async_tx dependency chain across successive calls to raid5_run_ops. The
req_compute variable protects against this case.
Changelog:
* remove the req_compute BUG_ON
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
When the raid acceleration work was proposed, Neil laid out the following
attack plan:
1/ move the xor and copy operations outside spin_lock(&sh->lock)
2/ find/implement an asynchronous offload api
The raid5_run_ops routine uses the asynchronous offload api (async_tx) and
the stripe_operations member of a stripe_head to carry out xor+copy
operations asynchronously, outside the lock.
To perform operations outside the lock a new set of state flags is needed
to track new requests, in-flight requests, and completed requests. In this
new model handle_stripe is tasked with scanning the stripe_head for work,
updating the stripe_operations structure, and finally dropping the lock and
calling raid5_run_ops for processing. The following flags outline the
requests that handle_stripe can make of raid5_run_ops:
STRIPE_OP_BIOFILL
- copy data into request buffers to satisfy a read request
STRIPE_OP_COMPUTE_BLK
- generate a missing block in the cache from the other blocks
STRIPE_OP_PREXOR
- subtract existing data as part of the read-modify-write process
STRIPE_OP_BIODRAIN
- copy data out of request buffers to satisfy a write request
STRIPE_OP_POSTXOR
- recalculate parity for new data that has entered the cache
STRIPE_OP_CHECK
- verify that the parity is correct
STRIPE_OP_IO
- submit i/o to the member disks (note this was already performed outside
the stripe lock, but it made sense to add it as an operation type
The flow is:
1/ handle_stripe sets STRIPE_OP_* in sh->ops.pending
2/ raid5_run_ops reads sh->ops.pending, sets sh->ops.ack, and submits the
operation to the async_tx api
3/ async_tx triggers the completion callback routine to set
sh->ops.complete and release the stripe
4/ handle_stripe runs again to finish the operation and optionally submit
new operations that were previously blocked
Note this patch just defines raid5_run_ops, subsequent commits (one per
major operation type) modify handle_stripe to take advantage of this
routine.
Changelog:
* removed ops_complete_biodrain in favor of ops_complete_postxor and
ops_complete_write.
* removed the raid5_run_ops workqueue
* call bi_end_io for reads in ops_complete_biofill, saves a call to
handle_stripe
* explicitly handle the 2-disk raid5 case (xor becomes memcpy), Neil Brown
* fix race between async engines and bi_end_io call for reads, Neil Brown
* remove unnecessary spin_lock from ops_complete_biofill
* remove test_and_set/test_and_clear BUG_ONs, Neil Brown
* remove explicit interrupt handling for channel switching, this feature
was absorbed (i.e. it is now implicit) by the async_tx api
* use return_io in ops_complete_biofill
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
handle_stripe5 and handle_stripe6 have very deep logic paths handling the
various states of a stripe_head. By introducing the 'stripe_head_state'
and 'r6_state' objects, large portions of the logic can be moved to
sub-routines.
'struct stripe_head_state' consumes all of the automatic variables that previously
stood alone in handle_stripe5,6. 'struct r6_state' contains the handle_stripe6
specific variables like p_failed and q_failed.
One of the nice side effects of the 'stripe_head_state' change is that it
allows for further reductions in code duplication between raid5 and raid6.
The following new routines are shared between raid5 and raid6:
handle_completed_write_requests
handle_requests_to_failed_array
handle_stripe_expansion
Changes:
* v2: fixed 'conf->raid_disk-1' for the raid6 'handle_stripe_expansion' path
* v3: removed the unused 'dirty' field from struct stripe_head_state
* v3: coalesced open coded bi_end_io routines into return_io()
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
The async_tx api provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies. It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations. Code
that is written to the api can optimize for asynchronous operation and the
api will fit the chain of operations to the available offload resources.
I imagine that any piece of ADMA hardware would register with the
'async_*' subsystem, and a call to async_X would be routed as
appropriate, or be run in-line. - Neil Brown
async_tx exploits the capabilities of struct dma_async_tx_descriptor to
provide an api of the following general format:
struct dma_async_tx_descriptor *
async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
dma_async_tx_callback cb_fn, void *cb_param)
{
struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
struct dma_device *device = chan ? chan->device : NULL;
int int_en = cb_fn ? 1 : 0;
struct dma_async_tx_descriptor *tx = device ?
device->device_prep_dma_<operation>(chan, len, int_en) : NULL;
if (tx) { /* run <operation> asynchronously */
...
tx->tx_set_dest(addr, tx, index);
...
tx->tx_set_src(addr, tx, index);
...
async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else { /* run <operation> synchronously */
...
<operation>
...
async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
}
return tx;
}
async_tx_find_channel() returns a capable channel from its pool. The
channel pool is organized as a per-cpu array of channel pointers. The
async_tx_rebalance() routine is tasked with managing these arrays. In the
uniprocessor case async_tx_rebalance() tries to spread responsibility
evenly over channels of similar capabilities. For example if there are two
copy+xor channels, one will handle copy operations and the other will
handle xor. In the SMP case async_tx_rebalance() attempts to spread the
operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
channel0 while cpu1 gets copy channel 1 and xor channel 1. When a
dependency is specified async_tx_find_channel defaults to keeping the
operation on the same channel. A xor->copy->xor chain will stay on one
channel if it supports both operation types, otherwise the transaction will
transition between a copy and a xor resource.
Currently the raid5 implementation in the MD raid456 driver has been
converted to the async_tx api. A driver for the offload engines on the
Intel Xscale series of I/O processors, iop-adma, is provided in a later
commit. With the iop-adma driver and async_tx, raid456 is able to offload
copy, xor, and xor-zero-sum operations to hardware engines.
On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
improvement) and sequential reads to a degraded array (40 - 55%
improvement). For the other cases performance was roughly equal, +/- a few
percentage points. On a x86-smp platform the performance of the async_tx
implementation (in synchronous mode) was also +/- a few percentage points
of the original implementation. According to 'top' on iop342 CPU
utilization drops from ~50% to ~15% during a 'resync' while the speed
according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.
The tiobench command line used for testing was: tiobench --size 2048
--block 4096 --block 131072 --dir /mnt/raid --numruns 5
* iop342 had 1GB of memory available
Details:
* if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
async_tx_find_channel a static inline routine that always returns NULL
* when a callback is specified for a given transaction an interrupt will
fire at operation completion time and the callback will occur in a
tasklet. if the the channel does not support interrupts then a live
polling wait will be performed
* the api is written as a dmaengine client that requests all available
channels
* In support of dependencies the api implicitly schedules channel-switch
interrupts. The interrupt triggers the cleanup tasklet which causes
pending operations to be scheduled on the next channel
* Xor engines treat an xor destination address differently than a software
xor routine. To the software routine the destination address is an implied
source, whereas engines treat it as a write-only destination. This patch
modifies the xor_blocks routine to take a an explicit destination address
to mirror the hardware.
Changelog:
* fixed a leftover debug print
* don't allow callbacks in async_interrupt_cond
* fixed xor_block changes
* fixed usage of ASYNC_TX_XOR_DROP_DEST
* drop dma mapping methods, suggested by Chris Leech
* printk warning fixups from Andrew Morton
* don't use inline in C files, Adrian Bunk
* select the API when MD is enabled
* BUG_ON xor source counts <= 1
* implicitly handle hardware concerns like channel switching and
interrupts, Neil Brown
* remove the per operation type list, and distribute operation capabilities
evenly amongst the available channels
* simplify async_tx_find_channel to optimize the fast path
* introduce the channel_table_initialized flag to prevent early calls to
the api
* reorganize the code to mimic crypto
* include mm.h as not all archs include it in dma-mapping.h
* make the Kconfig options non-user visible, Adrian Bunk
* move async_tx under crypto since it is meant as 'core' functionality, and
the two may share algorithms in the future
* move large inline functions into c files
* checkpatch.pl fixes
* gpl v2 only correction
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
The async_tx api tries to use a dma engine for an operation, but will fall
back to an optimized software routine otherwise. Xor support is
implemented using the raid5 xor routines. For organizational purposes this
routine is moved to a common area.
The following fixes are also made:
* rename xor_block => xor_blocks, suggested by Adrian Bunk
* ensure that xor.o initializes before md.o in the built-in case
* checkpatch.pl fixes
* mark calibrate_xor_blocks __init, Adrian Bunk
Cc: Adrian Bunk <bunk@stusta.de>
Cc: NeilBrown <neilb@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The current implementation assumes that a channel will only be used by one
client at a time. In order to enable channel sharing the dmaengine core is
changed to a model where clients subscribe to channel-available-events.
Instead of tracking how many channels a client wants and how many it has
received the core just broadcasts the available channels and lets the
clients optionally take a reference. The core learns about the clients'
needs at dma_event_callback time.
In support of multiple operation types, clients can specify a capability
mask to only be notified of channels that satisfy a certain set of
capabilities.
Changelog:
* removed DMA_TX_ARRAY_INIT, no longer needed
* dma_client_chan_free -> dma_chan_release: switch to global reference
counting only at device unregistration time, before it was also happening
at client unregistration time
* clients now return dma_state_client to dmaengine (ack, dup, nak)
* checkpatch.pl fixes
* fixup merge with git-ioat
Cc: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
The current dmaengine interface defines mutliple routines per operation,
i.e. dma_async_memcpy_buf_to_buf, dma_async_memcpy_buf_to_page etc. Adding
more operation types (xor, crc, etc) to this model would result in an
unmanageable number of method permutations.
Are we really going to add a set of hooks for each DMA engine
whizbang feature?
- Jeff Garzik
The descriptor creation process is refactored using the new common
dma_async_tx_descriptor structure. Instead of per driver
do_<operation>_<dest>_to_<src> methods, drivers integrate
dma_async_tx_descriptor into their private software descriptor and then
define a 'prep' routine per operation. The prep routine allocates a
descriptor and ensures that the tx_set_src, tx_set_dest, tx_submit routines
are valid. Descriptor creation and submission becomes:
struct dma_device *dev;
struct dma_chan *chan;
struct dma_async_tx_descriptor *tx;
tx = dev->device_prep_dma_<operation>(chan, len, int_flag)
tx->tx_set_src(dma_addr_t, tx, index /* for multi-source ops */)
tx->tx_set_dest(dma_addr_t, tx, index)
tx->tx_submit(tx)
In addition to the refactoring, dma_async_tx_descriptor also lays the
groundwork for definining cross-channel-operation dependencies, and a
callback facility for asynchronous notification of operation completion.
Changelog:
* drop dma mapping methods, suggested by Chris Leech
* fix ioat_dma_dependency_added, also caught by Andrew Morton
* fix dma_sync_wait, change from Andrew Morton
* uninline large functions, change from Andrew Morton
* add tx->callback = NULL to dmaengine calls to interoperate with async_tx
calls
* hookup ioat_tx_submit
* convert channel capabilities to a 'cpumask_t like' bitmap
* removed DMA_TX_ARRAY_INIT, no longer needed
* checkpatch.pl fixes
* make set_src, set_dest, and tx_submit descriptor specific methods
* fixup git-ioat merge
* move group_list and phys to dma_async_tx_descriptor
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6: (149 commits)
USB: ohci-pnx4008: Remove unnecessary cast of return value of kzalloc
USB: additions to the quirk list
usb-storage: implement autosuspend
USB: cdc-acm: add new device id to option driver
USB: goku_udc trivial cleanups
USB: usb gadget stack can now -DDEBUG with Kconfig
usb gadget stack: remove usb_ep_*_buffer(), part 2
usb gadget stack: remove usb_ep_*_buffer(), part 1
USB: pxa2xx_udc -- cleanups, mostly removing dma hooks
USB: pxa2xx_udc: use generic gpio layer
USB: quirk for samsung printer
USB: usb/dma doc updates
USB: drivers/usb/storage/unusual_devs.h whitespace cleanup
USB: remove Makefile reference to obsolete OHCI_AT91
USB: io_*: remove bogus termios no change checks
USB: mos7720: remove bogus no termios change check
USB: visor and whiteheat: remove bogus termios change checks
USB: pl2303: remove bogus checks and fix speed support to use tty_get_baud_rate()
USB: mos7840.c: turn this into a serial driver
USB: make the usb_device numa_node get assigned from controller
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits)
IB: Update MAINTAINERS with Hal's new email address
IB/mlx4: Implement query SRQ
IB/mlx4: Implement query QP
IB/cm: Send no match if a SIDR REQ does not match a listen
IB/cm: Fix handling of duplicate SIDR REQs
IB/cm: cm_msgs.h should include ib_cm.h
IB/cm: Include HCA ACK delay in local ACK timeout
IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible
IB/sa: Make sure SA queries use default P_Key
IPoIB: Recycle loopback skbs instead of freeing and reallocating
IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>)
IPoIB/cm: Fix warning if IPV6 is not enabled
IB/core: Take sizeof the correct pointer when calling kmalloc()
IB/ehca: Improve latency by unlocking after triggering the hardware
IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events
IB/ehca: Return QP pointer in poll_cq()
IB/ehca: Change idr spinlocks into rwlocks
IB/ehca: Refactor sync between completions and destroy_cq using atomic_t
IB/ehca: Lock renaming, static initializers
IB/ehca: Report RDMA atomic attributes in query_qp()
...
This patch removes controller driver infrastructure which supported
the now-removed usb_ep_{alloc,free}_buffer() calls.
As can be seen, many of the implementations of this were broken to
various degrees. Many didn't properly return dma-coherent mappings;
those which did so were necessarily ugly because of bogosity in the
underlying dma_free_coherent() calls ... which on many platforms
can't be called from the same contexts (notably in_irq) from which
their dma_alloc_coherent() sibling can be called.
The main potential downside of removing this is that gadget drivers
wouldn't have specific knowledge that the controller drivers have:
endpoints that aren't dma-capable don't need any dma mappings at all.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove usb_ep_{alloc,free}_buffer() calls, for small dma-coherent buffers.
This patch just removes the interface and its users; later patches will
remove controller driver support.
- This interface is invariably not implemented correctly in the
controller drivers (e.g. using dma pools, a mechanism which
post-dates the interface by several years).
- At this point no gadget driver really *needs* to use it. In
current kernels, any driver that needs such a mechanism could
allocate a dma pool themselves.
Removing this interface is thus a simplification and improvement.
Note that the gmidi.c driver had a bug in this area; fixed.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch lets the pxa2xx_udc use the generic gpio layer,
on the relevant PXA and IXP systems.
Signed-off-by: Milan Svoboda <msvoboda@ra.rockwell.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
USB_IAD: Adds support for USB Interface Association Descriptors.
This patch adds support to the USB host stack for parsing, storing, and
displaying Interface Association Descriptors. In /proc/bus/usb/devices
lines starting with A: show the fields in an IAD. In sysfs if an
interface on a USB device is referenced by an IAD the following files
will be added to the sysfs directory for that interface:
iad_bFirstInterface, iad_bInterfaceCount, iad_bFunctionClass, and
iad_bFunctionSubClass, iad_bFunctionProtocol
Signed-off-by: Craig W. Nadler <craig@nadler.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
USB: Add URB_FREE_BUFFER flag for freeing the transfer buffer
In some cases it is not needed that the driver keeps track of the
transfer buffer of an URB. It can be simply freed along with the
URB itself when the reference count goes down to zero. The new
flag URB_FREE_BUFFER enables this behavior.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as920) adds an extra level of protection to the
USB-Persist facility. Now it will apply by default only to hubs; for
all other devices the user must enable it explicitly by setting the
power/persist device attribute.
The disconnect_all_children() routine in hub.c has been removed and
its code placed inline. This is the way it was originally as part of
hub_pre_reset(); the revised usage in hub_reset_resume() is
sufficiently different that the code can no longer be shared.
Likewise, mark_children_for_reset() is now inline as part of
hub_reset_resume(). The end result looks much cleaner than before.
The sysfs interface is updated to add the new attribute file, and
there are corresponding documentation updates.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as918) introduces a new USB driver method: reset_resume.
It is called when a device needs to be reset as part of a resume
procedure (whether because of a device quirk or because of the
USB-Persist facility), thereby taking over a role formerly assigned to
the post_reset method. As a consequence, post_reset no longer needs
an argument indicating whether it is being called as part of a
reset-resume. This separation of functions makes the code clearer.
In addition, the pre_reset and post_reset method return types are
changed; they now must return an error code. The return value is
unused at present, but at some later time we may unbind drivers and
re-probe if they encounter an error during reset handling.
The existing pre_reset and post_reset methods in the usbhid,
usb-storage, and hub drivers are updated to match the new
requirements. For usbhid the post_reset routine is also used for
reset_resume (duplicate method pointers); for the other drivers a new
reset_resume routine is added. The change to hub.c looks bigger than
it really is, because mark_children_for_reset_resume() gets moved down
next to the new hub_reset_resume() routine.
A minor change to usb-storage makes the usb_stor_report_bus_reset()
routine acquire the host lock instead of requiring the caller to hold
it already.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make sure gadgetfs userspace interface is properly exported:
- Move <linux/usb_gadgetfs.h> to <linux/usb/gadgetfs.h>;
- Export it using Kbuild;
- Add an #include guard;
- Correct some internal documentation;
- Update struct layout so it's the same on 32/64 bit kernels.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Recently, the USB device matching code stopped matching generic interface
matches against devices with vendor-specific device class values.
Some drivers now need to explicitly match USB device ID's (in addition to
generic interface info) to retain the same behaviour as before. This new macro,
suggested by Alan Stern, makes the explicit device/interface matching a little
simpler for those users.
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as888) adds a new USB device quirk for devices which are
unable to resume correctly. By using the new code added for the
USB-persist facility, it is a simple matter to reset these devices
instead of resuming them. To get things kicked off, a quirk entry is
added for the Philips PSC805.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as886) adds the controversial USB-persist facility,
allowing USB devices to persist across a power loss during system
suspend.
The facility is controlled by a new Kconfig option (with appropriate
warnings about the potential dangers); when the option is off the
behavior will remain the same as it is now. But when the option is
on, people will be able to use suspend-to-disk and keep their USB
filesystems intact -- something particularly valuable for small
machines where the root filesystem is on a USB device!
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
this implements generic support for suspend/resume for usb serial.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 91a6902958 added an extra
argument to pci_read_legacy_io() and pci_write_legacy_io(). But
the prototypes in include/asm-ia64/pci.h were not updated.
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (50 commits)
[ARM] sa1100: remove boot time RTC initialisation
[ARM] sa1100: stop doing our own rtc management over suspend
[ARM] 4474/1: Do not check the PSR_F_BIT in valid_user_regs
[ARM] 4473/2: Take the HWCAP definitions out of the elf.h file
[ARM] pxa: move platform devices to separate header file
[ARM] pxa: move device registration into CPU-specific file
[ARM] pxa: remove boot time RTC initialisation
[ARM] pxa: stop doing our own rtc management over suspend
[ARM] 4451/1: pxa: make dma.c generic and remove cpu specific dma code
[ARM] 4450/1: pxa: add pxa25x_init_irq() and pxa27x_init_irq()
[ARM] 4440/1: PXA: enable the checking of ICIP2 for IRQs
[ARM] 4438/1: PXA: remove #ifdef .. #endif from pxa_gpio_demux_handler()
[ARM] 4437/1: PXA: move the GPIO IRQ initialization code to pxa_init_irq_gpio()
[ARM] 4436/1: PXA: move low IRQ initialization code to pxa_init_irq_low()
[ARM] 4435/1: PXA: remove PXA_INTERNAL_IRQS
[ARM] 4434/1: PXA: remove PXA_IRQ_SKIP
[ARM] pxa: Fix PXA27x suspend type validation, remove pxa_pm_prepare()
[ARM] pxa: move pm_ops structure into CPU specific files
[ARM] pxa: introduce cpu_is_pxaXXX macros
[ARM] pxa: remove MMC register defines from pxa-regs.h
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
security: unexport mmap_min_addr
SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel
security: Protection for exploiting null dereference using mmap
SELinux: Use %lu for inode->i_no when printing avc
SELinux: allow preemption between transition permission checks
selinux: introduce schedule points in policydb_destroy()
selinux: add selinuxfs structure for object class discovery
selinux: change sel_make_dir() to specify inode counter.
selinux: rename sel_remove_bools() for more general usage.
selinux: add support for querying object classes and permissions from the running policy
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Support multiple CPUs going through OS_MCA
[IA64] silence GCC ia64 unused variable warnings
[IA64] prevent MCA when performing MMIO mmap to PCI config space
[IA64] add sn_register_pmi_handler oemcall
[IA64] Stop bit for brl instruction
[IA64] SN: Correct ROM resource length for BIOS copy
[IA64] Don't set psr.ic and psr.i simultaneously
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
PCI: Only build PCI syscalls on architectures that want them
PCI: limit pci_get_bus_and_slot to domain 0
PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
PCI: hotplug: pciehp: wait for 1 second after power off slot
PCI: pci_set_power_state(): check for PM capabilities earlier
PCI: cpci_hotplug: Convert to use the kthread API
PCI: add pci_try_set_mwi
PCI: pcie: remove SPIN_LOCK_UNLOCKED
PCI: ROUND_UP macro cleanup in drivers/pci
PCI: remove pci_dac_dma_... APIs
PCI: pci-x-pci-express-read-control-interfaces cleanups
PCI: Fix typo in include/linux/pci.h
PCI: pci_ids, remove double or more empty lines
PCI: pci_ids, add atheros and 3com_2 vendors
PCI: pci_ids, reorder some entries
PCI: i386: traps, change VENDOR to DEVICE
PCI: ATM: lanai, change VENDOR to DEVICE
PCI: Change all drivers to use pci_device->revision
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (61 commits)
sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
sysfs: make directory dentries and inodes reclaimable
sysfs: implement sysfs_get_dentry()
sysfs: move sysfs_drop_dentry() to dir.c and make it static
sysfs: restructure add/remove paths and fix inode update
sysfs: use sysfs_mutex to protect the sysfs_dirent tree
sysfs: consolidate sysfs spinlocks
sysfs: make kobj point to sysfs_dirent instead of dentry
sysfs: implement sysfs_find_dirent() and sysfs_get_dirent()
sysfs: implement SYSFS_FLAG_REMOVED flag
sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags
sysfs: make sysfs_drop_dentry() access inodes using ilookup()
sysfs: Fix oops in sysfs_drop_dentry on x86_64
sysfs: use singly-linked list for sysfs_dirent tree
sysfs: slim down sysfs_dirent->s_active
sysfs: move s_active functions to fs/sysfs/dir.c
sysfs: fix root sysfs_dirent -> root dentry association
sysfs: use iget_locked() instead of new_inode()
sysfs: reorganize sysfs_new_indoe() and sysfs_create()
sysfs: fix parent refcounting during rename and move
...
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: (21 commits)
libata: remove irq_on from ata_bus_reset() and ata_std_postreset()
ata_piix: kill incorrect invalid map value warning
libata: add another Maxtor drive with broken NCQ to the list
[libata] sata_mv: Fix and clean up per-chip-generation tests
[libata] sata_mv: Convert to new exception handling (EH) infrastructure
[libata] sata_mv: minor bug fixes, enhancements, and cleanups (prep for new EH)
[libata] sata_mv: Minor cleanups and renaming, preparing for new EH & NCQ
libata-link: add PMP related ATA constants
libata-link: separate out ata_eh_handle_dev_fail()
pata_hpt3x3: fix DMA Kconfig option to actually have a hope of working
Add Hitachi HDS7250SASUN500G 0621KTAWSD to NCQ blacklist
pata_scc.c: Workaround for errata A308
libata: add FUJITSU MHV2080BH to NCQ blacklist
pata_hpt3x3: major reworking and testing
libata: clean up horkage handling
libata: quirk IOMEGA ZIP 250 ATAPI FLOPPY
libata: simplify PCI legacy SFF host handling
pata_mpc52xx: suspend/resume support
sata_promise: SATA hotplug support, take 2
pata_sis: FIFO whack
...
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Rename PC speaker code
[MIPS] Don't use genrtc.
[MIPS] Remove unused time.c for swarm
[MIPS] Sparse: Use NULL for pointer
[MIPS] Fix a sparse warning in arch/mips/pci/pci.c
[MIPS] SMTC: Interrupt mask backstop hack
[MIPS] separate platform_device registration for VR41xx RTC
[MIPS] Separate platform_device registration for VR41xx GPIO
[MIPS] MIPSsim: Fix build.
[MIPS] separate platform_device registration for VR41xx serial interface
[MIPS] Include cacheflush.h in uncache.c
[MIPS] Cleanup tlbdebug.h
[MIPS] Change names of local variables to silence sparse (part 2)
[MIPS] Workaround for a sparse warning in include/asm-mips/io.h
[MIPS] RM: Use only phyiscal address for 82596 and 53c710
[MIPS] Hydrogen3: Remove remaining bits of code.
[MIPS] DEC: Fix modpost warning.
Revert "[MIPS] DEC: Fix modpost warning."
[MIPS] Fix resume for 64K page size on R4000 class processors.
* master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (30 commits)
Blackfin serial driver: supporting BF548-EZKIT serial port
Video Console: Blackfin doesnt support VGA console
Blackfin arch: Add peripheral io API to gpio header file
Blackfin arch: set up gpio interrupt IRQ_PJ9 for BF54x ATAPI PATA driver
Blackfin arch: add missing CONFIG_LARGE_ALLOCS when upstream merging
Blackfin arch: as pointed out by Robert P. J. Day, update the CPU_FREQ name to match current Kconfig
Blackfin arch: extract the entry point from the linked kernel
Blackfin arch: clean up some coding style issues
Blackfin arch: combine the common code of free_initrd_mem and free_initmem
Blackfin arch: Add Support for Peripheral PortMux and resouce allocation
Blackfin arch: use PAGE_SIZE when doing aligns rather than hardcoded values
Blackfin arch: fix bug set dma_address properly in dma_map_sg
Blackfin arch: Disable CACHELINE_ALIGNED_L1 for BF54x by default
Blackfin arch: Port the dm9000 driver to Blackfin by using the correct low-level io routines
Blackfin arch: There is no CDPRIO Bit in the EBIU_AMGCTL Register of BF54x arch
Blackfin arch: scrub dead code
Blackfin arch: Fix Warning add some defines in BF54x header file
Blackfin arch: add BF54x missing GPIO access functions
Blackfin arch: Some memory and code optimizations - Fix SYS_IRQS
Blackfin arch: Enable BF54x PIN/GPIO interrupts
...
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (26 commits)
i2c-rpx: Remove
i2c-mpc: work around missing-9th-clock-pulse bug
i2c: New PMC MSP71xx TWI bus driver
i2c-savage4: Delete many unused defines
i2c/tsl2550: Speed up initialization
i2c: New bus driver for the TAOS evaluation modules
i2c-i801: Use the internal 32-byte buffer on ICH4+
i2c-i801: Various cleanups
i2c: Add support for the TSL2550
i2c-pxa: Support new-style I2C drivers
i2c-gpio: Make some internal functions static
i2c-gpio: Add support for new-style clients
i2c-iop3xx: Switch to static adapter numbering
i2c-sis5595: Resolve resource conflict with sis5595
matroxfb: Clean-up i2c header inclusions
i2c-nforce2: Add support for SMBus block transactions
i2c-mpc: Use i2c_add_numbered_adapter
i2c-mv64xxx: Use i2c_add_numbered_adapter
i2c-piix4: Add support for the ATI SB700
i2c: New DS1682 chip driver
...
The "protection needed" flag is currently parsed out of the ERP IE in
beacons. This patch allows the ERP IE to be available at assocation time
and causes the appropriate actions to be performed earlier.
It is slightly complicated by the fact that most APs don't include the
ERP IE in association responses. To work around this, we store ERP
values in the ieee80211_sta_bss structure.
Also added some WLAN_ERP defines for use by upcoming patches.
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The semantics of not having an add_interface callback are not well
defined, this callback is required because otherwise you cannot obtain
the requested MAC address of the device. Change the documentation to
reflect this, add a note about having no MAC address at all, add a
warning that mac_addr in struct ieee80211_if_init_conf can be NULL and
finally verify that a few callbacks are assigned by way of BUG_ON()
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remove ieee80211_set_aid_for_sta and associated code.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Generic code to walk through the fields in a radiotap header, accounting
for nasties like extended "field present" bitfields and alignment rules
Signed-off-by: Andy Green <andy@warmcat.com>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This removes the old i386 setup code. This is done as a separate patch
to avoid breaking git bisect as some of the i386 code was also used by
the old x86-64 code.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add symbolic constants for the segment selectors/GDT slots used by
the setup code, for consistency with i386.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make struct boot_params a real structure, and remove the handling of
some obsolete fields, in particular hd*_info, which was only used by
the ST-506 driver, and likely to be wrong for that driver on any
modern BIOS.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make definitions for struct e820entry and struct e820map
consistent between i386 and x86-64.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some Intel features are spread around in different CPUID leafs like 0x5,
0x6 and 0xA. Make this feature detection code common across i386 and
x86_64.
Display Intel Dynamic Acceleration feature in /proc/cpuinfo. This feature
will be enabled automatically by current acpi-cpufreq driver.
Refer to Intel Software Developer's Manual for more details about the feature.
Thanks to hpa (H Peter Anvin) for the making the actual code detecting the
scattered features data-driven.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct screen_info has unaligned members, it needs to be packed.
In the process, fix the naming of some of the members, which don't
belong in this structure but are part of it anyway.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unify the handling of the CPU features vectors between i386 and x86-64.
This also adopts the collapsing of features which are required at
compile-time into constant tests from x86-64 to i386.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/asm-i386/boot.h incorrectly has the multiple include guards
as _LINUX_BOOT_H instead of _ASM_BOOT_H. Fix.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The only pseudo-legitimate MIPS user of genrtc was a systems that doesn't
have an RTC in hardware at all. At this point faking one is a little
pointless ...
To support multiple TC microthreads acting as "CPUs" within a VPE,
VPE-wide interrupt mask bits must be specially manipulated during
interrupt handling. To support legacy drivers and interrupt controller
management code, SMTC has a "backstop" to track and if necessary restore
the interrupt mask. This has some performance impact on interrupt service
overhead. Disable it only if you know what you are doing.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Also include tlbdebug.h in dump_tlb.c and r3k_dump_tlb.c.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch is an workaround for these sparse warnings:
include2/asm/mmu_context.h:172:2: warning: symbol 'flags' shadows an earlier one
include2/asm/mmu_context.h:133:16: originally declared here
include2/asm/mmu_context.h:232:2: warning: symbol 'flags' shadows an earlier one
include2/asm/mmu_context.h:203:16: originally declared here
include2/asm/mmu_context.h:277:3: warning: symbol 'flags' shadows an earlier one
include2/asm/mmu_context.h:250:16: originally declared here
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
CKSEG1ADDR() returns unsigned int value on 32bit kernel. Cast it to
unsigned long to get rid of this warning:
include2/asm/io.h:215:12: warning: cast adds address space to expression (<asn:2>)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When running Linux in non-secure mode (on ARM1176 for example),
depending on the CP15 secure configuration register, the CPSR.F bit
(6) might only be modified from the secure mode. However, the
valid_user_regs() function checks for this bit being cleared. With
commit a6c61e9d, a SIGSEGV is forced in handle_signal() if the user
registers are not considered valid.
The patch also ensures that the CPSR.A bit is cleared and the USR mode
is set if the CPU does not support the 26bit user mode.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The patch moves the HWCAP definitions and the extern elf_hwcap
declaration to the hwcap.h header file.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since the number of dma channels varies between pxa25x and pxa27x, it
introduces some specific code in dma.c. This patch moves the specific
code to pxa25x.c and pxa27x.c and makes dma.c more generic.
1. add pxa_init_dma() for dma initialization, the number of channels
are passed in by the argument
2. add a "prio" field to the "struct pxa_dma_channel" for the channel
priority, and is initialized in pxa_init_dma()
3. use a general priority comparison with the channels "prio" field so
to remove the processor specific pxa_for_each_dma_prio macro, this
is not lightning fast as the original one, but it is acceptable as
it happens when requesting dma, which is usually not so performance
critical
Signed-off-by: eric miao <eric.miao@marvell.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ICIP2 is not examined during IRQ entrance, this patch add the
checking if the processor is PXA27x or later, with CoreG bits
in CPUID (Core Generation) > 1
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1. define PXA_GPIO_IRQ_BASE to be right after the internal IRQs,
and define PXA_GPIO_IRQ_NUM to be 128 for all PXA2xx variants
2. make the code specific to the high IRQ numbers (32..64) to be
PXA27x specific
3. add a function pxa_init_irq_high() to initialize the internal
high IRQ chip, the invoke of this function could be moved to
PXA27x specific initialization code
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1. PXA_IRQ_SKIP is defined to be 7 on PXA25x so that the first IRQ
starts from zero. This makes IRQ numbering inconsistent between
PXA25x and PXA27x. Remove this macro so that the same IRQ_XXXXX
definition has the same value on both PXA25x and PXA27x.
2. make IRQ_SSP3..IRQ_PWRI2C valid only if PXA27x is defined, this
avoids unintentional use of these macros on PXA25x
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
pxamci.h redefines the MMC registers differently so they can be used
with ioremap. Remove the incompatible definitions from pxa-regs.h.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
pxa_pm_finish() does nothing but return zero. The core code
does nothing with this return value, and will not try to call
the finish method in the pm_ops structure if it is NULL.
Therefore, we can remove this useless function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is a new I2C bus driver for the TAOS evaluation modules. Developped
and tested on the TAOS TSL2550 EVM.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Let the drivers specify how many bytes they want to read with
i2c_smbus_read_i2c_block_data(). So far, the block count was
hard-coded to I2C_SMBUS_BLOCK_MAX (32), which did not make much sense.
Many driver authors complained about this before, and I believe it's
about time to fix it. Right now, authors have to do technically stupid
things, such as individual byte reads or full-fledged I2C messaging,
to work around the problem. We do not want to encourage that.
I even found that some bus drivers (e.g. i2c-amd8111) already
implemented I2C block read the "right" way, that is, they didn't
follow the old, broken standard. The fact that it was never noticed
before just shows how little i2c_smbus_read_i2c_block_data() was used,
which isn't that surprising given how broken its prototype was so far.
There are some obvious compatiblity considerations:
* This changes the i2c_smbus_read_i2c_block_data() prototype. Users
outside the kernel tree will notice at compilation time, and will
have to update their code.
* User-space has access to i2c_smbus_xfer() directly using i2c-dev, so
the changed expectations would affect tools such as i2cdump. In order
to preserve binary compatibility, we give I2C_SMBUS_I2C_BLOCK_DATA
a new numeric value, and define I2C_SMBUS_I2C_BLOCK_BROKEN with the
old numeric value. When i2c-dev receives a transaction with the
old value, it can convert it to the new format on the fly.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Kill a sparse warning by un-nesting two container_of() calls.
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Generate I2C kerneldoc; fix various glitches and add "context" sections to
that documentation. Most I2C and SMBus functions still have no kerneldoc.
Let me suggest providing kerneldoc for all the i2c_smbus_*() functions as
a small and mostly self-contained project for anyone so inclined. :)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The ARM show_regs() tombstone only partially decodes which ARM ISA was
executing at the time a fault occurred displaying either "(T)" for the
Thumb case or nothing at all for other cases. This patch therefore
explicitly identifies which state the processor is in at the time of
a fault: ARM, Thumb, Jazelle or JazelleEE.
Signed-off-by: George G. Davis <gdavis@mvista.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Driver to control the GPIO pins on the KS8695 processor.
The driver natively supports the Generic GPIO interface.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch provides support for the Netgear WG302 v2 and WAG302 v2 AccessPoint series.
This patch relies on the patch "Gateway 7001 series support" minimally, as they only have UART2 connected.
Updated to stay below the 80 char limit in uncompress.h
Signed-off-by: Imre Kaloz <kaloz@openwrt.org>
Signed-off-by: Deepak Saxena <dsaxena@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch provides support for the Gateway 7001 AccessPoint series.
Updated to stay below the 80 char limit in uncompress.h
Signed-off-by: Imre Kaloz <kaloz@openwrt.org>
Signed-off-by: Deepak Saxena <dsaxena@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
IXDP425 NAND support (arch specific part).
The generic platform driver that is used by ixdp425 platfrom is already
in upstream kernel in 2.6.22-rc1.
Signed-off-by: Vladimir Barinov <vbarinov@ru.mvista.com>
Signed-off-by: Ruslan Sushko <rsushko@ru.mvista.com>
Signed-off-by: Deepak Saxena <dsaxena@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Support for generic input output for MX1 family.
The implementation prevents allocation of one pin
by two users, but does not store pointer to the user
description permanently, because this solution
would have bigger memory overhead.
The simple way to integrate code with per BSP
pins setup and allocation is required else all GPIO
registration checking is useless. The function
imx_gpio_setup_multiple_pins() can be used for this
purpose in future.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Modify the common at91 hardware support to deal with the non-MMU
at91x40 family. The base RAM (which is most likely not DRAM) is
set to the configured value. Virtual IO device mapping is set
to be 1 to 1 with the physical addresses.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Base at91x40 architecture support defines. These parts are somewhat
simpler than the ARM9 Atmel based parts.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The AT91x40 family doesn't have the debug unit like its bigger brothers.
But it does have the ID and extension registers (with the bit meanings
the same). Reorganize at91_dbgu.h to cater for this.
This also affects the load uncompressor, since it outputs to the
debug port.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Support pin multiplexing configurations driver for TI DaVinci SoC
Signed-off-by: Vladimir Barinov <vbarinov@ru.mvista.com>
Acked-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Support GPIO driver for TI DaVinci SoC
Signed-off-by: Vladimir Barinov <vbarino@ru.mvista.com>
Acked-by: David Brownell <david-b@pacbell.net>
Acked-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Support clock control driver for TI DaVinci SoC
Signed-off-by: Vladimir Barinov <vbarinov@ru.mvista.com>
Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
However there are similar things in the EBIU_DDRQUE Register
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Add a new security check on mmap operations to see if the user is attempting
to mmap to low area of the address space. The amount of space protected is
indicated by the new proc tunable /proc/sys/vm/mmap_min_addr and defaults to
0, preserving existing behavior.
This patch uses a new SELinux security class "memprotect." Policy already
contains a number of allow rules like a_t self:process * (unconfined_t being
one of them) which mean that putting this check in the process class (its
best current fit) would make it useless as all user processes, which we also
want to protect against, would be allowed. By taking the memprotect name of
the new class it will also make it possible for us to move some of the other
memory protect permissions out of 'process' and into the new class next time
we bump the policy version number (which I also think is a good future idea)
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The VLAN MAC address handling is broken in multiple ways. When the address
differs when setting it, the real device is put in promiscous mode twice,
but never taken out again. Additionally it doesn't resync when the real
device's address is changed and needlessly puts it in promiscous mode when
the vlan device is still down.
Fix by moving address handling to vlan_dev_open/vlan_dev_stop and properly
deal with address changes in the device notifier. Also switch to
dev_unicast_add (which needs the exact same handling).
Since the set_mac_address handler is identical to the generic ethernet one
with these changes, kill it and use ether_setup().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keep netpoll/poll_napi from messing with the poll_list.
Only net_rx_action is allowed to manipulate the list.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we dont have PIO mapping anymore we need to make sure we
got the correct value in our headers. Some well needed comments
have also been added.
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Well, first of all, I don't want to change so many files either.
What I do:
Adding a new parameter "struct bin_attribute *" in the
.read/.write methods for the sysfs binary attributes.
In fact, only the four lines change in fs/sysfs/bin.c and
include/linux/sysfs.h do the real work.
But I have to update all the files that use binary attributes
to make them compatible with the new .read and .write methods.
I'm not sure if I missed any. :(
Why I do this:
For a sysfs attribute, we can get a pointer pointing to the
struct attribute in the .show/.store method,
while we can't do this for the binary attributes.
I don't know why this is different, but this does make it not
so handy to use the binary attributes as the regular ones.
So I think this patch is reasonable. :)
Who benefits from it:
The patch that exposes ACPI tables in sysfs
requires such an improvement.
All the table binary attributes share the same .read method.
Parameter "struct bin_attribute *" is used to get
the table signature and instance number which are used to
distinguish different ACPI table binary attributes.
Without this parameter, we need to offer different .read methods
for different ACPI table binary attributes.
This is impossible as there are various ACPI tables on different
platforms, and we don't know what they are until they are loaded.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch makes dentries and inodes for sysfs directories
reclaimable.
* sysfs_notify() is modified to walk sysfs_dirent tree instead of
dentry tree.
* sysfs_update_file() and sysfs_chmod_file() use sysfs_get_dentry() to
grab the victim dentry.
* sysfs_rename_dir() and sysfs_move_dir() grab all dentries using
sysfs_get_dentry() on startup.
* Dentries for all shadowed directories are pinned in memory to serve
as lookup start point.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As kobj sysfs dentries and inodes are gonna be made reclaimable,
dentry can't be used as naming token for sysfs file/directory, replace
kobj->dentry with kobj->sd. The only external interface change is
shadow directory handling. All other changes are contained in kobj
and sysfs.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Implement SYSFS_FLAG_REMOVED flag which currently is used only to
improve sanity check in sysfs_deactivate(). The flag will be used to
make directory entries reclamiable.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rename sysfs_dirent->s_type to s_flags, pack type into lower eight
bits and reserve the rest for flags. sysfs_type() can used to access
the type. All existing sd->s_type accesses are converted to use
sysfs_type(). While at it, type test is changed to equality test
instead of bit-and test where appropriate.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
devt_attr and uevent_attr are either allocated dynamically with or
embedded in device and class_device as they needed their owner field
set to the module implementing the driver. Now that sysfs implements
immediate disconnect and owner field removed from struct attribute,
there is no reason to do this. Remove these attributes from
[class_]device and use static attribute structures instead.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sysfs is now completely out of driver/module lifetime game. After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners. Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.
This patch kills now unnecessary attribute->owner. Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.
For more info regarding lifetime rule cleanup, please read the
following message.
http://article.gmane.org/gmane.linux.kernel/510293
(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add s_name to sysfs_dirent. This is to further reduce dependency to
the associated dentry. Name is copied for directories and symlinks
but not for attributes.
Where possible, name dereferences are converted to use sd->s_name.
sysfs_symlink->link_name and sysfs_get_name() are unused now and
removed.
This change allows symlink to be implemented using sysfs_dirent tree
proper, which is the last remaining dentry-dependent sysfs walk.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Implement idr based id allocator. ida is used the same way idr is
used but lacks id -> ptr translation and thus consumes much less
memory. struct ida_bitmap is attached as leaf nodes to idr tree which
is managed by the idr code. Each ida_bitmap is 128bytes long and
contains slightly less than a thousand slots.
ida is more aggressive with releasing extra resources acquired using
ida_pre_get(). After every successful id allocation, ida frees one
reserved idr_layer if possible. Reserved ida_bitmap is not freed
automatically but only one ida_bitmap is reserved and it's almost
always used right away. Under most circumstances, ida won't hold on
to memory for too long which isn't actively used.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The prev_state member of struct dev_pm_info (defined in include/linux/pm.h) is
only used during a resume to check if the device's state before the suspend was
'off', in which case the device is not resumed. However, in such cases the
decision whether or not to resume the device should be made on the driver level
and the resume callbacks from the device's bus and class should be executed
anyway (the may be needed for some things other than just powering on the
device).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The saved_state member of struct dev_pm_info, defined in include/linux/pm.h, is
not used anywhere, so it can be removed.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The pm_parent member of struct dev_pm_info (defined in include/linux/pm.h) is
only used to check if the device's parent is in the right state while the
device is being suspended or resumed. However, this can be done just as well
with the help of the parent pointer in struct device, so pm_parent can be
removed along with some code that handles it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The patch below adds DMI/SMBIOS based module autoloading to the Linux
kernel. The idea is to load laptop drivers automatically (and other
drivers which cannot be autoloaded otherwise), based on the DMI system
identification information of the BIOS.
Right now most distros manually try to load all available laptop
drivers on bootup in the hope that at least one of them loads
successfully. This patch does away with all that, and uses udev to
automatically load matching drivers on the right machines.
Basically the patch just exports the DMI information that has been
parsed by the kernel anyway to userspace via a sysfs device
/sys/class/dmi/id and makes sure that proper modalias attributes are
available. Besides adding the "modalias" attribute it also adds
attributes for a few other DMI fields which might be useful for
writing udev rules.
This patch is not an attempt to export the entire DMI/SMBIOS data to
userspace. We already have "dmidecode" which parses the complete DMI
info from userspace. The purpose of this patch is machine model
identification and good udev integration.
To take advantage of DMI based module autoloading, a driver should
export one or more MODULE_ALIAS fields similar to these:
MODULE_ALIAS("dmi:*:svnMICRO-STARINT'LCO.,LTD:pnMS-1013:pvr0131*:cvnMICRO-STARINT'LCO.,LTD:ct10:*");
MODULE_ALIAS("dmi:*:svnMicro-StarInternational:pnMS-1058:pvr0581:rvnMSI:rnMS-1058:*:ct10:*");
MODULE_ALIAS("dmi:*:svnMicro-StarInternational:pnMS-1412:*:rvnMSI:rnMS-1412:*:cvnMICRO-STARINT'LCO.,LTD:ct10:*");
MODULE_ALIAS("dmi:*:svnNOTEBOOK:pnSAM2000:pvr0131*:cvnMICRO-STARINT'LCO.,LTD:ct10:*");
These lines are specific to my msi-laptop.c driver. They are basically
just a concatenation of a few carefully selected DMI fields with all
potentially bad characters stripped.
Besides laptop drivers, modules like "hdaps", the i2c modules
and the hwmon modules are good candidates for "dmi:" MODULE_ALIAS
lines.
Besides merely exporting the DMI data via sysfs the patch adds
support for a few more DMI fields. Especially the CHASSIS fields are
very useful to identify different laptop modules. The patch also adds
working MODULE_ALIAS lines to my msi-laptop.c driver.
I'd like to thank Kay Sievers for helping me to clean up this patch
for posting it on lkml.
Patch is against Linus' current GIT HEAD. Should probably apply to
older kernels as well without modification.
Signed-off-by: Lennart Poettering <mzxreary@0pointer.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Implement debugfs_rename() to allow renaming files/directories in debugfs.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As suggested by Andrew, add pci_try_set_mwi(), which does not require
return-value checking.
- add pci_try_set_mwi() without __must_check
- make it return 0 on success, errno if the "try" failed or error
- review callers
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Based on replies to a respective query, remove the pci_dac_dma_...() APIs
(except for pci_dac_dma_supported() on Alpha, where this function is used
in non-DAC PCI DMA code).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Jesse Barnes <jesse.barnes@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
pci_ids, add atheros and 3com_2 vendors
Atheros is wifi vendor. 3com_2 (0xa727) is an vendor id for one card with
ath chip.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
pci_ids, reorder some entries
Some lines are not vendor sorted, reorder it to comply with the rest of
document.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
traps, change VENDOR to DEVICE
Change macro for SGI lithium (arch/i386/mach-visws/traps.c) device from
VENDOR to DEVICE, because it's a device id.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
lanai, change VENDOR to DEVICE
There were 2 bad named macros in pci_ids (LANAI 2 and IHB). Rename it to
DEVICE, because it's device id. Also make some cleanpu in pci_device_id
table (use PCI_VDEVICE).
Cc: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently there are 97 occurrences where drivers need the pci
revision ID. We can do this once for all devices. Even the pci
subsystem needs the revision several times for quirks. The extra
u8 member pads out nicely in the pci_dev struct.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove pointless and never-called enable_wake() hook from pci_driver and
from documentation. Evidently this was introduced in the 2.4.6 kernel,
but there's no evidence it was ever called; and it was rarely implemented.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Function to clear bogus correctable errors. Analog to pci_aer_uncorrect_are_status.
The Marvell chips seem to start out with a bogus value that needs to be
cleared.
Yanmin ported it to 2.6.22-rc4 by fixing a fuzz patch applying info.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Acked-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The stubs used when advanced error reporting is not enabled
must have same return type as real functions.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Acked-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We've now fixed up most users of pci_find_slot, and the remainder are either
hard and need someone with the hardware and info to work on it, or patches
exist but are not yet merged.
Time therefore for some gentle encouragement
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently pcibios_add_platform_entries() returns void, but could fail,
so instead have it return an int and propagate errors up to
pci_create_sysfs_dev_files().
Fixes:
arch/powerpc/kernel/pci_64.c: In function 'pcibios_add_platform_entries':
arch/powerpc/kernel/pci_64.c:878: warning: ignoring return value of
'device_create_file', declared with attribute warn_unused_result
arch/powerpc/kernel/pci_32.c: In function 'pcibios_add_platform_entries':
arch/powerpc/kernel/pci_32.c:1043: warning: ignoring return value of
'device_create_file', declared with attribute warn_unused_result
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I'm not sure if this is going to fly, weak symbols work on the compilers I'm
using, but whether they work for all of the affected architectures I can't say.
I've cc'ed as many arch maintainers/lists as I could find.
But assuming they do, we can use a weak empty definition of
pcibios_add_platform_entries() to avoid having an empty definition on every
arch.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch introduces an interface to read and write PCI-X / PCI-Express
maximum read byte count values from PCI config space. There is a second
function that returns the maximum _designed_ read byte count, which marks the
maximum value for a device, since some drivers try to set MMRBC to the
highest allowed value and rely on such a function.
Based on patch set by Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Throw out the old mark & sweep garbage collector and put in a
refcounting cycle detecting one.
The old one had a race with recvmsg, that resulted in false positives
and hence data loss. The old algorithm operated on all unix sockets
in the system, so any additional locking would have meant performance
problems for all users of these.
The new algorithm instead only operates on "in flight" sockets, which
are very rare, and the additional locking for these doesn't negatively
impact the vast majority of users.
In fact it's probable, that there weren't *any* heavy senders of
sockets over sockets, otherwise the above race would have been
discovered long ago.
The patch works OK with the app that exposed the race with the old
code. The garbage collection has also been verified to work in a few
simple cases.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linux does not gracefully deal with multiple processors going
through OS_MCA aa part of the same MCA event. The first cpu
into OS_MCA grabs the ia64_mca_serialize lock. Subsequent
cpus wait for that lock, preventing them from reporting in as
rendezvoused. The first cpu waits 5 seconds then complains
that all the cpus have not rendezvoused. The first cpu then
handles its MCA and frees up all the rendezvoused cpus and
releases the ia64_mca_serialize lock. One of the subsequent
cpus going thought OS_MCA then gets the ia64_mca_serialize
lock, waits another 5 seconds and then complains that none of
the other cpus have rendezvoused.
This patch allows multiple CPUs to gracefully go through OS_MCA.
The first CPU into ia64_mca_handler() grabs a mca_count lock.
Subsequent CPUs into ia64_mca_handler() are added to a list of cpus
that need to go through OS_MCA (a bit set in mca_cpu), and report
in as rendezvoused, and but spin waiting their turn.
The first CPU sees everyone rendezvous, handles his MCA, wakes up
one of the other CPUs waiting to process their MCA (by clearing
one mca_cpu bit), and then waits for the other cpus to complete
their MCA handling. The next CPU handles his MCA and the process
repeats until all the CPUs have handled their MCA. When the last
CPU has handled it's MCA, it sets monarch_cpu to -1, releasing all
the CPUs.
In testing this works more reliably and faster.
Thanks to Keith Owens for suggesting numerous improvements
to this code.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add wrapper function to make SN_SAL_REGISTER_PMI_HANDLER ia64_sal_oemcall.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Because reversing RH0 is no longer supported by deprecation
of RH0, let's make IPV6_{RECV,2292}RTHDR boolean options.
Boolean are more appropriate from standard POV.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on <draft-ietf-ipv6-deprecate-rh0-00.txt>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rusty (whose comments we should all study and emulate :) pointed
out that our comments for skb checksums are no longer up-to-date.
So here is a patch to
1) add the case of partial checksums on input;
2) update partial checksum case to mention csum_start/csum_offset;
3) mention the new IPv6 feature bit.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
To better support and handle eSCO links in the future a bunch of
constants needs to be added and some basic routines need to be
updated. This is the initial step.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The queue handlers registered by ip[6]_queue.ko at initialization should
not be unregistered according to requests from userland program
using nfnetlink_queue. If we allow that, there is no way to register
the handlers of built-in ip[6]_queue again.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert DEBUGP to pr_debug and fix lots of non-compiling debug statements.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust structure size and don't expect pointers passed in from
userspace to be valid. Also replace an enum in an ABI structure
by a fixed size type.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate the last global list searched for every new connection.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a last step of preventing DoS by creating lots of expectations, this
patch introduces a global maximum and a sysctl to control it. The default
is initialized to 4 * the expectation hash table size, which results in
1/64 of the default maxmimum of conntracks.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch brings back the per-conntrack expectation list that was
removed around 2.6.10 to avoid walking all expectations on expectation
eviction and conntrack destruction.
As these were the last users of the global expectation list, this patch
also kills that.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently all expectations are kept on a global list that
- needs to be searched for every new conncetion
- needs to be walked for evicting expectations when a master connection
has reached its limit
- needs to be walked on connection destruction for connections that
have open expectations
This is obviously not good, especially when considering helpers like
H.323 that register *lots* of expectations and can set up permanent
expectations, but it also allows for an easy DoS against firewalls
using connection tracking helpers.
Use a hashtable for expectations to avoid incurring the search overhead
for every new connection. The default hash size is 1/256 of the conntrack
hash table size, this can be overriden using a module parameter.
This patch only introduces the hash table for expectation lookups and
keeps other users to reduce the noise, the following patches will get
rid of it completely.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since conntrack currently allows to use masks for every bit of both
helper and expectation tuples, we can't hash them and have to keep
them on two global lists that are searched for every new connection.
This patch removes the never used ability to use masks for the
destination part of the expectation tuple and completely removes
masks from helpers since the only reasonable choice is a full
match on l3num, protonum and src.u.all.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is a wild mix of nf_conntrack_expect_, nf_ct_exp_,
expect_, exp_, ...
Consistently use nf_ct_ as prefix for exported functions.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
All callers pass NULL, this also doesn't seem very useful for modules.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert conntrack hash to hlists to reduce its size and cache
footprint. Since the default hashsize to max. entries ratio
sucks (1:16), this patch doesn't reduce the amount of memory
used for the hash by default, but instead uses a better ratio
of 1:8, which results in the same max. entries value.
One thing worth noting is early_drop. It really should use LRU,
so it now has to iterate over the entire chain to find the last
unconfirmed entry. Since chains shouldn't be very long and the
entire operation is very rare this shouldn't be a problem.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This kills the global 'destroy' operation which was used by NAT.
Instead it uses the extension infrastructure so that multiple
extensions can register own operations.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now memory space for help and NAT are allocated by extension
infrastructure.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
I will split 'struct nf_nat_info' out from conntrack. So I cannot use
'offsetof' to get the pointer to conntrack from it.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Old space allocator of conntrack had problems about extensibility.
- It required slab cache per combination of extensions.
- It expected what extensions would be assigned, but it was impossible
to expect that completely, then we allocated bigger memory object than
really required.
- It needed to search helper twice due to lock issue.
Now basic informations of a connection are stored in 'struct nf_conn'.
And a storage for extension (helper, NAT) is allocated by kmalloc.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TRACE target can be used to follow IP and IPv6 packets through
the ruleset.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick NcHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Along comes... xt_u32, a revamped ipt_u32 from POM-NG,
Plus:
* 2007-06-02: added ipv6 support
* 2007-06-05: uses kmalloc for the big buffer
* 2007-06-05: added inversion
* 2007-06-20: use skb_copy_bits() and get rid of the big buffer
and lock (suggested by Pablo Neira Ayuso)
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the return type of target checkentry functions to boolean.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the return type of match functions to boolean
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the return type of match functions to boolean
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the "hotdrop" variables to boolean
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This explains the allowed upper protocol numbers. IP6T_F_NOPROTO was
introduced to use 0 as Hop-by-Hop option header, not wildcard. But that
seemed to be forgotten. 0 has been used as wildcard since 2002-08-23.
Signed-off-by: Yasuyuki Kozakai <yasuyuki@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This cleanup fell out after adding L2TP support where a new encap_rcv
funcptr was added to struct udp_sock. Have XFRM use the new encap_rcv
funcptr, which allows us to move the XFRM encap code from udp.c into
xfrm4_input.c.
Make xfrm4_rcv_encap() static since it is no longer called externally.
Signed-off-by: James Chapman <jchapman@katalix.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Through the IrDA netlink set mode command, we switch to IrDA monitor
mode, where one IrLAP instance receives all the packets on the media,
without ever responding to them.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
First IrDA configuration netlink layer implementation.
Currently, we only support the set/get mode commands.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a new syscall TUNSETGROUP for group ownership setting of tap
devices. The user now is allowed to send packages if either his euid or
his egid matches the one specified via tunctl (via -u or -g
respecitvely). If both, gid and uid, are set via tunctl, both have to
match.
Signed-off-by: Guido Guenther <agx@sigxcpu.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove stats_lock pointers from qdisc-internal structures, in all cases
it points to dev->queue_lock. The only case where it is necessary is for
top-level qdiscs, where it might also point to dev->ingress_lock in case
of the ingress qdisc. Also remove it from actions completely, it always
points to the actions internal lock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows other in-kernel functions to do SAD lookups.
The only known user at the moment is pktgen.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a reference to an existing address is increased or decreased without
hitting zero, the address count is incorrectly adjusted.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the new sch_rr qdisc for multiqueue network device support. Allow
sch_prio and sch_rr to be compiled with or without multiqueue hardware
support.
sch_rr is part of sch_prio, and is referenced from MODULE_ALIAS. This
was done since sch_prio and sch_rr only differ in their dequeue
routine.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the multiqueue hardware device support API to the core network
stack. Allow drivers to allocate multiple queues and manage them at
the netdev level if they choose to do so.
Added a new field to sk_buff, namely queue_mapping, for drivers to
know which tx_ring to select based on OS classification of the flow.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add struct sockaddr_pppol2tp to carry L2TP-specific address
information for the PPPoX (PPPoL2TP) socket. Unfortunately we can't
use the union inside struct sockaddr_pppox because the L2TP-specific
data is larger than the current size of the union and we must preserve
the size of struct sockaddr_pppox for binary compatibility.
Also add a PPPIOCGL2TPSTATS ioctl to allow userspace to obtain
L2TP counters and state from the kernel.
Add new if_pppol2tp.h header.
[ Modified to use aligned_u64 in statistics structure -DaveM ]
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new UDP_ENCAP_L2TPINUDP encapsulation type for UDP
sockets. When a UDP socket's encap_type is UDP_ENCAP_L2TPINUDP, the
skb is delivered to a function pointed to by the udp_sock's
encap_rcv funcptr. If the skb isn't wanted by L2TP, it returns >0, which
causes it to be passed through to UDP.
Include padding to put the new encap_rcv field on a 4-byte boundary.
Previously, the only user of UDP encap sockets was ESP, so when
CONFIG_XFRM was not defined, some of the encap code was compiled
out. This patch changes that. As a result, udp_encap_rcv() will
now do a little more work when CONFIG_XFRM is not defined.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for configuring secondary unicast addresses on network
devices. To support this devices capable of filtering multiple
unicast addresses need to change their set_multicast_list function
to configure unicast filters as well and assign it to dev->set_rx_mode
instead of dev->set_multicast_list. Other devices are put into promiscous
mode when secondary unicast addresses are present.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use generic net_device address lists for multicast list handling.
Some defines are used to keep drivers working.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce struct dev_addr_list and list maintenance functions
based on dev_mc_list and the related functions. This will be
used by follow-up patches for both multicast and secondary
unicast addresses.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies device can do any arbitrary protocol.
This patch:
* adds NETIF_F_IPV6_CSUM for those devices
* fixes bnx2 and tg3 devices that need it
* add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
* fixes assumptions about NETIF_F_ALL_CSUM in nat
* adjusts bridge union of checksumming computation
Signed-off-by: David S. Miller <davem@davemloft.net>
It is clean-up for XFRM type modules and adds aliases with its
protocol:
ESP, AH, IPCOMP, IPIP and IPv6 for IPsec
ROUTING and DSTOPTS for MIPv6
It is almost the same thing as XFRM mode alias, but it is added
new defines XFRM_PROTO_XXX for preprocessing since some protocols
are defined as enum.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Acked-by: Ingo Oeser <netdev@axxeo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes MIPv6 loadable module named "mip6".
Here is a modprobe.conf(5) example to load it automatically
when user application uses XFRM state for MIPv6:
alias xfrm-type-10-43 mip6
alias xfrm-type-10-60 mip6
Some MIPv6 feature is not included by this modular, however,
it should not be affected to other features like either IPsec
or IPv6 with and without the patch.
We may discuss XFRM, MH (RAW socket) and ancillary data/sockopt
separately for future work.
Loadable features:
* MH receiving check (to send ICMP error back)
* RO header parsing and building (i.e. RH2 and HAO in DSTOPTS)
* XFRM policy/state database handling for RO
These are NOT covered as loadable:
* Home Address flags and its rule on source address selection
* XFRM sub policy (depends on its own kernel option)
* XFRM functions to receive RO as IPv6 extension header
* MH sending/receiving through raw socket if user application
opens it (since raw socket allows to do so)
* RH2 sending as ancillary data
* RH2 operation with setsockopt(2)
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kill unnecessary CONFIG_IPV6_MIP6.
o It is redundant for RAW socket to keep MH out with the config then
it can handle any protocol.
o Clean-up at AH.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a nested compat attribute type that can be used to convert
attributes that contain a structure to nested attributes in a
backwards compatible way.
The attribute looks like this:
struct {
[ compat contents ]
struct rtattr {
.rta_len = total size,
.rta_type = type,
} rta;
struct old_structure struct;
[ nested top-level attribute ]
struct rtattr {
.rta_len = nest size,
.rta_type = type,
} nest_attr;
[ optional 0 .. n nested attributes ]
struct rtattr {
.rta_len = private attribute len,
.rta_type = private attribute typ,
} nested_attr;
struct nested_data data;
};
Since both userspace and kernel deal correctly with attributes that are
larger than expected old versions will just parse the compat part and
ignore the rest.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a nested compat attribute type that can be used to convert
attributes that contain a structure to nested attributes in a
backwards compatible way.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently NAT (and others) that want to modify cloned skbs copy them,
even if in the vast majority of cases its not necessary because the
skb is a clone made by TCP and the portion NAT wants to modify is
actually writable because TCP release the header reference before
cloning.
The problem is that there is no clean way for NAT to find out how
long the writable header area is, so this patch introduces skb->hdr_len
to hold this length. When a headerless skb is cloned skb->hdr_len
is set to the current headroom, for regular clones it is copied from
the original. A new function skb_clone_writable(skb, len) returns
whether the skb is writable up to len bytes from skb->data. To avoid
enlarging the skb the mac_len field is reduced to 16 bit and the
new hdr_len field is put in the remaining 16 bit.
I've done a few rough benchmarks of NAT (not with this exact patch,
but a very similar one). As expected it saves huge amounts of system
time in case of sendfile, bringing it down to basically the same
amount as without NAT, with sendmsg it only helps on loopback,
probably because of the large MTU.
Transmit a 1GB file using sendfile/sendmsg over eth0/lo with and
without NAT:
- sendfile eth0, no NAT: sys 0m0.388s
- sendfile eth0, NAT: sys 0m1.835s
- sendfile eth0: NAT + path: sys 0m0.370s (~ -80%)
- sendfile lo, no NAT: sys 0m0.258s
- sendfile lo, NAT: sys 0m2.609s
- sendfile lo, NAT + patch: sys 0m0.260s (~ -90%)
- sendmsg eth0, no NAT: sys 0m2.508s
- sendmsg eth0, NAT: sys 0m2.539s
- sendmsg eth0, NAT + patch: sys 0m2.445s (no change)
- sendmsg lo, no NAT: sys 0m2.151s
- sendmsg lo, NAT: sys 0m3.557s
- sendmsg lo, NAT + patch: sys 0m2.159s (~ -40%)
I expect other users can see a similar performance improvement,
packet mangling iptables targets, ipip and ip_gre come to mind ..
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This provides a reusable time difference function which returns the difference in
microseconds, as often used in the DCCP code.
Commiter note: renamed ktime_delta to ktime_us_delta and put it in ktime.h.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Keep track of the number of configured ingress/egress QoS mappings to
avoid iteration while calculating the netlink attribute size.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb->priority has only 32 bits and even VLAN uses 32 bit values in its API.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add rtnetlink API for creating, changing and deleting software devices.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It hasn't "summed" anything in over 7 years, and it's
just a straight mempcy ala skb_copy_to_linear_data()
so just get rid of it.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the RFCOMM TTY release process so that the TTY is kept
on the list until it is really freed. A new device flag is used to keep
track of released TTYs.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch enhances TIPC's stream socket send routine so that
it avoids transmitting data in chunks that require fragmentation
and reassembly, thereby improving performance at both the
sending and receiving ends of the connection.
The "maximum packet size" hint that records MTU info allows
the socket to decide how big a chunk it should send; in the
event that the hint has become stale, fragmentation may still
occur, but the data will be passed correctly and the hint will
be updated in time for the following send. Note: The 66060 byte
pseudo-MTU used for intra-node connections requires the send
routine to perform an additional check to ensure it does not
exceed TIPC"s limit of 66000 bytes of user data per chunk.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IB CM should include the HCA ACK delay when calculating the local
ACK timeout value to use for RC QPs. If the HCA ACK delay is large
enough relative to the packet life time, then if it is not taken into
account, the calculated timeout value ends up being too small, which
can result in "retry exceeded" errors.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
MADs sent to the SA should use the the default P_Key (0x7fff/0xffff).
There's no requirement that the default P_Key is stored at index 0 in
the local P_Key table, so add code to the sa_query module to look up
the index of the default P_Key when creating an address handle for the
SA (which is done any time the P_Key table might change), and use this
index for all SA queries.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Most drivers must handle fragmented HCI data packets and events. This
patch adds a generic function for their reassembly to the Bluetooth
core layer and thus allows to shrink the complexity of the drivers.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use the destination address of the original NLM request as the
source address in callbacks to the client.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In addition to binding to a local privileged port the NFS client should
allow binding to a specific local address. This is used by the server
for callbacks. The patch adds the necessary interface.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Save the destination address of an incoming request over TCP like is
done already for UDP. It is necessary later for callbacks by the server.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cleanup argument passing to functions for creating an RPC transport.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Prior to David Howell's mount changes in 2.6.18, users who mounted
different directories which happened to be from the same filesystem on the
server would get different super blocks, and hence could choose different
mount options. As long as there were no hard linked files that crossed from
one subtree to another, this was quite safe.
Post the changes, if the two directories are on the same filesystem (have
the same 'fsid'), they will share the same super block, and hence the same
mount options.
Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks, as
was previously possible. It is still up to the user to ensure that there
are no cache coherency issues when doing this, however the default
behaviour will be to share super blocks whenever two paths result in
the same fsid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In preparation for supporting NFSv2 and NFSv3 mount option handling in the
kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS
client, instead of built only when CONFIG_ROOT_NFS is enabled.
In addition, we also replace the "struct sockaddr_in *" argument with
something more generic, to help support IPv6 at some later point.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up, for consistency. Rename rpcb_getport as rpcb_getport_async, to
match the naming scheme of rpcb_getport_sync.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In preparation for handling NFS mount option parsing in the kernel,
rename rpcb_getport_external as rpcb_get_port_sync, and make it available
always (instead of only when CONFIG_ROOT_NFS is enabled).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Note to self: fix up /usr/sbin/rpcdebug too
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use the same file size limit that lockd uses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently we force a synchronous call to __nfs_revalidate_inode() in
nfs_inode_set_delegation(). This not only ensures that we cannot call
nfs_inode_set_delegation from an asynchronous context, but it also slows
down any call to open().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
There is no justification for keeping a special spinlock for the exclusive
use of the NFS writeback code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We should almost always be deferencing the rpc_auth struct by means of the
credential's cr_auth field instead of the rpc_clnt->cl_auth anyway. Fix up
that historical mistake, and remove the macro that propagated it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Does a NULL RPC call and returns a pointer to the resulting rpc_task. The
call may be either synchronous or asynchronous.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>