Граф коммитов

120254 Коммитов

Автор SHA1 Сообщение Дата
Johannes Berg 262eb9b223 cfg80211: split wext compatibility to separate header
A lot of drivers erroneously use wext constants
and don't notice since cfg80211.h includes them.
Make this more split up so drivers needing wext
compatibility from cfg80211 need to explicitly
include that from cfg80211-wext.h.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-08 14:24:59 -04:00
John W. Linville a5d5a91477 Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2011-08-03 09:18:21 -04:00
Wey-Yi Guy f352910822 iwlagn: 5000 do not support idle mode
5000 series has issue supporting power save idle mode:

commit	9dc2153315

iwlwifi: always support idle mode for agn devices

For agn devices, always support idle mode which help power
consumption in idle unassociated state.

the above changes cause 5000 become not stable when power management is "on"

http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2312

Cc: stable@kernel.org #2.6.39, #3.0.0
Reported-by: Devin J Pohly <djpohly+iwl@gmail.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-02 13:50:56 -04:00
Stanislaw Gruszka 00898a4726 rt2x00: fix usage of NULL queue
We may call rt2x00queue_pause_queue(queue) with queue == NULL. Bug
was introduced by commit 62fe778412
"rt2x00: Fix stuck queue in tx failure case" .

Cc: stable@kernel.org # 3.0+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-02 13:48:14 -04:00
Helmut Schaa 449f94eadc rt2x00: Fix compilation without CONFIG_RT2X00_LIB_CRYPTO
This was introduced by commit
77b5621bac (rt2x00: Don't use queue entry
as parameter when creating TX descriptor.)

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-02 13:48:14 -04:00
Emmanuel Grumbach cc1a93e68f iwlagn: sysfs couldn't find the priv pointer
This bug has been introduced by:
d593411084
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date:   Mon Jul 11 10:48:51 2011 +0300

    iwlagn: simplify the bus architecture

Revert part of the buggy patch: dev_get_drvdata will now return
iwl_priv as it did before the patch.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Tested-by: Daniel Halperin <dhalperi@cs.washington.edu>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-02 13:46:43 -04:00
Stanislaw Gruszka b52398b6e4 rt2x00: rt2800: fix zeroing skb structure
We should clear skb->data not skb itself. Bug was introduced by:
commit 0b8004aa12 "rt2x00: Properly
reserve room for descriptors in skbs".

Cc: stable@kernel.org # 2.6.36+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-01 13:46:48 -04:00
Larry Finger b6b67df3f2 rtlwifi: Fix kernel oops on ARM SOC
This driver uses information from the self member of the pci_bus struct to
get information regarding the bridge to which the PCIe device is attached.
Unfortunately, this member is not established on all architectures, which
leads to a kernel oops.

Skipping the entire block that uses the self member to determine the bridge
vendor will only affect RTL8192DE devices as that driver sets the ASPM support
flag differently when the bridge vendor is Intel. If the self member is
available, there is no functional change.

This patch fixes Bugzilla No. 40212.

Reported-by: Hubert Liao <liao.hubertt@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@kernel.org> [back to 2.6.38]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-01 13:46:47 -04:00
Stanislaw Gruszka d4930086bd ath9k: skip ->config_pci_powersave() if PCIe port has ASPM disabled
We receive many bug reports about system hang during suspend/resume
when ath9k driver is in use. Adrian Chadd remarked that this problem
happens on systems that have ASPM disabled.

To do not hit the bug, skip doing ->config_pci_powersave magic if PCIe
downstream port device, which ath9k device is connected to, has ASPM
disabled.

Bug was introduced by:

commit 53bc7aa08b
Author: Vivek Natarajan <vnatarajan@atheros.com>
Date:   Mon Apr 5 14:48:04 2010 +0530

    ath9k: Add support for newer AR9285 chipsets.

Patch should address:
https://bugzilla.kernel.org/show_bug.cgi?id=37462
https://bugzilla.kernel.org/show_bug.cgi?id=37082
https://bugzilla.redhat.com/show_bug.cgi?id=697157

however I did not receive confirmation about that, except from Camilo
Mesias, whose system stops hang regularly with this patch (but still
hangs from time to time, but this is probably some other bug).

Tested-by: Camilo Mesias <camilo@mesias.co.uk>
Cc: stable@kernel.org # 2.6.35+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-01 13:46:46 -04:00
Stanislaw Gruszka 17e859a899 iwlegacy: set tx power after rxon_assoc
If settings of tx power was deferred during scan or changing channel we
have to setup them during commit rxon. Fix problem on 3945 (4965 already
has this fix).

Optimize code to apply tx settings only when tx power was actually
changed.

Cc: stable@kernel.org # 2.6.39+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-01 13:46:45 -04:00
Felix Fietkau c1227340ca ath9k: initialize tx chainmask before testing channel tx power values
With an uninitialized chainmask, the per-channel power will only contain
the power limits for a single chain instead of the combined tx power.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-08-01 13:46:44 -04:00
Linus Torvalds d5eab9152a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits)
  tg3: Remove 5719 jumbo frames and TSO blocks
  tg3: Break larger frags into 4k chunks for 5719
  tg3: Add tx BD budgeting code
  tg3: Consolidate code that calls tg3_tx_set_bd()
  tg3: Add partial fragment unmapping code
  tg3: Generalize tg3_skb_error_unmap()
  tg3: Remove short DMA check for 1st fragment
  tg3: Simplify tx bd assignments
  tg3: Reintroduce tg3_tx_ring_info
  ASIX: Use only 11 bits of header for data size
  ASIX: Simplify condition in rx_fixup()
  Fix cdc-phonet build
  bonding: reduce noise during init
  bonding: fix string comparison errors
  net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared
  net: add IFF_SKB_TX_SHARED flag to priv_flags
  net: sock_sendmsg_nosec() is static
  forcedeth: fix vlans
  gianfar: fix bug caused by 87c288c6e9
  gro: Only reset frag0 when skb can be pulled
  ...
2011-07-28 05:58:19 -07:00
Linus Torvalds 6140333d36 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md: (75 commits)
  md/raid10: handle further errors during fix_read_error better.
  md/raid10: Handle read errors during recovery better.
  md/raid10: simplify read error handling during recovery.
  md/raid10: record bad blocks due to write errors during resync/recovery.
  md/raid10:  attempt to fix read errors during resync/check
  md/raid10:  Handle write errors by updating badblock log.
  md/raid10: clear bad-block record when write succeeds.
  md/raid10: avoid writing to known bad blocks on known bad drives.
  md/raid10 record bad blocks as needed during recovery.
  md/raid10: avoid reading known bad blocks during resync/recovery.
  md/raid10 - avoid reading from known bad blocks - part 3
  md/raid10: avoid reading from known bad blocks - part 2
  md/raid10: avoid reading from known bad blocks - part 1
  md/raid10: Split handle_read_error out from raid10d.
  md/raid10: simplify/reindent some loops.
  md/raid5: Clear bad blocks on successful write.
  md/raid5.  Don't write to known bad block on doubtful devices.
  md/raid5: write errors should be recorded as bad blocks if possible.
  md/raid5: use bad-block log to improve handling of uncorrectable read errors.
  md/raid5: avoid reading from known bad blocks.
  ...
2011-07-28 05:50:27 -07:00
Matt Carlson a051294423 tg3: Remove 5719 jumbo frames and TSO blocks
The A0 revision of this chip is the only device that requires these
features to be disabled.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:32 -07:00
Matt Carlson e31aa98706 tg3: Break larger frags into 4k chunks for 5719
The 5719 has bug where RDMAs larger than 4k can cause problems.  This
patch works around the problem by dividing larger DMA requests into
something the hardware can handle.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:32 -07:00
Matt Carlson 84b67b27e9 tg3: Add tx BD budgeting code
As the driver breaks large skb fragments into smaller submissions to the
hardware, there is a new danger that BDs might get exhausted before all
fragments have been mapped.  This patch adds code to make sure tx BDs
aren't oversubscribed and flag the condition if it happens.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:32 -07:00
Matt Carlson d1a3b7377d tg3: Consolidate code that calls tg3_tx_set_bd()
This patch consolidates all code that populates tx BDs into a single
routine.  Setting tx BDs needs to be more carefully controlled to see if
workarounds need to be applied.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Matt Carlson e01ee14d49 tg3: Add partial fragment unmapping code
The following patches are going to break skb fragments into smaller
sizes.  This patch attempts to make the change easier to digest by only
addressing the skb teardown portion.

The patch modifies the driver to skip over any BDs that have a flag set
that indicates the BD isn't the beginning of an skb fragment.  Such BDs
were a result of segmentation and do not need a pci_unmap_page() call.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Matt Carlson 0d681b27b0 tg3: Generalize tg3_skb_error_unmap()
In the following patches, unmapping skb fragments will get just as
complicated as mapping them.  This patch generalizes
tg3_skb_error_unmap() and makes it the one-stop-shop for skb unmapping.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Matt Carlson 13350ea78b tg3: Remove short DMA check for 1st fragment
The first fragment of an skb should always be greater than 8 bytes.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Matt Carlson 92cd3a17ce tg3: Simplify tx bd assignments
In the following patches, the process the driver will use to assign skb
fragments to transmit BDs will get more complicated.  To prepare for
that new code, this patch seeks to simplify how transmit BDs are
populated.  It does this by separating the code that assigns the BD
members from the logic that controls how the fields are set.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Matt Carlson df8944cf5c tg3: Reintroduce tg3_tx_ring_info
The following patches will require the use of an additional flag in the
ring_info structure.  The use of this flag is tx path specific, so this
patch defines a specialized ring_info structure.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Marek Vasut bca0beb936 ASIX: Use only 11 bits of header for data size
The AX88772B uses only 11 bits of the header for the actual size. The other bits
are used for something else. This causes dmesg full of messages:

	asix_rx_fixup() Bad Header Length

This patch trims the check to only 11 bits. I believe on older chips, the
remaining 5 top bits are unused.

Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Marek Vasut bc466e678d ASIX: Simplify condition in rx_fixup()
Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Chris Clayton a0295a3b67 Fix cdc-phonet build
Try to send to correct address this time!

----------  Forwarded Message  ----------

Subject: [PATCH] Fix cdc-phonet build
Date: Saturday 23 Jul 2011
From: Chris Clayton <chris2553@googlemail.com>
To: linux-net@vger.kernel.org

cdc-phonet does not presently build on linux-3.0 because there is no entry for it in
drivers/net/Makefile. This patch adds that entry.

Signed-off-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:31 -07:00
Andy Gospodarek b2730f4f84 bonding: reduce noise during init
On Tue, Jul 26, 2011 at 05:40:27PM -0700, Joe Perches wrote:
> On Tue, 2011-07-26 at 17:37 -0700, Jay Vosburgh wrote:
> > Joe Perches <joe@perches.com> wrote:
> > >I'd prefer you don't separate the format string
> > >into multiple pieces.
> > Why not?  To me, it looks easier to read split into sections
> > that don't wrap lines.
>
> Harder to grep for a dmesg and the
> defect rate of these split formats is
> typically higher than single strings
> because of bad spacing between string
> segments.
>

I noticed that you took some time back in late 2009 to 'consolidate' the
split format-strings present in the bonding driver at the time and I've
decided I'm fine to leave them the way they are.  The main point of my
patch was to change the output and I would like to get that included.
Here is my updated patch...

Subject: [PATCH net-next-2.6 v2] bonding: reduce noise during init

Many are using sysfs to configure bonding rather than module options, so
there is no need for bonding to throw this warning in normal cases.

Keep the message around when debugging is enabled as it might be useful
for someone desperate enough to enable debugging, but eliminate it
otherwise.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
Andy Gospodarek f4bb2e9c4f bonding: fix string comparison errors
When a bond contains a device where one name is the subset of another
(eth1 and eth10, for example), one cannot properly set the primary
device or the currently active device.

This was reported and based on work by Takuma Umeya.  I also verified
the problem and tested that this fix resolves it.

V2: A few did not like the the current code or my changes, so I
refactored bonding_store_primary and bonding_store_active_slave to be a
bit cleaner, dropped the use of strnicmp since we did not really need
the comparison to be case insensitive, and formatted the input string
from sysfs so a comparison to IFNAMSIZ could be used.

I also discovered an error in bonding_store_active_slave that would
modify bond->primary_slave rather than bond->curr_active_slave before
forcing the bonding driver to choose a new active slave.

V3: Actually sending the proper patch....

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Takuma Umeya <tumeya@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
Neil Horman 550fd08c2c net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared
After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs.  There are a handful of drivers that violate this assumption of
course, and need to be fixed up.  This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: "John W. Linville" <linville@tuxdriver.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
Jiri Pirko 0891b0e089 forcedeth: fix vlans
For some reason, when rxaccel is disabled, NV_RX3_VLAN_TAG_PRESENT is
still set and some pseudorandom vids appear. So check for
NETIF_F_HW_VLAN_RX as well. Also set correctly hw_features and set vlan
mode on probe.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
Sebastian Pöhn b852b72087 gianfar: fix bug caused by 87c288c6e9
commit 87c288c6e9 "gianfar: do vlan cleanup" has two issues:
# permutation of rx and tx flags
# enabling vlan tag insertion by default (this leads to unusable connections on some configurations)

If VLAN insertion is requested (via ethtool) it will be set at an other point ...

Signed-off-by: Sebastian Poehn <sebastian.poehn@belden.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
David S. Miller b49179c071 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-07-27 22:18:47 -07:00
Linus Torvalds 95b6886526 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (54 commits)
  tpm_nsc: Fix bug when loading multiple TPM drivers
  tpm: Move tpm_tis_reenable_interrupts out of CONFIG_PNP block
  tpm: Fix compilation warning when CONFIG_PNP is not defined
  TOMOYO: Update kernel-doc.
  tpm: Fix a typo
  tpm_tis: Probing function for Intel iTPM bug
  tpm_tis: Fix the probing for interrupts
  tpm_tis: Delay ACPI S3 suspend while the TPM is busy
  tpm_tis: Re-enable interrupts upon (S3) resume
  tpm: Fix display of data in pubek sysfs entry
  tpm_tis: Add timeouts sysfs entry
  tpm: Adjust interface timeouts if they are too small
  tpm: Use interface timeouts returned from the TPM
  tpm_tis: Introduce durations sysfs entry
  tpm: Adjust the durations if they are too small
  tpm: Use durations returned from TPM
  TOMOYO: Enable conditional ACL.
  TOMOYO: Allow using argv[]/envp[] of execve() as conditions.
  TOMOYO: Allow using executable's realpath and symlink's target as conditions.
  TOMOYO: Allow using owner/group etc. of file objects as conditions.
  ...

Fix up trivial conflict in security/tomoyo/realpath.c
2011-07-27 19:26:38 -07:00
NeilBrown 58c54fcca3 md/raid10: handle further errors during fix_read_error better.
If we find more read/write errors we should record a bad block before
failing the device.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:25 +10:00
NeilBrown 5e5702898e md/raid10: Handle read errors during recovery better.
Currently when we get a read error during recovery, we simply abort
the recovery.

Instead, repeat the read in page-sized blocks.
On successful reads, write to the target.
On read errors, record a bad block on the destination,
and only if that fails do we abort the recovery.

As we now retry reads we need to know where we read from.  This was in
bi_sector but that can be changed during a read attempt.
So store the correct from_addr and to_addr in the r10_bio for later
access.


Signed-off-by: NeilBrown<neilb@suse.de>
2011-07-28 11:39:25 +10:00
NeilBrown e684e41db3 md/raid10: simplify read error handling during recovery.
If a read error is detected during recovery the code currently
fails the read device.
This isn't really necessary.  recovery_request_write will signal
a write error to end_sync_write and it will record a write
error on the destination device which will record a bad block
there or kick it from the array.

So just remove this call to do md_error.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:25 +10:00
NeilBrown 1a0b7cd826 md/raid10: record bad blocks due to write errors during resync/recovery.
If we get a write error during resync/recovery don't fail the device
but instead record a bad block.  If that fails we can then fail the
device.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:25 +10:00
NeilBrown f84ee364dd md/raid10: attempt to fix read errors during resync/check
We already attempt to fix read errors found during normal IO
and a 'repair' process.
It is best to try to repair them at any time they are found,
so move a test so that during sync and check a read error will
be corrected by over-writing with good data.

If both (all) devices have known bad blocks in the sync section we
won't try to fix even though the bad blocks might not overlap.  That
should be considered later.

Also if we hit a read error during recovery we don't try to fix it.
It would only be possible to fix if there were at least three copies
of data, which is not very common with RAID10.  But it should still
be considered later.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:25 +10:00
NeilBrown bd870a16c5 md/raid10: Handle write errors by updating badblock log.
When we get a write error (in the data area, not in metadata),
update the badblock log rather than failing the whole device.

As the write may well be many blocks, we trying writing each
block individually and only log the ones which fail.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:24 +10:00
NeilBrown 749c55e942 md/raid10: clear bad-block record when write succeeds.
If we succeed in writing to a block that was recorded as
being bad, we clear the bad-block record.

This requires some delayed handling as the bad-block-list update has
to happen in process-context.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:24 +10:00
NeilBrown d4432c23be md/raid10: avoid writing to known bad blocks on known bad drives.
Writing to known bad blocks on drives that have seen a write error
is asking for trouble.  So try to avoid these blocks.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:24 +10:00
NeilBrown e875ecea26 md/raid10 record bad blocks as needed during recovery.
When recovering one or more devices, if all the good devices have
bad blocks we should record a bad block on the device being rebuilt.

If this fails, we need to abort the recovery.

To ensure we don't think that we aborted later than we actually did,
we need to move the check for MD_RECOVERY_INTR earlier in md_do_sync,
in particular before mddev->curr_resync is updated.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:24 +10:00
NeilBrown 40c356ce5a md/raid10: avoid reading known bad blocks during resync/recovery.
During resync/recovery limit the size of the request to avoid
reading into a bad block that does not start at-or-before the current
read address.

Similarly if there is a bad block at this address, don't allow the
current request to extend beyond the end of that bad block.

Now that we don't ever read from known bad blocks, it is safe to allow
devices with those blocks into the array.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:24 +10:00
NeilBrown 8dbed5cebd md/raid10 - avoid reading from known bad blocks - part 3
When attempting to repair a read error, don't read from
devices with a known bad block.

As we are only reading PAGE_SIZE blocks, we don't try to
narrow down to smaller regions in the hope that only part of this
page is bad - it isn't worth the effort.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:24 +10:00
NeilBrown 7399c31bc9 md/raid10: avoid reading from known bad blocks - part 2
When redirecting a read error to a different device, we must
again avoid bad blocks and possibly split the request.

Spin_lock typo fixed thanks to Dan Carpenter <error27@gmail.com>

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:23 +10:00
NeilBrown 856e08e237 md/raid10: avoid reading from known bad blocks - part 1
This patch just covers the basic read path:
 1/ read_balance needs to check for badblocks, and return not only
    the chosen slot, but also how many good blocks are available
    there.
 2/ read submission must be ready to issue multiple reads to
    different devices as different bad blocks on different devices
    could mean that a single large read cannot be served by any one
    device, but can still be served by the array.
    This requires keeping count of the number of outstanding requests
    per bio.  This count is stored in 'bi_phys_segments'

On read error we currently just fail the request if another target
cannot handle the whole request.  Next patch refines that a bit.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:23 +10:00
NeilBrown 560f8e5532 md/raid10: Split handle_read_error out from raid10d.
raid10d() is too big and is about to get bigger, so split
handle_read_error() out as a separate function.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:23 +10:00
NeilBrown 1294b9c973 md/raid10: simplify/reindent some loops.
When a loop ends with a large if, it can be neater to change the
if to invert the condition and just 'continue'.
Then the body of the if can be indented to a lower level.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:23 +10:00
NeilBrown b84db560ea md/raid5: Clear bad blocks on successful write.
On a successful write to a known bad block, flag the sh
so that raid5d can remove the known bad block from the list.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:23 +10:00
NeilBrown 73e92e51b7 md/raid5. Don't write to known bad block on doubtful devices.
If a device has seen write errors, don't write to any known
bad blocks on that device.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:22 +10:00
NeilBrown bc2607f393 md/raid5: write errors should be recorded as bad blocks if possible.
When a write error is detected, don't mark the device as failed
immediately but rather record the fact for handle_stripe to deal with.

Handle_stripe then attempts to record a bad block.  Only if that fails
does the device get marked as faulty.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:22 +10:00
NeilBrown 7f0da59bdc md/raid5: use bad-block log to improve handling of uncorrectable read errors.
If we get an uncorrectable read error - record a bad block rather than
failing the device.
And if these errors (which may be due to known bad blocks) cause
recovery to be impossible, record a bad block on the recovering
devices, or abort the recovery.

As we might abort a recovery without failing a device we need to teach
RAID5 about recovery_disabled handling.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:22 +10:00
NeilBrown 31c176ecdf md/raid5: avoid reading from known bad blocks.
There are two times that we might read in raid5:
1/ when a read request fits within a chunk on a single
   working device.
   In this case, if there is any bad block in the range of
   the read, we simply fail the cache-bypass read and
   perform the read though the stripe cache.

2/ when reading into the stripe cache.  In this case we
   mark as failed any device which has a bad block in that
   strip (1 page wide).
   Note that we will both avoid reading and avoid writing.
   This is correct (as we will never read from the block, there
   is no point writing), but not optimal (as writing could 'fix'
   the error) - that will be addressed later.

If we have not seen any write errors on the device yet, we treat a bad
block like a recent read error.  This will encourage an attempt to fix
the read error which will either generate a write error, or will
ensure good data is stored there.  We don't yet forget the bad block
in that case.  That comes later.

Now that we honour bad blocks when reading we can allow devices with
bad blocks into the array.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:39:22 +10:00
NeilBrown 62096bce23 md/raid1: factor several functions out or raid1d()
raid1d is too big with several deep branches.
So separate them out into their own functions.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:38:13 +10:00
NeilBrown 3a9f28a511 md/raid1: improve handling of read failure during recovery.
If we cannot read a block from anywhere during recovery, there is
now a better approach than just giving up.
We can record a bad block on each device and keep going - being
careful not to clear the bad block when a write succeeds as it might -
it will be a write of incorrect data.

We have now reached the state where - for raid1 - we only call
md_error if md_set_badblocks has failed.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:33:42 +10:00
NeilBrown d8f05d2995 md/raid1: record badblocks found during resync etc.
If we find a bad block while writing as part of resync/recovery we
need to report that back to raid1d which must record the bad block,
or fail the device.

Similarly when fixing a read error, a further error should just
record a bad block if possible rather than failing the device.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:33:00 +10:00
NeilBrown cd5ff9a16f md/raid1: Handle write errors by updating badblock log.
When we get a write error (in the data area, not in metadata),
update the badblock log rather than failing the whole device.

As the write may well be many blocks, we trying writing each
block individually and only log the ones which fail.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:32:41 +10:00
NeilBrown 2ca68f5ed7 md/raid1: store behind-write pages in bi_vecs.
When performing write-behind we allocate pages to store the data
during write.
Previously we just keep a list of pages.  Now we keep a list of
bi_vec which includes offset and size.
This means that the r1bio has complete information to create a new
bio which will be needed for retrying after write errors.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:32:10 +10:00
NeilBrown 4367af5561 md/raid1: clear bad-block record when write succeeds.
If we succeed in writing to a block that was recorded as
being bad, we clear the bad-block record.

This requires some delayed handling as the bad-block-list update has
to happen in process-context.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:31:49 +10:00
NeilBrown 1f68f0c4b6 md/raid1: avoid writing to known-bad blocks on known-bad drives.
If we have seen any write error on a drive, then don't write to
any known-bad blocks on that drive.
If necessary, we divide the write request up into pieces just
like we do for reads, so each piece is either all written or
all not written to any given drive.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:31:48 +10:00
NeilBrown de393cdea6 md: make it easier to wait for bad blocks to be acknowledged.
It is only safe to choose not to write to a bad block if that bad
block is safely recorded in metadata - i.e. if it has been
'acknowledged'.

If it hasn't we need to wait for the acknowledgement.

We support that using rdev->blocked wait and
md_wait_for_blocked_rdev by introducing a new device flag
'BlockedBadBlock'.

This flag is only advisory.
It is cleared whenever we acknowledge a bad block, so that a waiter
can re-check the particular bad blocks that it is interested it.

It should be set by a caller when they find they need to wait.
This (set after test) is inherently racy, but as
md_wait_for_blocked_rdev already has a timeout, losing the race will
have minimal impact.

When we clear "Blocked" was also clear "BlockedBadBlocks" incase it
was set incorrectly (see above race).

We also modify the way we manage 'Blocked' to fit better with the new
handling of 'BlockedBadBlocks' and to make it consistent between
externally managed and internally managed metadata.   This requires
that each raidXd loop checks if the metadata needs to be written and
triggers a write (md_check_recovery) if needed.  Otherwise a queued
write request might cause raidXd to wait for the metadata to write,
and only that thread can write it.

Before writing metadata, we set FaultRecorded for all devices that
are Faulty, then after writing the metadata we clear Blocked for any
device for which the Fault was certainly Recorded.

The 'faulty' device flag now appears in sysfs if the device is faulty
*or* it has unacknowledged bad blocks.  So user-space which does not
understand bad blocks can continue to function correctly.
User space which does, should not assume a device is faulty until it
sees the 'faulty' flag, and then sees the list of unacknowledged bad
blocks is empty.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:31:48 +10:00
NeilBrown d7a9d443bc md: add 'write_error' flag to component devices.
If a device has ever seen a write error, we will want to handle
known-bad-blocks differently.
So create an appropriate state flag and export it via sysfs.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:31:48 +10:00
NeilBrown 06f603851f md/raid1: avoid reading known bad blocks during resync
When performing resync/etc, keep the size of the request
small enough that it doesn't overlap any known bad blocks.
Devices with badblocks at the start of the request are completely
excluded.
If there is nowhere to read from due to bad blocks, record
a bad block on each target device.

Now that we never read from known-bad-blocks we can allow devices with
known-bad-blocks into a RAID1.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:31:48 +10:00
NeilBrown d2eb35acfd md/raid1: avoid reading from known bad blocks.
Now that we have a bad block list, we should not read from those
blocks.
There are several main parts to this:
  1/ read_balance needs to check for bad blocks, and return not only
     the chosen device, but also how many good blocks are available
     there.
  2/ fix_read_error needs to avoid trying to read from bad blocks.
  3/ read submission must be ready to issue multiple reads to
     different devices as different bad blocks on different devices
     could mean that a single large read cannot be served by any one
     device, but can still be served by the array.
     This requires keeping count of the number of outstanding requests
     per bio.  This count is stored in 'bi_phys_segments'
  4/ retrying a read needs to also be ready to submit a smaller read
     and queue another request for the rest.

This does not yet handle bad blocks when reading to perform resync,
recovery, or check.

'md_trim_bio' will also be used for RAID10, so put it in md.c and
export it.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:31:48 +10:00
NeilBrown 9f2f383078 md: Disable bad blocks and v0.90 metadata.
v0.90 metadata cannot record bad blocks, so when loading metadata
for such a device, set shift to -1.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:31:47 +10:00
NeilBrown 2699b67223 md: load/store badblock list from v1.x metadata
Space must have been allocated when array was created.
A feature flag is set when the badblock list is non-empty, to
ensure old kernels don't load and trust the whole device.

We only update the on-disk badblocklist when it has changed.
If the badblocklist (or other metadata) is stored on a bad block, we
don't cope very well.

If metadata has no room for bad block, flag bad-blocks as disabled,
and do the same for 0.90 metadata.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 11:31:47 +10:00
NeilBrown 34b343cff4 md: don't allow arrays to contain devices with bad blocks.
As no personality understand bad block lists yet, we must
reject any device that is known to contain bad blocks.
As the personalities get taught, these tests can be removed.

This only applies to raid1/raid5/raid10.
For linear/raid0/multipath/faulty the whole concept of bad blocks
doesn't mean anything so there is no point adding the checks.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:31:47 +10:00
NeilBrown 16c791a5af md/bad-block-log: add sysfs interface for accessing bad-block-log.
This can show the log (providing it fits in one page) and
allows bad blocks to be 'acknowledged' meaning that they
have safely been recorded in metadata.

Clearing bad blocks is not allowed via sysfs (except for
code testing).  A bad block can only be cleared when
a write to the block succeeds.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:31:47 +10:00
NeilBrown 2230dfe4cc md: beginnings of bad block management.
This the first step in allowing md to track bad-blocks per-device so
that we can fail individual blocks rather than the whole device.

This patch just adds a data structure for recording bad blocks, with
routines to add, remove, search the list.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-28 11:31:46 +10:00
NeilBrown a519b26dbe md: remove suspicious size_of()
When calling bioset_create we pass the size of the front_pad as
   sizeof(mddev)
which looks suspicious as mddev is a pointer and so it looks like a
common mistake where
   sizeof(*mddev)
was intended.
The size is actually correct as we want to store a pointer in the
front padding of the bios created by the bioset, so make the intent
more explicit by using
   sizeof(mddev_t *)

Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-28 07:56:24 +10:00
Linus Torvalds 91d41fdf31 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors
  kernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage
  iscsi-target: Add iSCSI fabric support for target v4.1
  iscsi: Add Serial Number Arithmetic LT and GT into iscsi_proto.h
  iscsi: Use struct scsi_lun in iscsi structs instead of u8[8]
  iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi
2011-07-27 13:21:40 -07:00
Linus Torvalds 70a3eff576 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (53 commits)
  Input: synaptics - fix reporting of min coordinates
  Input: tegra-kbc - enable key autorepeat
  Input: kxtj9 - fix locking typo in kxtj9_set_poll()
  Input: kxtj9 - fix bug in probe()
  Input: intel-mid-touch - remove pointless checking for variable 'found'
  Input: hp_sdc - staticize hp_sdc_kicker()
  Input: pmic8xxx-keypad - fix a leak of the IRQ during init failure
  Input: cy8ctmg110_ts - set reset_pin and irq_pin from platform data
  Input: cy8ctmg110_ts - constify i2c_device_id table
  Input: cy8ctmg110_ts - fix checking return value of i2c_master_send
  Input: lifebook - make dmi callback functions return 1
  Input: atkbd - make dmi callback functions return 1
  Input: gpio_keys - switch to using SIMPLE_DEV_PM_OPS
  Input: gpio_keys - add support for device-tree platform data
  Input: aiptek - remove double define
  Input: synaptics - set minimum coordinates as reported by firmware
  Input: synaptics - process button bits in AGM packets
  Input: synaptics - rename set_slot to be more descriptive
  Input: synaptics - fuzz position for touchpad with reduced filtering
  Input: synaptics - set resolution for MT_POSITION_X/Y axes
  ...
2011-07-27 09:24:56 -07:00
Daniel Morsing 8aae36cdf1 staging: brcm80211: Fix double include introduced by bad merge
A merge with Linus' tree added a double include of linux/interrupt.h.
Fix by removing one of the includes.

Signed-off-by: Daniel Morsing <daniel.morsing@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-27 09:23:35 -07:00
Dmitry Torokhov aa7eb8e78d Merge branch 'next' into for-linus 2011-07-27 00:54:47 -07:00
Linus Torvalds e371d46ae4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  merge fchmod() and fchmodat() guts, kill ancient broken kludge
  xfs: fix misspelled S_IS...()
  xfs: get rid of open-coded S_ISREG(), etc.
  vfs: document locking requirements for d_move, __d_move and d_materialise_unique
  omfs: fix (mode & S_IFDIR) abuse
  btrfs: S_ISREG(mode) is not mode & S_IFREG...
  ima: fmode_t misspelled as mode_t...
  pci-label.c: size_t misspelled as mode_t
  jffs2: S_ISLNK(mode & S_IFMT) is pointless
  snd_msnd ->mode is fmode_t, not mode_t
  v9fs_iop_get_acl: get rid of unused variable
  vfs: dont chain pipe/anon/socket on superblock s_inodes list
  Documentation: Exporting: update description of d_splice_alias
  fs: add missing unlock in default_llseek()
2011-07-26 18:30:20 -07:00
Jonathan Brassow 768e587e18 MD: generate an event when array sync is complete
This patch causes MD to generate an event (for device-mapper) when the
synchronization thread is reaped.  This is expected behavior for device-mapper.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:37 +10:00
Jonathan Brassow 3520fa4db7 MD bitmap: Revert DM dirty log hooks
Revert most of commit e384e58549
  md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.

MD should not need to use DM's dirty log - we decided to use md's
bitmaps instead.

Keeping the DIV_ROUND_UP clean-ups that were part of commit
e384e58549, however.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:37 +10:00
Jonathan Brassow 654e8b5abc MD: raid1 s/sysfs_notify_dirent/sysfs_notify_dirent_safe
If device-mapper creates a RAID1 array that includes devices to
be rebuilt, it will deref a NULL pointer when finished because
sysfs is not used by device-mapper instantiated RAID devices.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
NeilBrown 8cfa7b0f67 md/raid5: Avoid BUG caused by multiple failures.
While preparing to write a stripe we keep the parity block or blocks
locked (R5_LOCKED) - towards the end of schedule_reconstruction.

If the array is discovered to have failed before this write completes
we can leave those blocks LOCKED, and init_stripe will notice that a
free stripe still has a locked block and will complain.

So clear the R5_LOCKED flag in handle_failed_stripe, and demote the
'BUG' to a 'WARN_ON'.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim cbea21703b md/raid10: move rdev->corrected_errors counting
Read errors are considered to corrected if write-back and re-read
cycle is finished without further problems. Thus moving the rdev->
corrected_errors counting after the re-reading looks more reasonable
IMHO.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim ddd5115fe5 md/raid5: move rdev->corrected_errors counting
Read errors are considered to corrected if write-back and re-read
cycle is finished without further problems. Thus moving the rdev->
corrected_errors counting after the re-reading looks more reasonable
IMHO.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim 9d3d80113d md/raid1: move rdev->corrected_errors counting
Read errors are considered to corrected if write-back and re-read
cycle is finished without further problems. Thus moving the rdev->
corrected_errors counting after the re-reading looks more reasonable
IMHO. Also included a couple of whitespace fixes on sync_page_io().

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim 65a06f0674 md: get rid of unnecessary casts on page_address()
page_address() returns void pointer, so the casts can be removed.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
NeilBrown 700c721389 md/raid10: Improve decision on whether to fail a device with a read error.
Normally we would fail a device with a READ error.  However if doing
so causes the array to fail, it is better to leave the device
in place and just return the read error to the caller.

The current test for decide if the array will fail is overly
simplistic.
We have a function 'enough' which can tell if the array is failed or
not, so use it to guide the decision.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
NeilBrown 2bb77736ae md/raid10: Make use of new recovery_disabled handling
When we get a read error during recovery, RAID10 previously
arranged for the recovering device to appear to fail so that
the recovery stops and doesn't restart.  This is misleading and wrong.

Instead, make use of the new recovery_disabled handling and mark
the target device and having recovery disabled.

Add appropriate checks in add_disk and remove_disk so that devices
are removed and not re-added when recovery is disabled.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
NeilBrown 5389042ffa md: change managed of recovery_disabled.
If we hit a read error while recovering a mirror, we want to abort the
recovery without necessarily failing the disk - as having a disk this
a read error is better than not having an array at all.

Currently this is managed with a per-array flag "recovery_disabled"
and is only implemented for RAID1.  For RAID10 we will need finer
grained control as we might want to disable recovery for individual
devices separately.

So push more of the decision making into the personality.
'recovery_disabled' is now a 'cookie' which is copied when the
personality want to disable recovery and is changed when a device is
added to the array as this is used as a trigger to 'try recovery
again'.

This will allow RAID10 to get the control that it needs.

Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim a478a069b6 md: remove ro check in md_check_recovery()
Commit c89a8eee61 ("Allow faulty devices to be removed from a
readonly array.") added some work on ro array in the function,
but it couldn't be done since we didn't allow the ro array to be
handled from the beginning. Fix it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Namhyung Kim 36fad858a7 md: introduce link/unlink_rdev() helpers
There are places where sysfs links to rdev are handled
in a same way. Add the helper functions to consolidate
them.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Christian Dietrich 8bda470e8e md/raid: use printk_ratelimited instead of printk_ratelimit
As per printk_ratelimit comment, it should not be used.

Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
Akinobu Mita a0a02a7ad6 md: use proper little-endian bitops
Using __test_and_{set,clear}_bit_le() with ignoring its return value
can be replaced with __{set,clear}_bit_le().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
2011-07-27 11:00:36 +10:00
NeilBrown acfe726bdd md/raid5: finalise new merged handle_stripe.
handle_stripe5() and handle_stripe6() are now virtually identical.
So discard one and rename the other to 'analyse_stripe()'.

It always returns 0, so change it to 'void' and remove the 'done'
variable in handle_stripe().

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown 474af965fe md/raid5: move some more common code into handle_stripe
The RAID6 version of this code is usable for RAID5 providing:
  - we test "conf->max_degraded" rather than "2" as appropriate
  - we make sure s->failed_num[1] is meaningful (and not '-1')
    when s->failed > 1

The 'return 1' must become 'goto finish' in the new location.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown 84789554e9 md/raid5: move more common code into handle_stripe
Apart from 'prexor' which can only be set for RAID5, and
'qd_idx' which can only be meaningful for RAID6, these two
chunks of code are nearly the same.

So combine them into one adding a test to call either
handle_parity_checks5 or handle_parity_checks6 as appropriate.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown c8ac1803ff md/raid5: unite handle_stripe_dirtying5 and handle_stripe_dirtying6
RAID6 is only allowed to choose 'reconstruct-write' while RAID5 is
also allow 'read-modify-write'
Apart from this difference, handle_stripe_dirtying[56] are nearly
identical.  So resolve these differences and create just one function.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown 93b3dbce64 md/raid5: unite fetch_block5 and fetch_block6
Provided that ->failed_num[1] is not a valid device number (which is
easily achieved) fetch_block6 provides all the functionality of
fetch_block5.

So remove the latter and rename the former to simply "fetch_block".

Then handle_stripe_fill5 and handle_stripe_fill6 become the same and
can similarly be united.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown 5d35e09cae md/raid5: rearrange a test in fetch_block6.
Next patch will unite fetch_block5 and fetch_block6.
First I want to make the differences a little more clear.

For RAID6 if we are writing at all and there is a failed device, then
we need to load or compute every block so we can do a
reconstruct-write.
This case isn't needed for RAID5 - we will do a read-modify-write in
that case.
So make that test a separate test in fetch_block6 rather than merged
with two other tests.

Make a similar change in fetch_block5 so the one bit that is not
needed for RAID6 is clearly separate.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown c5a3100062 md/raid5: move more code into common handle_stripe
The difference between the RAID5 and RAID6 code here is easily
resolved using conf->max_degraded.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown 3687c06188 md/raid5: Move code for finishing a reconstruction into handle_stripe.
Prior to commit ab69ae12ce the code in handle_stripe5 and
handle_stripe6 to "Finish reconstruct operations initiated by the
expansion process" was identical.
That commit added an identical stanza of code to each function, but in
different places.  That was careless.

The raid5 code was correct, so move that out into handle_stripe and
remove raid6 version.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
NeilBrown 86c374ba9f md/raid5: Remove stripe_head_state arg from handle_stripe_expansion.
This arg is only used to differentiate between RAID5 and RAID6 but
that is not needed.  For RAID5, raid5_compute_sector will set qd_idx
to "~0" so j with certainly not equals qd_idx, so there is no need
for a guard on that condition.

So remove the guard and remove the arg from the declaration and
callers of handle_stripe_expansion.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
2011-07-27 11:00:36 +10:00
Linus Torvalds b0189cd087 Merge branch 'next/devel2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'next/devel2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: (47 commits)
  OMAP: Add debugfs node to show the summary of all clocks
  OMAP2+: hwmod: Follow the recommended PRCM module enable sequence
  OMAP2+: clock: allow per-SoC clock init code to prevent clockdomain calls from clock code
  OMAP2+: clockdomain: Add per clkdm lock to prevent concurrent state programming
  OMAP2+: PM: idle clkdms only if already in idle
  OMAP2+: clockdomain: add clkdm_in_hwsup()
  OMAP2+: clockdomain: Add 2 APIs to control clockdomain from hwmod framework
  OMAP: clockdomain: Remove redundant call to pwrdm_wait_transition()
  OMAP4: hwmod: Introduce the module control in hwmod control
  OMAP4: cm: Add two new APIs for modulemode control
  OMAP4: hwmod data: Add modulemode entry in omap_hwmod structure
  OMAP4: hwmod data: Add PRM context register offset
  OMAP4: prm: Remove deprecated functions
  OMAP4: prm: Replace warm reset API with the offset based version
  OMAP4: hwmod: Replace RSTCTRL absolute address with offset macros
  OMAP: hwmod: Wait the idle status to be disabled
  OMAP4: hwmod: Replace CLKCTRL absolute address with offset macros
  OMAP2+: hwmod: Init clkdm field at boot time
  OMAP4: hwmod data: Add clock domain attribute
  OMAP4: clock data: Add missing divider selection for auxclks
  ...
2011-07-26 17:42:18 -07:00
Linus Torvalds 69f1d1a6ac Merge branch 'next/devel' of ssh://master.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'next/devel' of ssh://master.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: (128 commits)
  ARM: S5P64X0: External Interrupt Support
  ARM: EXYNOS4: Enable MFC on Samsung NURI
  ARM: EXYNOS4: Enable MFC on universal_c210
  ARM: S5PV210: Enable MFC on Goni
  ARM: S5P: Add support for MFC device
  ARM: EXYNOS4: Add support FIMD on SMDKC210
  ARM: EXYNOS4: Add platform device and helper functions for FIMD
  ARM: EXYNOS4: Add resource definition for FIMD
  ARM: EXYNOS4: Change devname for FIMD clkdev
  ARM: SAMSUNG: Add IRQ_I2S0 definition
  ARM: SAMSUNG: Add platform device for idma
  ARM: EXYNOS4: Add more registers to be saved and restored for PM
  ARM: EXYNOS4: Add more register addresses of CMU
  ARM: EXYNOS4: Add platform device for dwmci driver
  ARM: EXYNOS4: configure rtc-s3c on NURI
  ARM: EXYNOS4: configure MAX8903 secondary charger on NURI
  ARM: EXYNOS4: configure ADC on NURI
  ARM: EXYNOS4: configure MAX17042 fuel gauge on NURI
  ARM: EXYNOS4: configure regulators and PMIC(MAX8997) on NURI
  ARM: EXYNOS4: Increase NR_IRQS for devices with more IRQs
  ...

Fix up tons of silly conflicts:
 - arch/arm/mach-davinci/include/mach/psc.h
 - arch/arm/mach-exynos4/Kconfig
 - arch/arm/mach-exynos4/mach-smdkc210.c
 - arch/arm/mach-exynos4/pm.c
 - arch/arm/mach-imx/mm-imx1.c
 - arch/arm/mach-imx/mm-imx21.c
 - arch/arm/mach-imx/mm-imx25.c
 - arch/arm/mach-imx/mm-imx27.c
 - arch/arm/mach-imx/mm-imx31.c
 - arch/arm/mach-imx/mm-imx35.c
 - arch/arm/mach-mx5/mm.c
 - arch/arm/mach-s5pv210/mach-goni.c
 - arch/arm/mm/Kconfig
2011-07-26 17:41:04 -07:00