If CONFIG_PM is not defined, then arch/arm/mach-at91/pm.c is not
compiled in. This patch creates an inline function that does nothing
if CONFIG_PM is not defined.
Signed-off-by: Brent Taylor <motobud@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Alias was missing for SoC of the at91sam9x5 familly that embed USART3.
Reported-by: Jiri Prchal <jiri.prchal@aksignal.cz>
[b.brezillon@overkiz.com: advised to place changes in at91sam9x5_usart3.dtsi]
Acked-by: Boris BREZILLON <b.brezillon@overkiz.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
With some devices, transfer hangs during I2C frame transmission. This issue
disappears when reducing the internal frequency of the TWI IP. Even if it is
indicated that internal clock max frequency is 66MHz, it seems we have
oversampling on I2C signals making TWI believe that a transfer in progress
is done.
This fix has no impact on the I2C bus frequency.
Cc: <stable@vger.kernel.org> #3.10+
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
When the probe of snd-hda-intel driver is deferred due to f/w loading
or the nested module loading, complete_all() should be also delayed
until the initialization really finished. Otherwise, vga-switcheroo
client would start switching before the actual init is done.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
It seems that EAPD on NID 0x16 is the only control over all outputs on
HP machines with AD1984A while turning EAPD on NID 0x12 breaks the
output. Thus we need to avoid fiddling EAPD on NID. As a quick
workaround, just set own_eapd_ctrl flag for the wrong EAPD, then
implement finer EAPD controls.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66321
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
N810 audio driver has stopped working at some point. Probably when
OMAP2 was converted to common clock framework since now call to clk_enable
dumps the stack trace in drivers/clk/clk.c: __clk_enable() due
clk->prepare_count is zero.
Fix this by converting clk_enable/_disable calls to those that take care
of clock prepare/unprepare.
I'm not queueing this to linux-stable since OMAP2 common clock framework
conversion in commit ed1ebc4948 ("ARM: OMAP2: clock: Convert to common clk")
happened before N810 was really usable in mainline and user base for N810 is
anyway small. Potential linux-stable candidates are only those after
commit 3d3a6d18ab ("watchdog: introduce retu_wdt driver").
Signed-off-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
platform_set_drvdata(op, pdata) in pcm030_fabric_probe()
will be overwrited when calling snd_soc_register_card(card),
but cm030_fabric_remove() use drvdata as a type of struct
pcm030_audio_data, so we should move platform_set_drvdata()
below snd_soc_register_card() call.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Mark Brown <broonie@linaro.org>
Originally snd_hrtimer_callback() used iprtd->period_time for
some jiffies based estimation to determine the right moment
to call snd_pcm_period_elapsed(). As timer drifts may well be a
problem, this was changed in commit b4e82b5b78 to be based
on buffer transmission progress, using iprtd->offset and
runtime->buffer_size to calculate the amount of data since last
period had elapsed.
Unfortunately, iprtd->offset counts in bytes, while
runtime->buffer_size counts frames, so adding these to find some
delta is like comparing apples and oranges, and eventually results
in negative delta values every now and then. This is no big harm,
because it simply causes snd_pcm_period_elapsed() being called
more often than necessary, as negative delta is taken for a
large unsigned value by implicit conversion rule.
Nonetheless, the calculation is broken, so one would replace
the runtime->buffer_size by its equivalent in bytes.
But then, there are chances snd_pcm_period_elapsed() is called
late, because calculating the moment for the elapsed period
into delta is based against the iprtd->last_offset, which is not
necessarily the first byte of the period in question, but some
random byte which the FIQ handler left us with in r8/r9 by
accident. Again, negative impact is low, as there are plenty of
periods already prefilled with data, and snd_pcm_period_elapsed()
will probably be called latest when the following period is
reached. However, the calculation is conceptually broken, and we
are best off removing the clever stuff altogether.
snd_pcm_period_elapsed() is now simply called once everytime
snd_hrtimer_callback() is run, which may not be most accurate,
but at least this way we are quite sure we dont miss an end of
period. There is not much extra effort wasted by superfluous
calls to snd_pcm_period_elapsed(), as the timer frequency
closely matches the period size anyway.
Signed-off-by: Oskar Schirmer <oskar@scara.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Remove waiting for TX queues to become empty during selftest.
This check is not necessary for any purpose, and might put
the driver into an infinite loop.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit a553e4a631 ("[PKTGEN]: IPSEC support")
tried to support IPsec ESP transport transformation for pktgen, but acctually
this doesn't work at all for two reasons(The orignal transformed packet has
bad IPv4 checksum value, as well as wrong auth value, reported by wireshark)
- After transpormation, IPv4 header total length needs update,
because encrypted payload's length is NOT same as that of plain text.
- After transformation, IPv4 checksum needs re-caculate because of payload
has been changed.
With this patch, armmed pktgen with below cofiguration, Wireshark is able to
decrypted ESP packet generated by pktgen without any IPv4 checksum error or
auth value error.
pgset "flag IPSEC"
pgset "flows 1"
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
receive mergeable now handles errors internally.
Do same for big and small packet paths, otherwise
the logic is too hard to follow.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet noticed that if we encounter an error
when processing a mergeable buffer, we don't
dequeue all of the buffers from this packet,
the result is almost sure to be loss of networking.
Jason Wang noticed that we also leak a page and that we don't decrement
the rq buf count, so we won't repost buffers (a resource leak).
Fix both issues.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael Dalton <mwdalton@google.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull UML fixes from Richard Weinberger:
"Fixes two regressions which got introduced this merge window"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Build always with -mcmodel=large on 64bit
um: Rename print_stack_trace to do_stack_trace
Pull ARM fixes from Russell King:
"Some ARM fixes, the biggest of which is the fix for the signal return
codes; this came up due to an interaction between the V7M nommu
changes and the BE8 changes. Dave Martin spotted that the kexec
trampoline wasn't being correctly copied (in a way which allows
Thumb-2 to work).
I've also fixed a number of breakages on footbridge platforms as I've
upgraded one of my machines to v3.12... one which had a 1200 day
uptime"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation
ARM: 7897/1: kexec: Use the right ISA for relocate_new_kernel
ARM: 7895/1: signal: fix armv7-m build issue in sigreturn_codes.S
ARM: footbridge: fix EBSA285 LEDs
ARM: footbridge: fix VGA initialisation
ARM: fix booting low-vectors machines
ARM: dma-mapping: check DMA mask against available memory
On UML SUBARCH can be x86, x86_64 and i386 and if it is x86
we use uname -m to select a defconfig.
Therefore we can no longer use -mcmodel=large only if SUBARCH
is x86_64.
Reported-and-tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We cannot use print_stack_trace because the name conflicts
with linux/stacktrace.h.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
A return value of 1 is interpreted as an error. See pci_driver.
in local_pci_probe(). If you're wondering how this ever could
have worked, it's because it used to be the case that only return
values less than zero were interpreted as failure. But even in
the current kernel if the driver registers its various entry
points with the kernel, and then returns a value which is
interpreted as failure, those registrations aren't undone, so
the driver still mostly works. However, the driver's remove
function wouldn't be called on rmmod, and pci power management
functions wouldn't work. In the case of Smart Array, since it
has a battery backed cache (or else no cache) even if the driver
is not shut down properly as long as there is no outstanding
i/o, nothing too bad happens, which is why it took so long to
notice.
Requesting backport to stable because the change to pci-driver.c
which requires driver probe functions to return 0 occurred between
2.6.35 and 2.6.36 (the pci power management breakage) and again
between 3.7 and 3.8 (pci_dev->driver getting set to NULL in
local_pci_probe() preventing driver remove function from being
called on rmmod.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Copying a function with memcpy() and then trying to execute the
result isn't trivially portable to Thumb.
This patch modifies the kexec soft restart code to copy its
assembler trampoline relocate_new_kernel() using fncpy() instead,
so that relocate_new_kernel can be in the same ISA as the rest of
the kernel without problems.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Reported-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Tested-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
After "ARM: signal: sigreturn_codes should be endian neutral to
work in BE8" commit, thumb only platforms, like armv7m, fails to
compile sigreturn_codes.S. The reason is that for such arch
values '.arm' directive and arm opcodes are not allowed.
Fix conditionally enables arm opcodes only if no CONFIG_CPU_THUMBONLY
defined and it uses .org instructions to keep sigreturn_codes
layout.
Suggested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- The LEDs register is write-only: it can't be read-modify-written.
- The LEDs are write-1-for-off not 0.
- The check for the platform was inverted.
Fixes: cf6856d693 ("ARM: mach-footbridge: retire custom LED code")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@vger.kernel.org
We inadvertantly discarded the scsi status for aborted commands.
For some commands (e.g. reads from tape drives) these can't be retried,
and if we discarded the scsi status, the scsi mid layer couldn't notice
anything was wrong and the error was not reported.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
"MAC filter" sounds more reasonable than "MAC fitler".
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When building a 64bit kernel sometimes functions in the .init section were not
able to reach the standard kernel function. Main reason for this problem is,
that the linkage tables (.plt, .opd, .dlt) tend to become pretty huge and thus
the distance gets too big for short calls.
One option to avoid this is to use the -mlong-calls compiler option, but this
increases the binary size and introduces a performance penalty.
Instead, with this patch we just lay out the binary differently. Init code is
stored first, followed by text, R/O and finally R/W data. This means, that init
and text code is now much closer to each other, which is sufficient to reach
each other by short calls.
Signed-off-by: Helge Deller <deller@gmx.de>
If architectures don't support SERIAL_PORT_DFNS, they need not define it
to "nothing", the related drivers need do it by themselves (e.g. 8250
serial driver).
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Sadly the correct names for machines which end with a question-mark aren't
known, so let's give it a best-guessed-name.
Signed-off-by: Helge Deller <deller@gmx.de>
locale-gen on Debian showed a strange problem on parisc:
mmap2(NULL, 536870912, PROT_NONE, MAP_SHARED, 3, 0) = 0x42a54000
mmap2(0x42a54000, 103860, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 3, 0) = -1 EINVAL (Invalid argument)
Basically it was just trying to re-mmap() a file at the same address
which it was given by a previous mmap() call. But this remapping failed
with EINVAL.
The problem is, that when MAP_FIXED and MAP_SHARED flags were used, we didn't
included the mapping-based offset when we verified the alignment of the given
fixed address against the offset which we calculated it in the previous call.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # 3.10+
Patch from developers of the alternative loss models, downloaded from:
http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG
"in case 2, of the switch we change the direction of the inequality to
net_random()>clg->a3, because clg->a3 is h in the GE model and when h
is 0 all packets will be lost."
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from developers of the alternative loss models, downloaded from:
http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG
"In the case 1 of the switch statement in the if conditions we
need to add clg->a4 to clg->a1, according to the model."
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a missing break statement in the Gilbert Elliot loss model
generator which makes state machine behave incorrectly.
Reported-by: Martin Burri <martin.burri@ch.abb.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This implements the rtnl_link_ops fill_info routine for HSR.
Signed-off-by: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 stats are 64 bits and thus are protected with a seqlock. By not
disabling bottom-half we could deadlock here if we don't disable bh and
a softirq reentrantly updates the same mib.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to igb, e1000 and ixgbe.
Akeem provides a igb fix where WOL was being reported as supported on
some ethernet devices which did not have that capability.
Yanjun provides a fix for e1000 which is similar to a previous fix
for e1000e commit bb9e44d0d0 ("e1000e: prevent oops when adapter is
being closed and reset simultaneously"), where the same issue was
observed on the older e1000 cards.
Vladimir Davydov provides 2 e1000 fixes. The first fixes a lockdep
warning e1000_down() tries to synchronously cancel e1000 auxiliary
works (reset_task, watchdog_task, phy_info_task and fifo_stall_task)
which take adapter->mutex in their handlers. The second patch is to
fix a possible race condition where reset_task() would be running
after adapter down.
John provides 2 fixes for ixgbe. First turns ixgbe_fwd_ring_down
to static and the second disables NETIF_F_HW_L2FW_DOFFLOAD by default
because it allows upper layer net devices to use queues in the hardware
to directly submit and receive skbs.
Mark Rustad provides a single patch for ixgbe to make
ixgbe_identify_qsfp_module_generic static to resolve compile
warnings.
v2: Drop igb patch "igb: Update queue reinit function to call dev_close
when init of queues fails" from Carolyn, so that the solution can
be re-worked based on feedback from David Miller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It's no good setting vga_base after the VGA console has been
initialised, because if we do that we get this:
Unable to handle kernel paging request at virtual address 000b8000
pgd = c0004000
[000b8000] *pgd=07ffc831, *pte=00000000, *ppte=00000000
0Internal error: Oops: 5017 [#1] ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.0+ #49
task: c03e2974 ti: c03d8000 task.ti: c03d8000
PC is at vgacon_startup+0x258/0x39c
LR is at request_resource+0x10/0x1c
pc : [<c01725d0>] lr : [<c0022b50>] psr: 60000053
sp : c03d9f68 ip : 000b8000 fp : c03d9f8c
r10: 000055aa r9 : 4401a103 r8 : ffffaa55
r7 : c03e357c r6 : c051b460 r5 : 000000ff r4 : 000c0000
r3 : 000b8000 r2 : c03e0514 r1 : 00000000 r0 : c0304971
Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment kernel
which is an access to the 0xb8000 without the PCI offset required to
make it work.
Fixes: cc22b4c185 ("ARM: set vga memory base at run-time")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
Commit f6f91b0d9f (ARM: allow kuser helpers to be removed from the
vector page) required two pages for the vectors code. Although the
code setting up the initial page tables was updated, the code which
allocates page tables for new processes wasn't, neither was the code
which tears down the mappings. Fix this.
Fixes: f6f91b0d9f ("ARM: allow kuser helpers to be removed from the vector page")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
Some buses have negative offsets, which causes the DMA mask checks to
falsely fail. Fix this by using the actual amount of memory fitted in
the system.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Correct a namespace complaint by making the function static
and moving the prototype into the .c file.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
NETIF_F_HW_L2FW_DOFFLOAD allows upper layer net devices such
as macvlan to use queues in the hardware to directly submit and
receive skbs.
This creates a subtle change in the datapath though. One change
being the skb may no longer use the root devices qdisc.
Because users may not expect this we can't enable the feature
by default unless the hardware can offload all the software
functionality above it. So for now disable it by default and
let users opt in.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When compiling with -Wstrict-prototypes gcc catches a static
I missed.
./ixgbe_main.c:4254: warning: no previous prototype for 'ixgbe_fwd_ring_down'
Reported-by: Phillip Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On e1000_down(), we should ensure every asynchronous work is canceled
before proceeding. Since the watchdog_task can schedule other works
apart from itself, it should be stopped first, but currently it is
stopped after the reset_task. This can result in the following race
leading to the reset_task running after the module unload:
e1000_down_and_stop(): e1000_watchdog():
---------------------- -----------------
cancel_work_sync(reset_task)
schedule_work(reset_task)
cancel_delayed_work_sync(watchdog_task)
The patch moves cancel_delayed_work_sync(watchdog_task) at the beginning
of e1000_down_and_stop() thus ensuring the race is impossible.
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The patch fixes the following lockdep warning, which is 100%
reproducible on network restart:
======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0+ #47 Tainted: GF
-------------------------------------------------------
kworker/1:1/27 is trying to acquire lock:
((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70
but task is already holding lock:
(&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&adapter->mutex){+.+...}:
[<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
[<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390
[<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000]
[<ffffffff8108b972>] process_one_work+0x1d2/0x510
[<ffffffff8108ca80>] worker_thread+0x120/0x3a0
[<ffffffff81092c1e>] kthread+0xee/0x110
[<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
-> #0 ((&(&adapter->watchdog_task)->work)){+.+...}:
[<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
[<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
[<ffffffff8108a5eb>] flush_work+0x3b/0x70
[<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
[<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
[<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
[<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
[<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
[<ffffffff8108b972>] process_one_work+0x1d2/0x510
[<ffffffff8108ca80>] worker_thread+0x120/0x3a0
[<ffffffff81092c1e>] kthread+0xee/0x110
[<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&adapter->mutex);
lock((&(&adapter->watchdog_task)->work));
lock(&adapter->mutex);
lock((&(&adapter->watchdog_task)->work));
*** DEADLOCK ***
3 locks held by kworker/1:1/27:
#0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
#1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
#2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]
stack backtrace:
CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ #47
Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007
Workqueue: events e1000_reset_task [e1000]
ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002
ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8
ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00
Call Trace:
[<ffffffff816b54a2>] dump_stack+0x49/0x5f
[<ffffffff810ba936>] print_circular_bug+0x216/0x310
[<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
[<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
[<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
[<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
[<ffffffff8108a5eb>] flush_work+0x3b/0x70
[<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
[<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
[<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
[<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
[<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
[<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
[<ffffffff8108b972>] process_one_work+0x1d2/0x510
[<ffffffff8108b906>] ? process_one_work+0x166/0x510
[<ffffffff8108ca80>] worker_thread+0x120/0x3a0
[<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0
[<ffffffff81092c1e>] kthread+0xee/0x110
[<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
[<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
[<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
== The issue background ==
The problem occurs, because e1000_down(), which is called under
adapter->mutex by e1000_reset_task(), tries to synchronously cancel
e1000 auxiliary works (reset_task, watchdog_task, phy_info_task,
fifo_stall_task), which take adapter->mutex in their handlers. So the
question is what does adapter->mutex protect there?
The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to
private mutex from rtnl") as a replacement for rtnl_lock() taken in the
asynchronous handlers. It targeted on fixing a similar lockdep warning
issued when e1000_down() was called under rtnl_lock(), and it fixed it,
but unfortunately it introduced the lockdep warning described above.
Anyway, that said the source of this bug is that the asynchronous works
were made to take rtnl_lock() some time ago, so let's look deeper and
find why it was added there.
The rtnl_lock() was added to asynchronous handlers by commit 338c15
("e1000: fix occasional panic on unload") in order to prevent
asynchronous handlers from execution after the module is unloaded
(e1000_down() is called) as it follows from the comment to the commit:
> Net drivers in general have an issue where timers fired
> by mod_timer or work threads with schedule_work are running
> outside of the rtnl_lock.
>
> With no other lock protection these routines are vulnerable
> to races with driver unload or reset paths.
>
> The longer term solution to this might be a redesign with
> safer locks being taken in the driver to guarantee no
> reentrance, but for now a safe and effective fix is
> to take the rtnl_lock in these routines.
I'm not sure if this locking scheme fixed the problem or just made it
unlikely, although I incline to the latter. Anyway, this was long time
ago when e1000 auxiliary works were implemented as timers scheduling
real work handlers in their routines. The e1000_down() function only
canceled the timers, but left the real handlers running if they were
running, which could result in work execution after module unload.
Today, the e1000 driver uses sane delayed works instead of the pair
timer+work to implement its delayed asynchronous handlers, and the
e1000_down() synchronously cancels all the works so that the problem
that commit 338c15 tried to cope with disappeared, and we don't need any
locks in the handlers any more. Moreover, any locking there can
potentially result in a deadlock.
So, this patch reverts commits 0ef4ee and 338c15.
Fixes: 0ef4eedc2e ("e1000: convert to private mutex from rtnl")
Fixes: 338c15e470 ("e1000: fix occasional panic on unload")
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is based on a similar change made to e1000e support in
commit bb9e44d0d0 ("e1000e: prevent oops when adapter is being closed
and reset simultaneously"). The same issue has also been observed
on the older e1000 cards.
Here, we have increased the RESET_COUNT value to 50 because there are too
many accesses to e1000 nic on stress tests to e1000 nic, it is not enough
to set RESET_COUT 25. Experimentation has shown that it is enough to set
RESET_COUNT 50.
Signed-off-by: yzhu1 <yanjun.zhu@windriver.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes Wake on LAN being reported as supported on some Ethernet
ports, in contrary to Hardware capability.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>