When add_memory() fails the following BUG is observed:
[ 743.646107] hv_balloon: hot_add memory failed error is -17
[ 743.679973]
[ 743.680930] =====================================
[ 743.680930] [ BUG: bad unlock balance detected! ]
[ 743.680930] 3.19.0-rc5_bug1131426+ #552 Not tainted
[ 743.680930] -------------------------------------
[ 743.680930] kworker/0:2/255 is trying to release lock (&dm_device.ha_region_mutex) at:
[ 743.680930] [<ffffffff81aae5fe>] mutex_unlock+0xe/0x10
[ 743.680930] but there are no more locks to release!
This happens as we don't acquire ha_region_mutex and hot_add_req() expects us
to as it does unconditional mutex_unlock(). Acquire the lock on the error path.
The current char-next branch is broken and this patch fixes
the bug.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch is a continuation of the rescind handling cleanup work. We cannot
block in the global message handling work context especially if we are blocking
waiting for the host to wake us up. I would like to thank
Dexuan Cui <decui@microsoft.com> for observing this problem.
The current char-next branch is broken and this patch fixes
the bug.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
drivers/hv/vmbus_drv.c:51:5: sparse: symbol 'hyperv_panic_event' was not declared. Should it be static?
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement an API that gives additional control on the what VMBUS flags will be
set as well as if the host needs to be signalled. This API will be
useful for clients that want to batch up requests to the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement an API for sending pagebuffers that gives more control to the client
in terms of setting the vmbus flags as well as deciding when to
notify the host. This will be useful for enabling batch processing.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current algorithm for picking an outgoing channel was not distributing
the load well. Implement a simple round-robin scheme to ensure good
distribution of the outgoing traffic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hyper-V allows a guest to notify the Hyper-V host that a panic
condition occured. This notification can include up to five 64
bit values. These 64 bit values are written into crash MSRs.
Once the data has been written into the crash MSRs, the host is
then notified by writing into a Crash Control MSR. On the Hyper-V
host, the panic notification data is captured in the Windows Event
log as a 18590 event.
Crash MSRs are defined in appendix H of the Hypervisor Top Level
Functional Specification. At the time of this patch, v4.0 is the
current functional spec. The URL for the v4.0 document is:
http://download.microsoft.com/download/A/B/4/AB43A34E-BDD0-4FA6-BDEF-79EEF16E880B/Hypervisor Top Level Functional Specification v4.0.docx
Signed-off-by: Nick Meier <nmeier@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When host asks us to balloon up we need to be sure we're not committing suicide
by overballooning. Use already existent 'floor' metric as our lowest possible
value for free ram.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When hot-added memory pages are not brought online or when some memory blocks
are sent offline the subsequent ballooning process kills the guest with OOM
killer. This happens as we don't report these pages as neither used nor free
and apparently host algorithm considers them as being unused. Keep track of
all online/offline operations and report all currently offline pages as being
used so host won't try to balloon them out.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When many memory regions are being added and automatically onlined the
following lockup is sometimes observed:
INFO: task udevd:1872 blocked for more than 120 seconds.
...
Call Trace:
[<ffffffff816ec0bc>] schedule_timeout+0x22c/0x350
[<ffffffff816eb98f>] wait_for_common+0x10f/0x160
[<ffffffff81067650>] ? default_wake_function+0x0/0x20
[<ffffffff816eb9fd>] wait_for_completion+0x1d/0x20
[<ffffffff8144cb9c>] hv_memory_notifier+0xdc/0x120
[<ffffffff816f298c>] notifier_call_chain+0x4c/0x70
...
When several memory blocks are going online simultaneously we got several
hv_memory_notifier() trying to acquire the ha_region_mutex. When this mutex is
being held by hot_add_req() all these competing acquire_region_mutex() do
mutex_trylock, fail, and queue themselves into wait_for_completion(..). However
when we do complete() from release_region_mutex() only one of them wakes up.
This could be solved by changing complete() -> complete_all() memory onlining
can be delayed as well, in that case we can still get several
hv_memory_notifier() runners at the same time trying to grab the mutex.
Only one of them will succeed and the others will hang for forever as
complete() is not being called. We don't see this issue often because we have
5sec onlining timeout in hv_mem_hot_add() and usually all udev events arrive
in this time frame.
Get rid of the trylock path, waiting on the mutex is supposed to provide the
required serialization.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we log messages when either we are not able to map an ID to a
channel or when the channel does not have a callback associated
(in the channel interrupt handling path). These messages don't add
any value, get rid of them.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the offer is rescinded, vmbus_close() can free up the channel;
deinitialize the service before closing the channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Properly rollback state in vmbus_pocess_offer() in the failure paths.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Execute both ressind and offer messages in the same work context. This serializes these
operations and naturally addresses the various corner cases.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In response to a rescind message, we need to remove the channel and the
corresponding device. Cleanup this code path by factoring out the code
to remove a channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Handle the case when the device may be removed when the device has no driver
attached to it.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NetworkDirect is a service that supports guest RDMA.
Define the GUID for this service.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Correctly rollback state if the failure occurs after we have handed over
the ownership of the buffer to the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
return type of wait_for_completion_timeout is unsigned long not int, this
patch changes the type of t from int to unsigned long.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
return type of wait_for_completion_timeout is unsigned long not int, this
patch changes the type of t from int to unsigned long.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
return type of wait_for_completion_timeout is unsigned long not int, this
patch changes the type of t from int to unsigned long.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without this patch, the state is put to CHANNEL_OPENING_STATE, and when
the driver is loaded next time, vmbus_open() will fail immediately due to
newchannel->state != CHANNEL_OPEN_STATE.
CC: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I got HV_STATUS_INVALID_CONNECTION_ID on Hyper-V 2008 R2 when keeping running
"rmmod hv_netvsc; modprobe hv_netvsc; rmmod hv_utils; modprobe hv_utils"
in a Linux guest. Looks the host has some kind of throttling mechanism if
some kinds of hypercalls are sent too frequently.
Without the patch, the driver can occasionally fail to load.
Also let's retry HV_STATUS_INSUFFICIENT_MEMORY, though we didn't get it
before.
Removed 'case -ENOMEM', since the hypervisor doesn't return this.
CC: "K. Y. Srinivasan" <kys@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before the line vmbus_open() returns, srv->util_cb can be already running
and the variables, like util_fw_version, are needed by the srv->util_cb.
So we have to make sure the variables are initialized before the vmbus_open().
CC: "K. Y. Srinivasan" <kys@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Newly introduced clockevent devices made it impossible to unload hv_vmbus
module as clockevents_config_and_register() takes additional reverence to
the module. To make it possible again we do the following:
- avoid setting dev->owner for clockevent devices;
- implement hv_synic_clockevents_cleanup() doing clockevents_unbind_device();
- call it from vmbus_exit().
In theory hv_synic_clockevents_cleanup() can be merged with hv_synic_cleanup(),
however, we call hv_synic_cleanup() from smp_call_function_single() and this
doesn't work for clockevents_unbind_device() as it does such call on its own. I
opted for a separate function.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SynIC has to be switched off when we unload the module, otherwise registered
memory pages can get corrupted after (as Hyper-V host still writes there) and
we see the following crashes for random processes:
[ 89.116774] BUG: Bad page map in process sh pte:4989c716 pmd:36f81067
[ 89.159454] addr:0000000000437000 vm_flags:00000875 anon_vma: (null) mapping:ffff88007bba55a0 index:37
[ 89.226146] vma->vm_ops->fault: filemap_fault+0x0/0x410
[ 89.257776] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[ 89.297570] CPU: 0 PID: 215 Comm: sh Tainted: G B 3.19.0-rc5_bug923184+ #488
[ 89.353738] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 89.409138] 0000000000000000 000000004e083d7b ffff880036e9fa18 ffffffff81a68d31
[ 89.468724] 0000000000000000 0000000000437000 ffff880036e9fa68 ffffffff811a1e3a
[ 89.519233] 000000004989c716 0000000000000037 ffffea0001edc340 0000000000437000
[ 89.575751] Call Trace:
[ 89.591060] [<ffffffff81a68d31>] dump_stack+0x45/0x57
[ 89.625164] [<ffffffff811a1e3a>] print_bad_pte+0x1aa/0x250
[ 89.667234] [<ffffffff811a2c95>] vm_normal_page+0x55/0xa0
[ 89.703818] [<ffffffff811a3105>] unmap_page_range+0x425/0x8a0
[ 89.737982] [<ffffffff811a3601>] unmap_single_vma+0x81/0xf0
[ 89.780385] [<ffffffff81184320>] ? lru_deactivate_fn+0x190/0x190
[ 89.820130] [<ffffffff811a4131>] unmap_vmas+0x51/0xa0
[ 89.860168] [<ffffffff811ad12c>] exit_mmap+0xac/0x1a0
[ 89.890588] [<ffffffff810763c3>] mmput+0x63/0x100
[ 89.919205] [<ffffffff811eba48>] flush_old_exec+0x3f8/0x8b0
[ 89.962135] [<ffffffff8123b5bb>] load_elf_binary+0x32b/0x1260
[ 89.998581] [<ffffffff811a14f2>] ? get_user_pages+0x52/0x60
hv_synic_cleanup() function exists but noone calls it now. Do the following:
- call hv_synic_cleanup() on each cpu from vmbus_exit();
- write global disable bit through MSR;
- use hv_synic_free_cpu() to avoid memory leask and code duplication.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:
[ 76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[ 82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[ 82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[ 82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[ 82.599287] Oops: 0010 [#1] SMP
[ 82.599287] Modules linked in: [last unloaded: hv_vmbus]
[ 82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[ 82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[ 82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[ 82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[ 82.599287] RIP: 0010:[<ffffffffa0003480>] [<ffffffffa0003480>] 0xffffffffa0003480
[ 82.599287] RSP: 0018:ffff88007f8fbe00 EFLAGS: 00010202
...
To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement vmbus_disconnect() function by separating cleanup
path from vmbus_connect().
As we use hv_vmbus_con to release channels (see free_channel() in channel_mgmt.c)
we need to make sure the work was done before we remove the queue, do that with
drain_workqueue(). We also need to avoid handling messages which can (potentially)
create new channels, so set vmbus_connection.conn_state = DISCONNECTED at the very
beginning of vmbus_exit() and check for that in vmbus_onmessage_work().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On driver shutdown device_obj is being freed twice:
1) In vmbus_free_channels()
2) vmbus_device_release() (which is being triggered by device_unregister() in
vmbus_device_unregister().
This double kfree leads to the following sporadic crash on driver unload:
[ 23.469876] general protection fault: 0000 [#1] SMP
[ 23.470036] Modules linked in: hv_vmbus(-)
[ 23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[ 23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[ 23.470036] RIP: 0010:[<ffffffff811d2e1b>] [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[ 23.470036] RSP: 0018:ffff880036cebcc8 EFLAGS: 00010246
...
When this crash does not happen on driver unload the similar one is expected if
we try to load hv_vmbus again.
Remove kfree from vmbus_free_channels() as freeing it from
vmbus_device_release() seems right.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All channel work queues are named 'hv_vmbus_ctl', this makes them
indistinguishable in ps output and makes it hard to link to the corresponding
vmbus device. Rename them to hv_vmbus_ctl/N and make vmbus device names match,
e.g. now vmbus_1 device is served by hv_vmbus_ctl/1 work queue.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an SMP Hyper-V guest is running on top of 2012R2 Server and secondary
cpus are sent offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online)
the system freeze is observed. This happens due to the fact that on newer
hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
when vmbus is loaded until the issue is fixed host-side.
This patch also disables hibernation but it is OK as it is also broken (MCE
error is hit on resume). Suspend still works.
Tested with WS2008R2 and WS2012R2.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's the big char/misc driver update for 3.20-rc1.
Lots of little things in here, all described in the changelog. Nothing
major or unusual, except maybe the binder selinux stuff, which was all
acked by the proper selinux people and they thought it best to come
through this tree.
All of this has been in linux-next with no reported issues for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlTgs80ACgkQMUfUDdst+yn86gCeMLbxANGExVLd+PR46GNsAUQb
SJ4AmgIqrkIz+5LCwZWM02ldbYhPeBVf
=lfmM
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc patches from Greg KH:
"Here's the big char/misc driver update for 3.20-rc1.
Lots of little things in here, all described in the changelog.
Nothing major or unusual, except maybe the binder selinux stuff, which
was all acked by the proper selinux people and they thought it best to
come through this tree.
All of this has been in linux-next with no reported issues for a while"
* tag 'char-misc-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (90 commits)
coresight: fix function etm_writel_cp14() parameter order
coresight-etm: remove check for unknown Kconfig macro
coresight: fixing CPU hwid lookup in device tree
coresight: remove the unnecessary function coresight_is_bit_set()
coresight: fix the debug AMBA bus name
coresight: remove the extra spaces
coresight: fix the link between orphan connection and newly added device
coresight: remove the unnecessary replicator property
coresight: fix the replicator subtype value
pdfdocs: Fix 'make pdfdocs' failure for 'uio-howto.tmpl'
mcb: Fix error path of mcb_pci_probe
virtio/console: verify device has config space
ti-st: clean up data types (fix harmless memory corruption)
mei: me: release hw from reset only during the reset flow
mei: mask interrupt set bit on clean reset bit
extcon: max77693: Constify struct regmap_config
extcon: adc-jack: Release IIO channel on driver remove
extcon: Remove duplicated include from extcon-class.c
Drivers: hv: vmbus: hv_process_timer_expiration() can be static
Drivers: hv: vmbus: serialize Offer and Rescind offer
...
struct acpi_resource_address and struct acpi_resource_extended_address64 share substracts
just at different offsets. To unify the parsing functions, OSPMs like Linux
need a new ACPI_ADDRESS64_ATTRIBUTE as their substructs, so they can
extract the shared data.
This patch also synchronizes the structure changes to the Linux kernel.
The usages are searched by matching the following keywords:
1. acpi_resource_address
2. acpi_resource_extended_address
3. ACPI_RESOURCE_TYPE_ADDRESS
4. ACPI_RESOURCE_TYPE_EXTENDED_ADDRESS
And we found and fixed the usages in the following files:
arch/ia64/kernel/acpi-ext.c
arch/ia64/pci/pci.c
arch/x86/pci/acpi.c
arch/x86/pci/mmconfig-shared.c
drivers/xen/xen-acpi-memhotplug.c
drivers/acpi/acpi_memhotplug.c
drivers/acpi/pci_root.c
drivers/acpi/resource.c
drivers/char/hpet.c
drivers/pnp/pnpacpi/rsparser.c
drivers/hv/vmbus_drv.c
Build tests are passed with defconfig/allnoconfig/allyesconfig and
defconfig+CONFIG_ACPI=n.
Original-by: Thomas Gleixner <tglx@linutronix.de>
Original-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
drivers/hv/vmbus_drv.c:582:6: sparse: symbol 'hv_process_timer_expiration' was not declared. Should it be static?
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 4b2f9abea5 ("staging: hv: convert channel_mgmt.c to not call
osd_schedule_callback")' was written under an assumption that we never receive
Rescind offer while we're still processing the initial Offer request. However,
the issue we fixed in 04a258c162 could be caused by this assumption not
always being true.
In particular, we need to protect against the following:
1) Receiving a Rescind offer after we do queue_work() for processing an Offer
request and before we actually enter vmbus_process_offer(). work.func points
to vmbus_process_offer() at this moment and in vmbus_onoffer_rescind() we do
another queue_work() without a check so we'll enter vmbus_process_offer()
twice.
2) Receiving a Rescind offer after we enter vmbus_process_offer() and
especially after we set >state = CHANNEL_OPEN_STATE. Many things can go
wrong in that case, e.g. we can call free_channel() while we're still using
it.
Implement the required protection by changing work->func at the very end of
vmbus_process_offer() and checking work->func in vmbus_onoffer_rescind(). In
case we receive rescind offer during or before vmbus_process_offer() is done
we set rescind flag to true and we check it at the end of vmbus_process_offer()
so such offer will not get lost.
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sc_lock spinlock in struct vmbus_channel is being used to not only protect the
sc_list field, e.g. vmbus_open() function uses it to implement test-and-set
access to the state field. Rename it to the more generic 'lock' and add the
description.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vmbus_device_create() result is not being checked in vmbus_process_offer() and
it can fail if kzalloc() fails. Add the check and do minor cleanup to avoid
additional duplication of "free_channel(); return;" block.
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case the user-space daemon crashes, hangs or is killed, we
need to down the semaphore, otherwise, after the daemon starts next
time, the obsolete data in fcopy_transaction.message or
fcopy_transaction.fcopy_msg will be used immediately.
Cc: Jason Wang <jasowang@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, the API for sending a multi-page buffer over VMBUS is limited to
a maximum pfn array of MAX_MULTIPAGE_BUFFER_COUNT. This limitation is
not imposed by the host and unnecessarily limits the maximum payload
that can be sent. Implement an API that does not have this restriction.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Correctly compute the local (gpadl) handle.
I would like to thank Michael Brown <mcb30@ipxe.org> for seeing this bug.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement a clockevent device based on the timer support available on
Hyper-V.
In this version of the patch I have addressed Jason's review comments.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We currently release memory (balloon down) in the interrupt context and we also
post memory status while releasing memory. Rather than posting the status
in the interrupt context, wakeup the status posting thread to post the status.
This will address the inconsistent lock state that Sitsofe Wheeler <sitsofe@gmail.com>
reported:
http://lkml.iu.edu/hypermail/linux/kernel/1411.1/00075.html
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Sitsofe Wheeler <sitsofe@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We support memory hot-add in the Hyper-V balloon driver by hot adding an appropriately
sized and aligned region and controlling the on-lining of pages within that region
based on the pages that the host wants us to online. We do this because the
granularity and alignment requirements in Linux are different from what Windows
expects. The state to manage the onlining of pages needs to be correctly
protected. Fix this bug.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make adjustments in computing the balloon floor. The current computation
of the balloon floor was not appropriate for virtual machines with more than
10 GB of assigned memory - we would get into situations where the host would
agressively balloon down the guest and leave the guest in an unusable state.
This patch fixes the issue by raising the floor.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Replace calls for smp_processor_id() to get_cpu() to get the CPU ID of
the current CPU. In these instances, there is no correctness issue with
regards to preemption, we just need the current CPU ID.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's the big char/misc driver update for 3.19-rc1
Lots of little things all over the place in different drivers, and a new
subsystem, "coresight" has been added. Full details are in the
shortlog.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlSODosACgkQMUfUDdst+ykSNwCfcqx1Z3rQzbLwSrR2sa1fV3Zb
yEAAniJoLZ4ZkoQK4/1ozsFc31q+gXNm
=/epr
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here's the big char/misc driver update for 3.19-rc1
Lots of little things all over the place in different drivers, and a
new subsystem, "coresight" has been added. Full details are in the
shortlog"
* tag 'char-misc-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (73 commits)
parport: parport_pc, do not remove parent devices early
spmi: Remove shutdown/suspend/resume kernel-doc
carma-fpga-program: drop videobuf dependency
carma-fpga: drop videobuf dependency
carma-fpga-program.c: fix compile errors
i8k: Fix temperature bug handling in i8k_get_temp()
cxl: Name interrupts in /proc/interrupt
CXL: Return error to PSL if IRQ demultiplexing fails & print clearer warning
coresight-replicator: remove .owner field for driver
coresight: fixed comments in coresight.h
coresight: fix typo in comment in coresight-priv.h
coresight: bindings for coresight drivers
coresight: Adding ABI documentation
w1: support auto-load of w1_bq27000 module.
w1: avoid potential u16 overflow
cn: verify msg->len before making callback
mei: export fw status registers through sysfs
mei: read and print all six FW status registers
mei: txe: add cherrytrail device id
mei: kill cached host and me csr values
...
This patch adds proper handling of the vNIC hot removal event, which includes
a rescind-channel-offer message from the host side that triggers vNIC close and
removal. In this case, the notices to the host during close and removal is not
necessary because the channel is rescinded. This patch blocks these unnecessary
messages, and lets vNIC removal process complete normally.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If num_ballooned is not 0, we shouldn't neglect the
already-partially-allocated 2MB memory block(s).
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we fail to send a message to userspace daemon with cn_netlink_send()
there is no need to wait for userspace to reply as it is not going to
happen. This happens when kvp or vss daemon is stopped after a successful
handshake. Report HV_E_FAIL immediately and cancel the timeout job so
host won't receive two failures.
Use pr_warn() for VSS and pr_debug() for KVP deliberately as VSS request
are rare and result in a failed backup. KVP requests are much more frequent
after a successful handshake so avoid flooding logs. It would be nice to
have an ability to de-negotiate with the host in case userspace daemon gets
disconnected so we won't receive new requests. But I'm not sure it is
possible.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In contrast with KVP there is no timeout when communicating with
userspace VSS daemon. In case it gets stuck performing freeze/thaw
operation no message will be sent to the host so it will take very
long (around 10 minutes) before backup fails. Introduce 10 second
timeout using schedule_delayed_work().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When build with Debug the following crash is sometimes observed:
Call Trace:
[<ffffffff812b9600>] string+0x40/0x100
[<ffffffff812bb038>] vsnprintf+0x218/0x5e0
[<ffffffff810baf7d>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff812bb4c1>] vscnprintf+0x11/0x30
[<ffffffff8107a2f0>] vprintk+0xd0/0x5c0
[<ffffffffa0051ea0>] ? vmbus_process_rescind_offer+0x0/0x110 [hv_vmbus]
[<ffffffff8155c71c>] printk+0x41/0x45
[<ffffffffa004ebac>] vmbus_device_unregister+0x2c/0x40 [hv_vmbus]
[<ffffffffa0051ecb>] vmbus_process_rescind_offer+0x2b/0x110 [hv_vmbus]
...
This happens due to the following race: between 'if (channel->device_obj)' check
in vmbus_process_rescind_offer() and pr_debug() in vmbus_device_unregister() the
device can disappear. Fix the issue by taking an additional reference to the
device before proceeding to vmbus_device_unregister().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In win8 we have a feature that allows for interrupt driven flow management
for host/guest communication. For instance, if the host were blocked because
there was no space available in the ringbuffer, the host could request that the
guest send an interrupt when space becomes available in the ringbuffer (when
the guest drains the ringbuffer).
While this feature was implemented in the guest a while ago, we had not
advertised that the guest supported this feature. This patch advertises
the support to the host.
For pre-win8 hosts, this has no effect since the size of the ringbuffer
control structure has not changed and all changes have been backward
compatible - unused/reserved space has been used to implement this
feature.
In this version of the patch I have cleaned up the commit log based on
feedback from Greg KH.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Disable preemption when sampling current processor ID when preemption
is otherwise possible.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Minimize failures in this function by pre-allocating the buffer
for posting messages. The hypercall for posting the message can fail
for a number of reasons:
1. Transient resource related issues
2. Buffer alignment
3. Buffer cannot span a page boundry
We address issues 2 and 3 by preallocating a per-cpu page for the buffer.
Transient resource related failures are handled by retrying by the callers
of this function.
This patch is based on the investigation
done by Dexuan Cui <decui@microsoft.com>.
I would like to thank Sitsofe Wheeler <sitsofe@yahoo.com>
for reporting the issue and helping in debuggging.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eliminate calls to BUG_ON() in vmbus_close_internal().
We have chosen to potentially leak memory, than crash the guest
in case of failures.
In this version of the patch I have addressed comments from
Dan Carpenter (dan.carpenter@oracle.com).
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix a bug in vmbus_open() and properly propagate the error. I would
like to thank Dexuan Cui <decui@microsoft.com> for identifying the
issue.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eliminate the call to BUG_ON() by waiting for the host to respond. We are
trying to reclaim the ownership of memory that was given to the host and so
we will have to wait until the host responds.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eliminate calls to BUG_ON() by properly handling errors. In cases where
rollback is possible, we will return the appropriate error to have the
calling code decide how to rollback state. In the case where we are
transferring ownership of the guest physical pages to the host,
we will wait for the host to respond.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Posting messages to the host can fail because of transient resource
related failures. Correctly deal with these failures and increase the
number of attempts to post the message before giving up.
In this version of the patch, I have normalized the error code to
Linux error code.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's the big driver misc / char pull request for 3.17-rc1.
Lots of things in here, the thunderbolt support for Apple laptops, some
other new drivers, testing fixes, and other good things. All have been
in linux-next for a long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlPf1LcACgkQMUfUDdst+ymaVwCgqMrKFmpduBufOSFROhxlfB5Q
ajsAoNDmIn3pgla+kj23Y5ib20aMi++s
=IdIr
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver patches from Greg KH:
"Here's the big driver misc / char pull request for 3.17-rc1.
Lots of things in here, the thunderbolt support for Apple laptops,
some other new drivers, testing fixes, and other good things. All
have been in linux-next for a long time"
* tag 'char-misc-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (119 commits)
misc: bh1780: Introduce the use of devm_kzalloc
Lattice ECP3 FPGA: Correct endianness
drivers/misc/ti-st: Load firmware from ti-connectivity directory.
dt-bindings: extcon: Add support for SM5502 MUIC device
extcon: sm5502: Change internal hardware switch according to cable type
extcon: sm5502: Detect cable state after completing platform booting
extcon: sm5502: Add support new SM5502 extcon device driver
extcon: arizona: Get MICVDD against extcon device
extcon: Remove unnecessary OOM messages
misc: vexpress: Fix sparse non static symbol warnings
mei: drop unused hw dependent fw status functions
misc: bh1770glc: Use managed functions
pcmcia: remove DEFINE_PCI_DEVICE_TABLE usage
misc: remove DEFINE_PCI_DEVICE_TABLE usage
ipack: Replace DEFINE_PCI_DEVICE_TABLE macro use
drivers/char/dsp56k.c: drop check for negativity of unsigned parameter
mei: fix return value on disconnect timeout
mei: don't schedule suspend in pm idle
mei: start disconnect request timer consistently
mei: reset client connection state on timeout
...
We should schedule the 5s "timer work" before starting the data transfer,
otherwise, the data transfer code may finish so fast on another
virtual cpu that when the code(fcopy_write()) trying to cancel the 5s
"timer work" can occasionally fail because the "timer work" may haven't
been scheduled yet and as a result the fcopy process will be aborted
wrongly by fcopy_work_func() in 5s.
Thank Liz Zhang <lizzha@microsoft.com> for the initial investigation
on the bug.
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1118123
Tested-by: Liz Zhang <lizzha@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This resolves a number of merge issues with changes in this tree and
Linus's tree at the same time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add code to poll the channel since we process only one message
at a time and the host may not interrupt us. Also increase the
receive buffer size since some KVP messages are close to 8K bytes in size.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting with Win8, we have implemented several optimizations to improve the
scalability and performance of the VMBUS transport between the Host and the
Guest. Some of the non-performance critical services cannot leverage these
optimization since they only read and process one message at a time.
Make adjustments to the callback dispatch code to account for the way
non-performance critical drivers handle reading of the channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All its callers depends on the return value of -ENOBUFS to reallocate a
bigger buffer and retry the receiving. So there's no need to call
pr_err() here since it was not a real issue, otherwise syslog will be
flooded by this false warning.
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull networking updates from David Miller:
1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.
2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
Benniston.
3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
Mork.
4) BPF now has a "random" opcode, from Chema Gonzalez.
5) Add more BPF documentation and improve test framework, from Daniel
Borkmann.
6) Support TCP fastopen over ipv6, from Daniel Lee.
7) Add software TSO helper functions and use them to support software
TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia.
8) Support software TSO in fec driver too, from Nimrod Andy.
9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.
10) Handle broadcasts more gracefully over macvlan when there are large
numbers of interfaces configured, from Herbert Xu.
11) Allow more control over fwmark used for non-socket based responses,
from Lorenzo Colitti.
12) Do TCP congestion window limiting based upon measurements, from Neal
Cardwell.
13) Support busy polling in SCTP, from Neal Horman.
14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.
15) Bridge promisc mode handling improvements from Vlad Yasevich.
16) Don't use inetpeer entries to implement ID generation any more, it
performs poorly, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
tcp: fixing TLP's FIN recovery
net: fec: Add software TSO support
net: fec: Add Scatter/gather support
net: fec: Increase buffer descriptor entry number
net: fec: Factorize feature setting
net: fec: Enable IP header hardware checksum
net: fec: Factorize the .xmit transmit function
bridge: fix compile error when compiling without IPv6 support
bridge: fix smatch warning / potential null pointer dereference
via-rhine: fix full-duplex with autoneg disable
bnx2x: Enlarge the dorq threshold for VFs
bnx2x: Check for UNDI in uncommon branch
bnx2x: Fix 1G-baseT link
bnx2x: Fix link for KR with swapped polarity lane
sctp: Fix sk_ack_backlog wrap-around problem
net/core: Add VF link state control policy
net/fsl: xgmac_mdio is dependent on OF_MDIO
net/fsl: Make xgmac_mdio read error message useful
net_sched: drr: warn when qdisc is not work conserving
...
The uuid structure could be managed as a const in several places.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We try to free two pages when only one has been allocated.
Cleanup path is unlikely, so I haven't found any trace that would fit,
but I hope that free_pages_prepare() does catch it.
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code posts periodic memory pressure status from a dedicated thread.
Under some conditions, especially when we are releasing a lot of memory into
the guest, we may not send timely pressure reports back to the host. Fix this
issue by reporting pressure in all contexts that can be active in this driver.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pfncount is of type u32 and thus can never be smaller than 0.
Found by the coverity scanner, CID 143213.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the mapping of the relID to channel is done under the protection of a
single spin lock. Starting with ws2012, each channel is bound to a specific VCPU
in the guest. Use this binding to eliminate the spin lock by setting up
per-cpu state for mapping relId to the channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
By ensuring that we set the callback handler to NULL in the channel close
path on the same CPU that the channel is bound to, we can eliminate this lock
acquisition and release in a performance critical path.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only ws2012r2 hosts support the ability to reconnect to the host on VMBUS. This functionality
is needed by kexec in Linux. To use this functionality we need to negotiate version 3.0 of the
VMBUS protocol.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org> [3.9+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's the big char/misc driver updates for 3.15-rc1.
Lots of various things here, including the new mcb driver subsystem.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iEYEABECAAYFAlM7ArIACgkQMUfUDdst+ylS+gCfcJr0Zo2v5aWnqD7rFtFETmFI
LhcAoNTQ4cvlVdxnI0driWCWFYxLj6at
=aj+L
-----END PGP SIGNATURE-----
Merge tag 'char-misc-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver patches from Greg KH:
"Here's the big char/misc driver updates for 3.15-rc1.
Lots of various things here, including the new mcb driver subsystem.
All of these have been in linux-next for a while"
* tag 'char-misc-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (118 commits)
extcon: Move OF helper function to extcon core and change function name
extcon: of: Remove unnecessary function call by using the name of device_node
extcon: gpio: Use SIMPLE_DEV_PM_OPS macro
extcon: palmas: Use SIMPLE_DEV_PM_OPS macro
mei: don't use deprecated DEFINE_PCI_DEVICE_TABLE macro
mei: amthif: fix checkpatch error
mei: client.h fix checkpatch errors
mei: use cl_dbg where appropriate
mei: fix Unnecessary space after function pointer name
mei: report consistently copy_from/to_user failures
mei: drop pr_fmt macros
mei: make me hw headers private to me hw.
mei: fix memory leak of pending write cb objects
mei: me: do not reset when less than expected data is received
drivers: mcb: Fix build error discovered by 0-day bot
cs5535-mfgpt: Simplify dependencies
spmi: pm: drop bus-level PM suspend/resume routines
spmi: pmic_arb: make selectable on ARCH_QCOM
Drivers: hv: vmbus: Increase the limit on the number of pfns we can handle
pch_phub: Report error writing MAC back to user
...
Compiling last minute changes without setting the proper config
options is not really clever.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The vmbus/hyperv interrupt handling is another complete trainwreck and
probably the worst of all currently in tree.
If CONFIG_HYPERV=y then the interrupt delivery to the vmbus happens
via the direct HYPERVISOR_CALLBACK_VECTOR. So far so good, but:
The driver requests first a normal device interrupt. The only reason
to do so is to increment the interrupt stats of that device
interrupt. For no reason it also installs a private flow handler.
We have proper accounting mechanisms for direct vectors, but of
course it's too much effort to add that 5 lines of code.
Aside of that the alloc_intr_gate() is not protected against
reallocation which makes module reload impossible.
Solution to the problem is simple to rip out the whole mess and
implement it correctly.
First of all move all that code to arch/x86/kernel/cpu/mshyperv.c and
merily install the HYPERVISOR_CALLBACK_VECTOR with proper reallocation
protection and use the proper direct vector accounting mechanism.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linuxdrivers <devel@linuxdriverproject.org>
Cc: x86 <x86@kernel.org>
Link: http://lkml.kernel.org/r/20140223212739.028307673@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use a resource for the hyperv mmio region instead of start/size
variables. Register the region properly so it shows up in
/proc/iomem.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement the file copy service for Linux guests on Hyper-V. This permits the
host to copy a file (over VMBUS) into the guest. This facility is part of
"guest integration services" supported on the Windows platform.
Here is a link that provides additional details on this functionality:
http://technet.microsoft.com/en-us/library/dn464282.aspx
In V1 version of the patch I have addressed comments from
Olaf Hering <olaf@aepfle.de> and Dan Carpenter <dan.carpenter@oracle.com>
In V2 version of this patch I did some minor cleanup (making some globals
static). In V4 version of the patch I have addressed all of Olaf's
most recent set of comments/concerns.
In V5 version of the patch I had addressed Greg's most recent comments.
I would like to thank Greg for suggesting that I use misc device; it has
significantly simplified the code.
In V6 version of the patch I have cleaned up error message based on Olaf's
comments. I have also rebased the patch based on the current tip.
In this version of the patch, I have addressed the latest comments from Greg.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The non-interruptible sleep of the memory pressure posting thread
results in higher reported load average. Make this sleep interruptible.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This allows replying only to the requestor portid while still
supporting broadcasting. Pass 0 to portid for the previous behavior.
Signed-off-by: David Fries <David@Fries.net>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current channel code is using scatterlist abstraction to pass data to the
ringbuffer API on the send path. This causes unnecessary translations
between virtual and physical addresses. Fix this.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Gen2 firmware, Hyper-V does not emulate the PCI bus. However, the MMIO
information is packaged up in DSDT. Extract this information and export it
for use by the synthetic framebuffer driver. This is the only driver that
needs this currently.
In this version of the patch mmio, I have updated the hyperv header file
(linux/hyperv.h) with mmio definitions.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the guest attempts to connect with the host when there may already be a
connection with the host (as would be the case during the kdump/kexec path),
it is difficult to guarantee timely response from the host. Starting with
WS2012 R2, the host supports this ability to re-connect with the host
(explicitly to support kexec). Prior to responding to the guest, the host
needs to ensure that device states based on the previous connection to
the host have been properly torn down. This may introduce unbounded delays.
To deal with this issue, don't do a timed wait during the initial connect
with the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During the initial VMBUS connect phase, starting with WS2012 R2, we should
specify the VPCU in the guest that should receive the notification. Fix this
issue. This fix is required to properly connect to the host in the kexeced
kernel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org> [3.9+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This will allow us to use bigger receive buffer, and prevent allocation failure
due to fragmented memory.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- ACPI core changes to make it create a struct acpi_device object for every
device represented in the ACPI tables during all namespace scans regardless
of the current status of that device. In accordance with this, ACPI hotplug
operations will not delete those objects, unless the underlying ACPI tables
go away.
- On top of the above, new sysfs attribute for ACPI device objects allowing
user space to check device status by triggering the execution of _STA for
its ACPI object. From Srinivas Pandruvada.
- ACPI core hotplug changes reducing code duplication, integrating the
PCI root hotplug with the core and reworking container hotplug.
- ACPI core simplifications making it use ACPI_COMPANION() in the code
"glueing" ACPI device objects to "physical" devices.
- ACPICA update to upstream version 20131218. This adds support for the
DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug
facilities. From Bob Moore, Lv Zheng and Betty Dall.
- Init code change to carry out the early ACPI initialization earlier.
That should allow us to use ACPI during the timekeeping initialization
and possibly to simplify the EFI initialization too. From Chun-Yi Lee.
- Clenups of the inclusions of ACPI headers in many places all over from
Lv Zheng and Rashika Kheria (work in progress).
- New helper for ACPI _DSM execution and rework of the code in drivers
that uses _DSM to execute it via the new helper. From Jiang Liu.
- New Win8 OSI blacklist entries from Takashi Iwai.
- Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo,
Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria,
Tang Chen, Zhang Rui.
- intel_pstate driver updates, including proper Baytrail support, from
Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra.
- Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski.
- powernow-k6 cpufreq driver fixes from Mikulas Patocka.
- cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown.
- Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias,
Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar.
- cpuidle cleanups from Bartlomiej Zolnierkiewicz.
- Support for hibernation APM events from Bin Shi.
- Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled
during thaw transitions from Bjørn Mork.
- PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson.
- PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa,
Rashika Kheria.
- New tool for profiling system suspend from Todd E Brandt and a cpupower
tool cleanup from One Thousand Gnomes.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJS3a1eAAoJEILEb/54YlRxnTgP/iGawvgjKWm6Qqp7WSIvd5gQ
zZ6q75C6Pc/W2fq1+OzVGnpCF8WYFy+nFDAXOvUHjIXuoxSwFcuW5l4aMckgl/0a
TXEWe9MJrCHHRfDApfFacCJ44U02bjJAD5vTyL/hKA+IHeinq4WCSojryYC+8jU0
cBrUIV0aNH8r5JR2WJNAyv/U29rXsDUOu0I4qTqZ4YaZT6AignMjtLXn1e9AH1Pn
DPZphTIo/HMnb+kgBOjt4snMk+ahVO9eCOxh/hH8ecnWExw9WynXoU5Nsna0tSZs
ssyHC7BYexD3oYsG8D52cFUpp4FCsJ0nFQNa2kw0LY+0FBNay43LySisKYHZPXEs
2WpESDv+/t7yhtnrvM+TtA7aBheKm2XMWGFSu/aERLE17jIidOkXKH5Y7ryYLNf/
uyRKxNS0NcZWZ0G+/wuY02jQYNkfYz3k/nTr8BAUItRBjdporGIRNEnR9gPzgCUC
uQhjXWMPulqubr8xbyefPWHTEzU2nvbXwTUWGjrBxSy8zkyy5arfqizUj+VG6afT
NsboANoMHa9b+xdzigSFdA3nbVK6xBjtU6Ywntk9TIpODKF5NgfARx0H+oSH+Zrj
32bMzgZtHw/lAbYsnQ9OnTY6AEWQYt6NMuVbTiLXrMHhM3nWwfg/XoN4nZqs6jPo
IYvE6WhQZU6L6fptGHFC
=dRf6
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"As far as the number of commits goes, the top spot belongs to ACPI
this time with cpufreq in the second position and a handful of PM
core, PNP and cpuidle updates. They are fixes and cleanups mostly, as
usual, with a couple of new features in the mix.
The most visible change is probably that we will create struct
acpi_device objects (visible in sysfs) for all devices represented in
the ACPI tables regardless of their status and there will be a new
sysfs attribute under those objects allowing user space to check that
status via _STA.
Consequently, ACPI device eject or generally hot-removal will not
delete those objects, unless the table containing the corresponding
namespace nodes is unloaded, which is extremely rare. Also ACPI
container hotplug will be handled quite a bit differently and cpufreq
will support CPU boost ("turbo") generically and not only in the
acpi-cpufreq driver.
Specifics:
- ACPI core changes to make it create a struct acpi_device object for
every device represented in the ACPI tables during all namespace
scans regardless of the current status of that device. In
accordance with this, ACPI hotplug operations will not delete those
objects, unless the underlying ACPI tables go away.
- On top of the above, new sysfs attribute for ACPI device objects
allowing user space to check device status by triggering the
execution of _STA for its ACPI object. From Srinivas Pandruvada.
- ACPI core hotplug changes reducing code duplication, integrating
the PCI root hotplug with the core and reworking container hotplug.
- ACPI core simplifications making it use ACPI_COMPANION() in the
code "glueing" ACPI device objects to "physical" devices.
- ACPICA update to upstream version 20131218. This adds support for
the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves
debug facilities. From Bob Moore, Lv Zheng and Betty Dall.
- Init code change to carry out the early ACPI initialization
earlier. That should allow us to use ACPI during the timekeeping
initialization and possibly to simplify the EFI initialization too.
From Chun-Yi Lee.
- Clenups of the inclusions of ACPI headers in many places all over
from Lv Zheng and Rashika Kheria (work in progress).
- New helper for ACPI _DSM execution and rework of the code in
drivers that uses _DSM to execute it via the new helper. From
Jiang Liu.
- New Win8 OSI blacklist entries from Takashi Iwai.
- Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun
Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava,
Rashika Kheria, Tang Chen, Zhang Rui.
- intel_pstate driver updates, including proper Baytrail support,
from Dirk Brandewie and intel_pstate documentation from Ramkumar
Ramachandra.
- Generic CPU boost ("turbo") support for cpufreq from Lukasz
Majewski.
- powernow-k6 cpufreq driver fixes from Mikulas Patocka.
- cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark
Brown.
- Assorted cpufreq drivers fixes and cleanups from Anson Huang, John
Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh
Kumar.
- cpuidle cleanups from Bartlomiej Zolnierkiewicz.
- Support for hibernation APM events from Bin Shi.
- Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC
disabled during thaw transitions from Bjørn Mork.
- PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf
Hansson.
- PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente
Kurusa, Rashika Kheria.
- New tool for profiling system suspend from Todd E Brandt and a
cpupower tool cleanup from One Thousand Gnomes"
* tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits)
thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412)
cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ
Documentation: cpufreq / boost: Update BOOST documentation
cpufreq: exynos: Extend Exynos cpufreq driver to support boost
cpufreq / boost: Kconfig: Support for software-managed BOOST
acpi-cpufreq: Adjust the code to use the common boost attribute
cpufreq: Add boost frequency support in core
intel_pstate: Add trace point to report internal state.
cpufreq: introduce cpufreq_generic_get() routine
ARM: SA1100: Create dummy clk_get_rate() to avoid build failures
cpufreq: stats: create sysfs entries when cpufreq_stats is a module
cpufreq: stats: free table and remove sysfs entry in a single routine
cpufreq: stats: remove hotplug notifiers
cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly
cpufreq: speedstep: remove unused speedstep_get_state
platform: introduce OF style 'modalias' support for platform bus
PM / tools: new tool for suspend/resume performance optimization
ACPI: fix module autoloading for ACPI enumerated devices
ACPI: add module autoloading support for ACPI enumerated devices
ACPI: fix create_modalias() return value handling
...
This patch marks the function hv_synic_free_cpu() as static in hv.c
because it is not used outside this file.
Thus, it also eliminates the following warning in hv.c:
drivers/hv/hv.c:304:6: warning: no previous prototype for ‘hv_synic_free_cpu’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
<acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
inclusions and remove some inclusions of those files that aren't
necessary.
First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
should not be included directly from any files that are built for
CONFIG_ACPI unset, because that generally leads to build warnings about
undefined symbols in !CONFIG_ACPI builds. For CONFIG_ACPI set,
<linux/acpi.h> includes those files and for CONFIG_ACPI unset it
provides stub ACPI symbols to be used in that case.
Second, there are ordering dependencies between those files that always
have to be met. Namely, it is required that <acpi/acpi_bus.h> be included
prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
latter depends on are always there. And <acpi/acpi.h> which provides
basic ACPICA type declarations should always be included prior to any other
ACPI headers in CONFIG_ACPI builds. That also is taken care of including
<linux/acpi.h> as appropriate.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rescind of subchannels were not being correctly handled. Fix the bug.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org> [3.11+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The check for calling free_page() on hv_context.synic_event_page[cpu] is the
same for hv_context.synic_message_page[cpu], like a copy-paste error.
Signed-off-by: Felipe Pena <felipensp@gmail.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need/want the mei fixes in here so we can apply other updates that
are depending on them.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't return success if the buffer has not been initialized.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 666b9adc80 terminated vmbus
version negotiation incorrectly. We need to terminate the version
negotiation only if the current negotiation were to timeout.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Olaf Hering <ohering@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code does not correctly negotiate the version numbers for the util
driver when hosted on earlier hosts. The version numbers presented by this
driver were not compatible with the version numbers supported by Windows Server
2008. Fix this problem.
I would like to thank Olaf Hering (ohering@suse.com) for identifying the problem.
Reported-by: Olaf Hering <ohering@suse.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the ringbuffer bus attributes to the dev_groups structure,
deletes the now unneeded struct hv_device_info, and removes some now
unused functions, and variables as everything is now moved to the
dev_groups structure, dev_attrs is no longer needed.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's no longer needed, and the struct hv_ring_buffer_debug_info
structure shouldn't be "global" so move it to the local .h file instead.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's only used once, only contains 2 function calls, so just make those
calls directly, deleting the function, and the now unneeded structure
entirely.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves the "client_monitor_conn_id" and "server_monitor_conn_id" bus
attributes to the dev_groups structure, removing the need for it to be
in a temporary structure.
Tested-by: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>