vhost_net: a kernel-level virtio server
What it is: vhost net is a character device that can be used to reduce
the number of system calls involved in virtio networking.
Existing virtio net code is used in the guest without modification.
There's similarity with vringfd, with some differences and reduced scope:
- uses eventfd for signalling
- structures can be moved around in memory at any time (good for
migration, bug work-arounds in userspace)
- write logging is supported (good for migration)
- supports a memory table and not just an offset (needed for kvm)
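The memory-table item in the list above can be pictured with a small lookup routine. This is a minimal sketch, not the vhost implementation: `struct toy_region` and `toy_translate()` are hypothetical names, and the three fields only mirror the general idea of a region that maps a guest-physical range onto a userspace address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: one contiguous guest-physical range
 * and the userspace address it is mapped at. */
struct toy_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
};

/* Translate a guest-physical address through the table.
 * Returns 0 when no region covers the address. */
static uint64_t toy_translate(const struct toy_region *regions, size_t n,
                              uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        const struct toy_region *r = &regions[i];

        /* Subtract first so the range check cannot overflow. */
        if (gpa >= r->guest_phys_addr &&
            gpa - r->guest_phys_addr < r->memory_size)
            return r->userspace_addr + (gpa - r->guest_phys_addr);
    }
    return 0;
}
```

A single offset can only describe one contiguous mapping; a table like this can also describe holes in the guest address space, which is presumably why the list calls it out as needed for kvm.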
Common virtio-related code has been put in a separate file, vhost.c, and
can be made into a separate module if/when more backends appear. I used
Rusty's lguest.c as the source for developing this part: this supplied
me with witty comments I wouldn't be able to write myself.
What it is not: vhost net is not a bus, and not a generic new system
call. No assumptions are made on how guest performs hypercalls.
Userspace hypervisors are supported as well as kvm.
How it works: Basically, we connect a virtio frontend (configured by
userspace) to a backend. The backend could be a network device or a tap
device. The backend is also configured by userspace, including vlan/mac
etc.
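From userspace, the connection described above might look roughly like the following. This is a sketch under assumptions: the helper names `vhost_open_owned()` and `vhost_net_attach_backend()` are hypothetical, the device path is supplied by the caller, and the steps a real hypervisor also needs before attaching a backend (feature negotiation, memory table, vring addresses, kick/call eventfds) are left out. Only the `VHOST_SET_OWNER` and `VHOST_NET_SET_BACKEND` ioctls and `struct vhost_vring_file` come from `<linux/vhost.h>`.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vhost.h>

/* Open the vhost character device at `path` and bind it to the calling
 * process. Returns the fd on success or a negative errno value. */
static int vhost_open_owned(const char *path)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -errno;
    if (ioctl(fd, VHOST_SET_OWNER) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}

/* Attach a backend fd (for example a tap device) to one virtqueue.
 * Returns 0 on success or a negative errno value. */
static int vhost_net_attach_backend(int vhost_fd, unsigned int vq_index,
                                    int backend_fd)
{
    struct vhost_vring_file file = { .index = vq_index, .fd = backend_fd };

    if (ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &file) < 0)
        return -errno;
    return 0;
}
```

Both helpers return negative errno values so that a caller can tell a missing device apart from a rejected ioctl.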
Status: This works for me, and I haven't seen any crashes.
Compared to userspace, people reported improved latency (as I save up to
4 system calls per packet), as well as better bandwidth and CPU
utilization.
Features that I plan to look at in the future:
- mergeable buffers
- zero copy
- scalability tuning: figure out the best threading model to use
Note on RCU usage (this is also documented in vhost.h, near
private_pointer which is the value protected by this variant of RCU):
what is happening is that the rcu_dereference() is being used in a
workqueue item. The role of rcu_read_lock() is taken on by the start of
execution of the workqueue item, of rcu_read_unlock() by the end of
execution of the workqueue item, and of synchronize_rcu() by
flush_workqueue()/flush_work(). In the future we might need to apply
some gcc attribute or sparse annotation to the function passed to
INIT_WORK(). Paul's ack below is for this RCU usage.
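The convention in the RCU note above can be modelled in plain userspace C. The sketch below is a single-threaded toy and not kernel code: every `toy_*` name is hypothetical, the "work queue" holds at most one item, and "flush" simply runs it. It only shows the ordering the note relies on: publish the new pointer, flush, and only then free the old one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_backend { int id; };

struct toy_vq {
    struct toy_backend *private_data; /* pointer protected by the convention */
    int work_pending;                 /* 1 while a work item is queued */
    int last_seen_id;                 /* id the handler last read, -1 if none */
};

/* Work handler: entering it plays the role of rcu_read_lock(),
 * returning from it plays the role of rcu_read_unlock(). */
static void toy_handle_work(struct toy_vq *vq)
{
    struct toy_backend *b = vq->private_data; /* stands in for rcu_dereference() */

    vq->last_seen_id = b ? b->id : -1;
}

static void toy_queue_work(struct toy_vq *vq)
{
    vq->work_pending = 1;
}

/* Stands in for flush_work(): returns only once any queued item has run. */
static void toy_flush_work(struct toy_vq *vq)
{
    if (vq->work_pending) {
        vq->work_pending = 0;
        toy_handle_work(vq);
    }
}

/* Updater: publish the new pointer, wait out the "grace period" by
 * flushing, then free the old object. Returns the id of the freed
 * backend, or -1 if there was none. */
static int toy_set_backend(struct toy_vq *vq, struct toy_backend *new_backend)
{
    struct toy_backend *old = vq->private_data;
    int old_id = old ? old->id : -1;

    vq->private_data = new_backend;
    toy_flush_work(vq);
    assert(!vq->work_pending); /* no handler can still hold `old` */
    free(old);
    return old_id;
}
```

In this toy the flush is trivially a grace period because nothing runs concurrently; in the kernel the same guarantee has to come from flush_workqueue()/flush_work() actually waiting for the running item.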
(Includes fixes by Alan Cox <alan@linux.intel.com>,
David L Stevens <dlstevens@us.ibm.com>,
Chris Wright <chrisw@redhat.com>)
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-14 09:17:27 +03:00
config VHOST_NET
2013-01-17 06:53:56 +04:00
	tristate "Host kernel accelerator for virtio net"
	depends on NET && EVENTFD && (TUN || !TUN) && (MACVTAP || !MACVTAP)
2013-05-06 12:38:21 +04:00
	select VHOST
2013-03-20 07:20:14 +04:00
	select VHOST_RING
2010-01-14 09:17:27 +03:00
	---help---
	  This kernel module can be loaded in host kernel to accelerate
	  guest networking with virtio_net. Not to be confused with virtio_net
	  module itself which needs to be loaded in guest kernel.

	  To compile this driver as a module, choose M here: the module will
	  be called vhost_net.

2013-05-02 04:52:59 +04:00
config VHOST_SCSI
	tristate "VHOST_SCSI TCM fabric driver"
	depends on TARGET_CORE && EVENTFD && m
2013-05-06 12:38:21 +04:00
	select VHOST
2013-05-03 01:14:04 +04:00
	select VHOST_RING
2013-05-02 04:52:59 +04:00
	default n
	---help---
	  Say M here to enable the vhost_scsi TCM fabric module
	  for use with virtio-scsi guests
2013-03-20 07:20:14 +04:00

config VHOST_RING
	tristate
	---help---
	  This option is selected by any driver which needs to access
	  the host side of a virtio ring.
2013-05-06 12:38:21 +04:00

config VHOST
	tristate
	---help---
	  This option is selected by any driver which needs to access
	  the core of vhost.
2015-04-24 15:27:24 +03:00

config VHOST_CROSS_ENDIAN_LEGACY
	bool "Cross-endian support for vhost"
	default n
	---help---
	  This option allows vhost to support guests with a different byte
	  ordering from host while using legacy virtio.

	  Userspace programs can control the feature using the
	  VHOST_SET_VRING_ENDIAN and VHOST_GET_VRING_ENDIAN ioctls.

	  This is only useful on a few platforms (ppc64 and arm64). Since it
	  adds some overhead, it is disabled by default.

	  If unsure, say "N".
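The VHOST_SET_VRING_ENDIAN ioctl named in the help text above could be driven from userspace along these lines. This is a hedged sketch: the wrapper name `vhost_set_vring_endian()` is hypothetical, and it assumes a host kernel built with VHOST_CROSS_ENDIAN_LEGACY and a vhost fd that has already been opened and owned. The ioctl number, `struct vhost_vring_state`, and the `VHOST_VRING_*_ENDIAN` constants come from `<linux/vhost.h>`.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Tell vhost which byte order a legacy guest uses for one virtqueue.
 * Returns 0 on success or a negative errno value. */
static int vhost_set_vring_endian(int vhost_fd, unsigned int vq_index,
                                  int big_endian)
{
    struct vhost_vring_state state = {
        .index = vq_index,
        .num = big_endian ? VHOST_VRING_BIG_ENDIAN
                          : VHOST_VRING_LITTLE_ENDIAN,
    };

    if (ioctl(vhost_fd, VHOST_SET_VRING_ENDIAN, &state) < 0)
        return -errno;
    return 0;
}
```

On a kernel built without the option the call is expected to fail, so a caller should treat a negative return as "cross-endian legacy guests are not supported here" rather than as a fatal error.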