aboutsummaryrefslogtreecommitdiffstats
path: root/doc/guides/prog_guide/vhost_lib.rst
diff options
context:
space:
mode:
authorLuca Boccassi <luca.boccassi@gmail.com>2018-08-14 18:52:30 +0100
committerLuca Boccassi <luca.boccassi@gmail.com>2018-08-14 18:53:17 +0100
commitb63264c8342e6a1b6971c79550d2af2024b6a4de (patch)
tree83114aac64286fe616506c0b3dfaec2ab86ef835 /doc/guides/prog_guide/vhost_lib.rst
parentca33590b6af032bff57d9cc70455660466a654b2 (diff)
New upstream version 18.08upstream/18.08
Change-Id: I32fdf5e5016556d9c0a6d88ddaf1fc468961790a Signed-off-by: Luca Boccassi <luca.boccassi@gmail.com>
Diffstat (limited to 'doc/guides/prog_guide/vhost_lib.rst')
-rw-r--r--doc/guides/prog_guide/vhost_lib.rst128
1 files changed, 121 insertions, 7 deletions
diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 18227b6a..77af4d77 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -65,14 +65,18 @@ The following is an overview of some key Vhost API functions:
* zero copy is really good for VM2VM case. For iperf between two VMs, the
boost could be above 70% (when TSO is enableld).
- * for VM2NIC case, the ``nb_tx_desc`` has to be small enough: <= 64 if virtio
- indirect feature is not enabled and <= 128 if it is enabled.
+ * For zero copy in VM2NIC case, guest Tx used vring may be starved if the
+ PMD driver consume the mbuf but not release them timely.
- This is because when dequeue zero copy is enabled, guest Tx used vring will
- be updated only when corresponding mbuf is freed. Thus, the nb_tx_desc
- has to be small enough so that the PMD driver will run out of available
- Tx descriptors and free mbufs timely. Otherwise, guest Tx vring would be
- starved.
+ For example, i40e driver has an optimization to maximum NIC pipeline which
+ postpones returning transmitted mbuf until only tx_free_threshold free
+ descs left. The virtio TX used ring will be starved if the formula
+ (num_i40e_tx_desc - num_virtio_tx_desc > tx_free_threshold) is true, since
+ i40e will not return back mbuf.
+
+ A performance tip for tuning zero copy in VM2NIC case is to adjust the
+ frequency of mbuf free (i.e. adjust tx_free_threshold of i40e driver) to
+ balance consumer and producer.
* Guest memory should be backended with huge pages to achieve better
performance. Using 1G page size is the best.
@@ -83,6 +87,11 @@ The following is an overview of some key Vhost API functions:
of those segments, thus the fewer the segments, the quicker we will get
the mapping. NOTE: we may speed it by using tree searching in future.
+ * zero copy can not work when using vfio-pci with iommu mode currently, this
+ is because we don't setup iommu dma mapping for guest memory. If you have
+ to use vfio-pci driver, please insert vfio-pci kernel module in noiommu
+ mode.
+
- ``RTE_VHOST_USER_IOMMU_SUPPORT``
IOMMU support will be enabled when this flag is set. It is disabled by
@@ -160,6 +169,31 @@ The following is an overview of some key Vhost API functions:
Receives (dequeues) ``count`` packets from guest, and stored them at ``pkts``.
+* ``rte_vhost_crypto_create(vid, cryptodev_id, sess_mempool, socket_id)``
+
+ As an extension of new_device(), this function adds virtio-crypto workload
+ acceleration capability to the device. All crypto workload is processed by
+ DPDK cryptodev with the device ID of ``cryptodev_id``.
+
+* ``rte_vhost_crypto_free(vid)``
+
+ Frees the memory and vhost-user message handlers created in
+ rte_vhost_crypto_create().
+
+* ``rte_vhost_crypto_fetch_requests(vid, queue_id, ops, nb_ops)``
+
+ Receives (dequeues) ``nb_ops`` virtio-crypto requests from guest, parses
+ them to DPDK Crypto Operations, and fills the ``ops`` with parsing results.
+
+* ``rte_vhost_crypto_finalize_requests(queue_id, ops, nb_ops)``
+
+ After the ``ops`` are dequeued from Cryptodev, finalizes the jobs and
+ notifies the guest(s).
+
+* ``rte_vhost_crypto_set_zero_copy(vid, option)``
+
+ Enable or disable zero copy feature of the vhost crypto backend.
+
Vhost-user Implementations
--------------------------
@@ -214,8 +248,88 @@ the vhost device from the data plane.
When the socket connection is closed, vhost will destroy the device.
+Guest memory requirement
+------------------------
+
+* Memory pre-allocation
+
+ For non-zerocopy, guest memory pre-allocation is not a must. This can help
+ save of memory. If users really want the guest memory to be pre-allocated
+ (e.g., for performance reason), we can add option ``-mem-prealloc`` when
+ starting QEMU. Or, we can lock all memory at vhost side which will force
+ memory to be allocated when mmap at vhost side; option --mlockall in
+ ovs-dpdk is an example in hand.
+
+ For zerocopy, we force the VM memory to be pre-allocated at vhost lib when
+ mapping the guest memory; and also we need to lock the memory to prevent
+ pages being swapped out to disk.
+
+* Memory sharing
+
+ Make sure ``share=on`` QEMU option is given. vhost-user will not work with
+ a QEMU version without shared memory mapping.
+
Vhost supported vSwitch reference
---------------------------------
For more vhost details and how to support vhost in vSwitch, please refer to
the vhost example in the DPDK Sample Applications Guide.
+
+Vhost data path acceleration (vDPA)
+-----------------------------------
+
+vDPA supports selective datapath in vhost-user lib by enabling virtio ring
+compatible devices to serve virtio driver directly for datapath acceleration.
+
+``rte_vhost_driver_attach_vdpa_device`` is used to configure the vhost device
+with accelerated backend.
+
+Also vhost device capabilities are made configurable to adopt various devices.
+Such capabilities include supported features, protocol features, queue number.
+
+Finally, a set of device ops is defined for device specific operations:
+
+* ``get_queue_num``
+
+ Called to get supported queue number of the device.
+
+* ``get_features``
+
+ Called to get supported features of the device.
+
+* ``get_protocol_features``
+
+ Called to get supported protocol features of the device.
+
+* ``dev_conf``
+
+ Called to configure the actual device when the virtio device becomes ready.
+
+* ``dev_close``
+
+ Called to close the actual device when the virtio device is stopped.
+
+* ``set_vring_state``
+
+ Called to change the state of the vring in the actual device when vring state
+ changes.
+
+* ``set_features``
+
+ Called to set the negotiated features to device.
+
+* ``migration_done``
+
+ Called to allow the device to response to RARP sending.
+
+* ``get_vfio_group_fd``
+
+ Called to get the VFIO group fd of the device.
+
+* ``get_vfio_device_fd``
+
+ Called to get the VFIO device fd of the device.
+
+* ``get_notify_area``
+
+ Called to get the notify area info of the queue.