Diffstat (limited to 'doc/guides/prog_guide/vhost_lib.rst')
-rw-r--r-- | doc/guides/prog_guide/vhost_lib.rst | 94 |
1 files changed, 39 insertions, 55 deletions
diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 6b0c6b26..4f997d47 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -46,26 +46,8 @@ vhost library should be able to:
 * Know all the necessary information about the vring:
 
   Information such as where the available ring is stored. Vhost defines some
-  messages to tell the backend all the information it needs to know how to
-  manipulate the vring.
-
-Currently, there are two ways to pass these messages and as a result there are
-two Vhost implementations in DPDK: *vhost-cuse* (where the character devices
-are in user space) and *vhost-user*.
-
-Vhost-cuse creates a user space character device and hook to a function ioctl,
-so that all ioctl commands that are sent from the frontend (QEMU) will be
-captured and handled.
-
-Vhost-user creates a Unix domain socket file through which messages are
-passed.
-
-.. Note::
-
-    Since DPDK v2.2, the majority of the development effort has gone into
-    enhancing vhost-user, such as multiple queue, live migration, and
-    reconnect. Thus, it is strongly advised to use vhost-user instead of
-    vhost-cuse.
+  messages (passed through a Unix domain socket file) to tell the backend all
+  the information it needs to know how to manipulate the vring.
 
 
 Vhost API Overview
@@ -75,11 +57,10 @@ The following is an overview of the Vhost API functions:
 
 * ``rte_vhost_driver_register(path, flags)``
 
-  This function registers a vhost driver into the system. For vhost-cuse, a
-  ``/dev/path`` character device file will be created. For vhost-user server
-  mode, a Unix domain socket file ``path`` will be created.
+  This function registers a vhost driver into the system. ``path`` specifies
+  the Unix domain socket file path.
 
-  Currently two flags are supported (these are valid for vhost-user only):
+  Currently supported flags are:
 
   - ``RTE_VHOST_USER_CLIENT``
 
@@ -97,6 +78,38 @@ The following is an overview of the Vhost API functions:
     This reconnect option is enabled by default. However, it can be turned off
     by setting this flag.
 
+  - ``RTE_VHOST_USER_DEQUEUE_ZERO_COPY``
+
+    Dequeue zero copy will be enabled when this flag is set. It is disabled by
+    default.
+
+    There are some truths (including limitations) you might want to know while
+    setting this flag:
+
+    * zero copy is not good for small packets (typically for packet size below
+      512).
+
+    * zero copy is really good for VM2VM case. For iperf between two VMs, the
+      boost could be above 70% (when TSO is enabled).
+
+    * for VM2NIC case, the ``nb_tx_desc`` has to be small enough: <= 64 if virtio
+      indirect feature is not enabled and <= 128 if it is enabled.
+
+      This is because when dequeue zero copy is enabled, guest Tx used vring will
+      be updated only when corresponding mbuf is freed. Thus, the nb_tx_desc
+      has to be small enough so that the PMD driver will run out of available
+      Tx descriptors and free mbufs timely. Otherwise, guest Tx vring would be
+      starved.
+
+    * Guest memory should be backed by huge pages to achieve better
+      performance. Using 1G page size is the best.
+
+      When dequeue zero copy is enabled, the guest phys address and host phys
+      address mapping has to be established. Using non-huge pages means far
+      more page segments. To make it simple, DPDK vhost does a linear search
+      of those segments, thus the fewer the segments, the quicker we will get
+      the mapping. NOTE: we may speed it by using tree searching in future.
+
 * ``rte_vhost_driver_session_start()``
 
   This function starts the vhost session loop to handle vhost messages. It
@@ -139,35 +152,8 @@ The following is an overview of the Vhost API functions:
   default.
 
 
-Vhost Implementations
----------------------
-
-Vhost-cuse implementation
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-When vSwitch registers the vhost driver, it will register a cuse device driver
-into the system and creates a character device file. This cuse driver will
-receive vhost open/release/IOCTL messages from the QEMU simulator.
-
-When the open call is received, the vhost driver will create a vhost device
-for the virtio device in the guest.
-
-When the ``VHOST_SET_MEM_TABLE`` ioctl is received, vhost searches the memory
-region to find the starting user space virtual address that maps the memory of
-the guest virtual machine. Through this virtual address and the QEMU pid,
-vhost can find the file QEMU uses to map the guest memory. Vhost maps this
-file into its address space, in this way vhost can fully access the guest
-physical memory, which means vhost could access the shared virtio ring and the
-guest physical address specified in the entry of the ring.
-
-The guest virtual machine tells the vhost whether the virtio device is ready
-for processing or is de-activated through the ``VHOST_NET_SET_BACKEND``
-message. The registered callback from vSwitch will be called.
-
-When the release call is made, vhost will destroy the device.
-
-Vhost-user implementation
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Vhost-user Implementations
+--------------------------
 
 Vhost-user uses Unix domain sockets for passing messages. This means the DPDK
 vhost-user implementation has two options:
@@ -214,8 +200,6 @@ For ``VHOST_SET_MEM_TABLE`` message, QEMU will send information for each memory
 region and its file descriptor in the ancillary data of the message. The file
 descriptor is used to map that region.
 
-There is no ``VHOST_NET_SET_BACKEND`` message as in vhost-cuse to signal
-whether the virtio device is ready or stopped.
 Instead, ``VHOST_SET_VRING_KICK`` is used as the signal to put the vhost
 device into the data plane, and ``VHOST_GET_VRING_BASE`` is used as the signal
 to remove the vhost device from the data plane.
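
Review note: the zero-copy text added in this patch says DPDK vhost does a linear search over guest page segments to map a guest physical address to a host physical address. The sketch below illustrates that idea only; it is not taken from the DPDK sources. The struct ``guest_page_seg``, the function ``gpa_to_hpa_linear``, and the convention of returning 0 when no segment matches are all hypothetical choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one physically contiguous piece of guest memory. */
struct guest_page_seg {
    uint64_t guest_phys_addr; /* segment start in guest physical space */
    uint64_t host_phys_addr;  /* same segment's start in host physical space */
    uint64_t size;            /* segment length in bytes */
};

/*
 * Linear search over the segment table. Returns the host physical address
 * backing [gpa, gpa + len), or 0 (assumed sentinel) when no single segment
 * covers the whole range.
 */
static uint64_t
gpa_to_hpa_linear(const struct guest_page_seg *segs, size_t nr_segs,
                  uint64_t gpa, uint64_t len)
{
    for (size_t i = 0; i < nr_segs; i++) {
        const struct guest_page_seg *s = &segs[i];

        /* Overflow-safe containment check: offset + len <= size. */
        if (gpa >= s->guest_phys_addr &&
            len <= s->size &&
            gpa - s->guest_phys_addr <= s->size - len)
            return s->host_phys_addr + (gpa - s->guest_phys_addr);
    }

    return 0; /* no segment fully contains the requested range */
}
```

The lookup cost grows with the number of segments, which is why the patch recommends huge pages: a 4 GiB guest backed by 1G pages needs at most 4 entries, while 4 KiB pages could need up to 1,048,576 entries if no neighbouring pages happen to be physically contiguous.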