author    Steven Luong <sluong@cisco.com>            2019-02-01 10:23:56 -0800
committer Damjan Marion <dmarion@me.com>             2019-02-21 22:03:40 +0000
commit    67f935ec6eb9ec37b7d73029c5afa89cbf4a9aa2
tree      f32c614f93d27697115ffd412eeb1c28819ee7ea   /src/vnet/devices/virtio/vhost_user.h
parent    3be3cd60bc41d095174ea413aa753a7bd9eff73e
vhost: VPP stalls with vhost performing control plane actions
Symptom
-------
With NDR traffic blasting at VPP, bringing up a new VM with a vhost
connection to VPP causes packet drops. I am able to recreate this
problem easily using a simple setup like this:
TREX-------------- switch ---- VPP
|---------------| |-------|
Cause
-----
The packet drops are caused by vhost holding the worker barrier lock
for too long in vhost_user_socket_read(). The routine makes quite a
few system calls while the barrier is held. At the end, it
unconditionally calls vhost_user_update_iface_state() for all message
types, and vhost_user_update_iface_state() in turn unconditionally calls
vhost_user_rx_thread_placement() and vhost_user_tx_thread_placement().
vhost_user_rx_thread_placement() scraps all existing cpu/queue mappings
for the interface and creates brand new ones. This process is very
disruptive and very expensive. In my opinion, this area of code needs a
makeover. The sketch below illustrates the shape of the problem.
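
A minimal sketch of the problematic shape; worker_barrier_sync(),
worker_barrier_release(), and update_iface_state() are hypothetical
stand-ins for the vlib barrier API and the state-update routine, not
the actual VPP code. Every system call in the read path runs while the
worker barrier is held, so every worker thread stalls for its duration:

  #include <unistd.h>

  void worker_barrier_sync (void);     /* stand-in: stops all worker threads */
  void worker_barrier_release (void);  /* stand-in: lets workers resume */
  void update_iface_state (void);      /* stand-in: scraps and re-creates
                                          all cpu/queue mappings */

  static void
  socket_read_old_shape (int fd)
  {
    char buf[4096];

    worker_barrier_sync ();            /* every worker stops forwarding here */

    /* recvmsg/read, mmap of guest memory, eventfd setup: all of these
       blocking syscalls execute with the barrier held */
    if (read (fd, buf, sizeof (buf)) <= 0)
      goto done;

    /* ... handle the control message ... */

    update_iface_state ();             /* unconditional, for every msg type */

  done:
    worker_barrier_release ();         /* workers resume only now */
  }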
Fixes
-----
* vhost_user_socket_read() is rewritten so that it does not hold the
worker barrier lock across system calls, or at least minimizes the
need for doing so.
* Remove the unconditional call to vhost_user_update_iface_state() at
the end of vhost_user_socket_read(). Only a couple of message types
really need vhost_user_update_iface_state(); call it only for those
message types.
* Remove vhost_user_rx_thread_placement() and
vhost_user_tx_thread_placement() from vhost_user_update_iface_state().
There is no need to repeatedly change the cpu/queue mappings.
* vhost_user_rx_thread_placement() is actually quite expensive. It should
be called only once per queue for the interface. There is no need to
scrap the existing cpu/queue mappings and create new ones when
additional queues become active/enabled.
* Create the cpu/queue mapping for the first RX queue when the
interface is created. Don't remove the cpu/queue mapping when the
interface is disconnected; remove it only when the interface is
deleted. A sketch of this one-time placement follows this list.
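
A minimal sketch of the one-time placement, built around the i16 qid
field this patch adds to vhost_user_vring_t. The struct and helper
names below are illustrative stand-ins, and assign_rx_thread() stands
in for vnet_hw_interface_assign_rx_thread():

  #include <stdint.h>

  typedef struct
  {
    uint32_t mode;  /* rx queue policy (interrupt/adaptive/polling) */
    int16_t qid;    /* device queue number; -1 until first assignment */
  } vring_sketch_t;

  void assign_rx_thread (int16_t queue_id);  /* stand-in for
                                                vnet_hw_interface_assign_rx_thread */

  static void
  rx_thread_placement_once (vring_sketch_t * vring, int16_t qid)
  {
    if (vring->qid != -1)
      return;                /* already mapped; the mapping survives
                                disconnect/reconnect cycles */

    vring->qid = qid;        /* remember the device queue number */
    assign_rx_thread (qid);  /* invoked at most once per queue */
  }

The mapping is torn down, and qid reset to -1, only when the interface
is deleted, not when it is disconnected.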
The create vhost user interface CLI also makes some very expensive
system calls when the command is entered with the optional keyword
"server". As a bonus, this patch makes the create vhost user interface
binary-api and CLI thread safe: only the small amount of code which is
thread unsafe is protected, as sketched below.
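
A minimal sketch of that protection, again with hypothetical
worker_barrier_sync()/worker_barrier_release() stand-ins for the vlib
barrier API: the expensive server-mode syscalls run with no lock held,
and only the short section that mutates shared state is serialized.

  #include <string.h>
  #include <sys/socket.h>
  #include <sys/un.h>
  #include <unistd.h>

  void worker_barrier_sync (void);     /* stand-ins for the vlib barrier API */
  void worker_barrier_release (void);

  static int
  create_server_socket_then_register (const char *path)
  {
    struct sockaddr_un sun = { .sun_family = AF_UNIX };
    int fd;

    /* Expensive syscalls first, with no lock held. */
    if ((fd = socket (AF_UNIX, SOCK_STREAM, 0)) < 0)
      return -1;
    strncpy (sun.sun_path, path, sizeof (sun.sun_path) - 1);
    if (bind (fd, (struct sockaddr *) &sun, sizeof (sun)) < 0
        || listen (fd, 1) < 0)
      {
        close (fd);
        return -1;
      }

    /* Serialize only the small thread-unsafe section that records the
       new interface in shared state. */
    worker_barrier_sync ();
    /* ... allocate the interface, store fd ... */
    worker_barrier_release ();
    return fd;
  }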
Change-Id: I4a19cbf7e9cc37ea01286169882e5603e6d7eb77
Signed-off-by: Steven Luong <sluong@cisco.com>
Diffstat (limited to 'src/vnet/devices/virtio/vhost_user.h')
-rw-r--r--   src/vnet/devices/virtio/vhost_user.h   11
1 file changed, 8 insertions, 3 deletions
diff --git a/src/vnet/devices/virtio/vhost_user.h b/src/vnet/devices/virtio/vhost_user.h
index f2ed2dffd46..7dadfed2334 100644
--- a/src/vnet/devices/virtio/vhost_user.h
+++ b/src/vnet/devices/virtio/vhost_user.h
@@ -250,6 +250,14 @@ typedef struct

   /* The rx queue policy (interrupt/adaptive/polling) for this queue */
   u32 mode;
+
+  /*
+   * It contains the device queue number. -1 if it does not. The idea is
+   * to not invoke vnet_hw_interface_assign_rx_thread and
+   * vnet_hw_interface_unassign_rx_thread more than once for the duration of
+   * the interface even if it is disconnected and reconnected.
+   */
+  i16 qid;
 } vhost_user_vring_t;

 #define VHOST_USER_EVENT_START_TIMER 1
@@ -293,9 +301,6 @@ typedef struct
   /* Whether to use spinlock or per_cpu_tx_qid assignment */
   u8 use_tx_spinlock;
   u16 *per_cpu_tx_qid;
-
-  /* Vector of active rx queues for this interface */
-  u16 *rx_queues;
 } vhost_user_intf_t;

 typedef struct