Age | Commit message (Collapse) | Author | Files | Lines |
|
Symptom
-------
With NDR traffic blasting at VPP, bringing up a new VM with vhost
connection to VPP causes packet drops. I am able to recreate this
problem easily using a simple setup like this.
TREX-------------- switch ---- VPP
|---------------| |-------|
Cause
-----
The reason for the packet drops is due to vhost holding onto the worker
barrier lock for too long in vhost_user_socket_read(). There are quite a
few of system calls inside the routine. At the end of the routine, it
unconditionally calls vhost_user_update_iface_state() for all message
types. vhost_user_update_iface_state() also unconditionally calls
vhost_user_rx_thread_placement() and vhost_user_tx_thread_placement().
vhost_user_rx_thread_placement scraps out all existing cpu/queue mappings
for the interface and creates brand new cpu/queue mappings for the
interface. This process is very disruptive and very expensive. In my
opinion, this area of code needs a makeover.
Fixes
-----
* vhost_user_socket_read() is rewritten that it should not hold
onto the worker barrier lock for system calls, or at least minimize the
need for doing it.
* Remove the call to vhost_user_update_iface_state as a default route at
the end of vhost_user_socket_read(). There is only a couple of message
types which really need to call vhost_user_update_iface_state(). We put
the call to those message types which need it.
* Remove vhost_user_rx_thread_placement() and
vhost_user_tx_thread_placement from vhost_user_update_iface_state().
There is no need to repetatively change the cpu/queue mappings.
* vhost_user_rx_thread_placement() is actually quite expensive. It should
be called only once per queue for the interface. There is no need to
scrap the existing cpu/queue mappings and create new cpu/queue mappings
when the additional queues becomes active/enable.
* Change to create the cpu/queue mappings for the first RX when the
interface is created. Dont remove the cpu/queue mapping when the
interface is disconnected. Remove the cpu/queue mapping only when the
interface is deleted.
The create vhost user interface CLI also has some very expensive system
calls if the command is entered with the optional keyword "server"
As a bonus, This patch makes the create vhost user interface binary-api and
CLI thread safe. Do the protection for the small amount of code which is
thread unsafe.
Change-Id: I4a19cbf7e9cc37ea01286169882e5603e6d7eb77
Signed-off-by: Steven Luong <sluong@cisco.com>
|
|
Change-Id: I25b2a28513821bc5eab9ac6890a3964d412b0399
Signed-off-by: Damjan Marion <damarion@cisco.com>
(cherry picked from commit e40231b1ecf4b49faaa9ce7b615a7d867104825b)
|
|
Change-Id: I0af68f6b41d0024aa64b93a8b18e2d179bf939b0
Signed-off-by: Jerome Tollet <jtollet@cisco.com>
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Fix inconsistencies between admin and link interface states
Admin down should imply link down:
link_up = admin_up && link_ready
Change-Id: I4d668d82d035b5d2ae508727f34f1722a0c3e677
Signed-off-by: Juraj Sloboda <jsloboda@cisco.com>
|
|
Change-Id: I0caa5fd584e3785f237d08f3d3be23e9bfee7605
Signed-off-by: Juraj Sloboda <jsloboda@cisco.com>
|
|
DBGvpp# show vhost-user
Virtio vhost-user interfaces
Global:
coalesce frames 32 time 1e-3
number of rx virtqueues in interrupt mode: 0
Interface: VirtualEthernet0/0/0�?x�D (ifindex 3)
The fix is to use format_vnet_hw_if_index_name rather than hi->name. The former
format the name with %v rather than %s
Change-Id: If4d275e1eb249cf87b2d6b796b42f24769f9e3e3
Signed-off-by: Steven <sluong@cisco.com>
|
|
Two flags to disable mergable rx buffers and indirect
descriptors are added to api.
Change-Id: Iba0ee9c48d19dfc3d3420a3fdaf44a1a1d325e99
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
When VM is having mixed type of vhost-user and SRIOV ports, QEMU (RedHat
v2.10) will not send disconnect signal to VPP, and just gives the new
memory region directly. VPP is not able to handle new memory region
mapping without disconnect signal first, which will result in a SEGV.
The fix will handle the VM reboot scenario without explict disconnect
signal from QEMU.
The fix is to invalidate the avail, desc, and used pointers in the txvq
when the new memory regions are received. This is because these pointers
are not valid anymore with the new memory regions. In the input node, check
to make sure the avail pointer is valid and punt if not.
Change-Id: Ieb8b427b202f4442a58907dab1661d63a03650de
Signed-off-by: Yichen Wang <yicwang@cisco.com>
|
|
This significantly reduces need for
...
in multiarch code. Simply constructor macros will jost create static unused
entry if CLIB_MARCH_VARIANT is defined and that will be optimized out by
compiler.
Change-Id: I17d1c4ac0c903adcfadaa4a07de1b854c7ab14ac
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I39f87ca161c891fb22462a23188982fef7c3243f
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
It is cheaper to get thread index from vlib_main_t if available...
Change-Id: I4582e160d06d9d7fccdc54271912f0635da79b50
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
It also refactors the vhost code which was in one big file vhost-user.c.
Receive side code is in vhost_user_input.c and
Transmit side code is in vhost_user_output.c
Change-Id: I1b539b5008685889723e228265786a2a3e9f3a78
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
gcc8 introduced a new warning (Wstringop-truncation) which in our case
is being treated as error.
Disabling the warning globally might introduce bugs related to string
truncation which are not desired by the developer (e.g. bug).
Instead, this patch disables the warning only for those occurences
which have been verified to be non-bugs but the desired behaviour as per
developer will.
Change-Id: I0f04ff6b4fad44061e80a65af633fd7e0148a0c5
Signed-off-by: Marco Varlese <marco.varlese@suse.com>
|
|
This patch separates setting of hardware interfaec and software
interface MTU. Software MTU is L2 payload MTU (i.e. not including L2
header). Per-protocol MTU for IPv4, IPv6 and MPLS can also be set.
Currently only IP4, IP6 are enabled in adjacency / rewrite code.
Documentation in src/vnet/MTU.md
Change-Id: Iee2fd6f0bbc8210748dd8e073ab9fab87d323690
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Change-Id: I84327197d59c72d0d046dd2cb4071bf74af6fc28
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
It is much cheaper to use ctzll than to do shift,subtract and mask
in likely case when we are looking for 1st set bit in the uword.
Change-Id: I31954081571978878c7098bafad0c85a91755fa2
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
show vhost-user <interface> may cause a crash if interface is semi-bogus.
Semi-bogus means it is a known vpp interface which has a hw_if_index, but
it is bogus because it is not a vhost-user interface.
The fix is to add a check to reject non vhost-user interface for the
command.
Change-Id: I63f1e8bfbf46f5ec4c30f9fb3546982b63cd7cc5
Signed-off-by: Steven <sluong@cisco.com>
|
|
interface)"
This reverts commit 70083ee74c3141bbefb185525315f1b34497dcaa.
Reverting as this patch is causing following crash:
0: /home/damarion/cisco/vpp3/build-data/../src/vnet/devices/devices.h:131 (vnet_get_device_input_thread_index) assertion `queue_id < vec_len (hw->input_node_thread_index_by_queue)' fails
Aborted
Change-Id: Ie2a365032110b1f67be7a9d832885b9899813d39
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I98bd454a761a1032738a21edeb0fe847e801f901
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
virtio_free_rx_buffers uses the wrong slot in the vring to get
the buffer index. It uses desc_next. It should be last_used_idx
which is the slot number for the first valid descriptor.
Change-Id: I6b62b794f06869fbffffce45430b8b2e37b1266c
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I373f429c53c6f66ad38322addcfaccddb7761392
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ib04a8787038fb536470a04d99fdc165102edfb5a
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
In the case that vhostuser server accepted more than one client connection,
'vui->clib_file_index' will be overwritten directly without release the possible
existed resource, so file descriptor leak occurs
Change-Id: I89d08133dae31a12a815df2631334dbf0aefeb1e
Signed-off-by: Haiyang Tan <haiyang.tan.dev@gmail.com>
|
|
(VPP-1085)
The NEON implementation searches particular address in
VHOST_MEMORY_MAX_NREGIONS regions. Searching two regions at a
time.
Change-Id: Icc3c6746bc98e3a1fa71424e51b64f62efbfdc74
Signed-off-by: Nitin Saxena <nitin.saxena@cavium.com>
|
|
This patch teaches worer threads to sleep and to be waken up by
kernel if there is activity on file desctiptors assigned to that thread.
It also adds counters to epoll file descriptors and new
debug cli 'show unix file'.
Change-Id: Iaf67869f4aa88ff5b0a08982e1c08474013107c4
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
1. Replace the magic number '-1' with MAP_FAILED
2. On x86 platform, QEMU uses vhostuser required the memory back-end is file based,
the file could be tmpfs(4K page size) or hugetlbfs(2M or 1G page size)
Change-Id: If1818cb6833728d641f68e4d4a3bc645e70f2ee6
Signed-off-by: Haiyang Tan <haiyang.tan.dev@gmail.com>
|
|
Add an SELinux profile such that VPP can run under SELinux on RPM based
platforms. The SELinux Policy is currently only implemented for RPM
packages, specifically, Fedora, CentOS and RHEL. Doxygen User
Documentation has been included (selinux_doc.md). Once some discussion
on file locations has completed (see vpp-devlist), updates to the Debug
CLI documentation will also need to be updated.
Additional changes:
Patch Set 2:
- Rework selinux_doc.md such that each line is only 80 characters
instead of each sentence on a line. Made additonal minor chnages
to the text.
- Update vHost Debug CLI documentation to reflex new socket location.
Cleaned up some text from when I originally wrote it, to better
reflex proper use.
- Update exec Debug CLI documentation to be more inline with suggested
helptext, added text regarding recommended script file location.
- For Debian builds, create the /var/log/vpp/ directory. I don't use
Debian very much, so please pay extra attention to
build-data/platforms.mk and build-root/deb/debian/.gitignore.
- Per discussion on VPP call, changed the default log location to
/var/log/vpp/vpp.log.
- Changed the socket location for vHost in AutoConfig to
/var/run/vpp/.
Patch Set 3:
- Update selinux_doc.md based on comments.
Change-Id: I400520dc33f1ca51012d09ef8fe5a7b7b96c631e
Signed-off-by: Billy McFall <bmcfall@redhat.com>
|
|
This is a version of the VPP API generator in Python PLY. It supports
the existing language, and has a plugin architecture for generators.
Currently C and JSON are supported.
Changes:
- vl_api_version to option version = "major.minor.patch"
- enum support
- Added error checking and reporting
- import support (removed the C pre-processor)
- services (tying request/reply together)
Version:
option version = "1.0.0";
Enum:
enum colours {
RED,
BLUE = 50,
};
define foo {
vl_api_colours_t colours;
};
Services:
service {
rpc foo returns foo_reply;
rpc foo_dump returns stream foo_details;
rpc want_stats returns want_stats_reply
events ip4_counters, ip6_counters;
};
Future planned features:
- unions
- bool, text
- array support (including length)
- proto3 output plugin
- Refactor C/C++ generator as a plugin
- Refactor Java generator as a plugin
Change-Id: Ifa289966c790e1b1a8e2938a91e69331e3a58bdf
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
address area
This patch fixed the VMA leak that if mapping one of guest physical address area get failed.
Change-Id: I07b0b9a932209561d6ff2b2dd08a111ea5db2209
Signed-off-by: Haiyang Tan <haiyang.tan.dev@gmail.com>
|
|
Change-Id: I4e2804754b443f5f41fb25eed8334908c4a70f84
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Buffers may be allocated for indirect descriptors by tx thread and
they are freed when tx thread is invoked in the next invocation.
This is to allow the recipient (kernel) to have a chance to process
them. But if the tap interface is deleted, the tx thread may not yet
be called to clean up the indirect descriptors' buffers. In that case,
we need to remove them without waiting for the tx thread to be called.
Failure to do so may cause buffers leak when the tap interface is deleted.
For the RX ring, leakage also exists for vring->buffers when the interface
is removed.
Change-Id: I3df313a0e60334776b19daf51a9f5bf20dfdc489
Signed-off-by: Steven <sluong@cisco.com>
(cherry picked from commit d8a998e74b815dd3725dfcd80080e4e540940236)
|
|
This does not update api client code. In other words, if the client
assumes the transport is shmem based, this patch does not change that.
Furthermore, code that checks queue size, for tail dropping, is not
updated.
Done for the following apis:
Plugins
- acl
- gtpu
- memif
- nat
- pppoe
VNET
- bfd
- bier
- tapv2
- vhost user
- dhcp
- flow
- geneve
- ip
- punt
- ipsec/ipsec-gre
- l2
- l2tp
- lisp-cp/one-cp
- lisp-gpe
- map
- mpls
- policer
- session
- span
- udp
- tap
- vxlan/vxlan-gpe
- interface
VPP
- api/api.c
OAM
- oam_api.c
Stats
- stats.c
Change-Id: I0e33ecefb2bdab0295698c0add948068a5a83345
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- separate client/server code for both memory and socket apis
- separate memory api code from generic vlib api code
- move unix_shared_memory_fifo to svm and rename to svm_fifo_t
- overall declutter
Change-Id: I90cdd98ff74d0787d58825b914b0f1eafcfa4dc2
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I097a738b96a304621520f1842dcac7dbf61a8e3f
Signed-off-by: Milan Lenco <milan.lenco@pantheon.tech>
|
|
- change interface naming scheme
- rework netlink code
- add option to set link address, namespace
Change-Id: Icf667babb3077a07617b0b87c45c957e345cb4d1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
virtio backend stays in vnet/devices/virtio
Change-Id: Idbf04f1c645a809ed408670ba330662859fe9309
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
fd is not close when IOCTL encounters an error which causes resource
leak. The fix is to initialize fd to -1. At return, close fd if
it has a valid value.
Change-Id: I53c4f5c71ca0f556fb6586f5849e7cb622632d8f
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I877cf1abb062a90f428c3ec0cab5c6e9dad0ca82
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- add support for assigning tap interface to the bridge
- add support for assigning tap interface host side ip4 and ip6 address
- host namespace can be specified as PID (pid:12345) or full path to file
- automatically bring linux interface up
Change-Id: I1cf7c3cad9a740e430cc1b9c2bb0aad0ba4cc8d8
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Fix 3 coverity warnings
1. api_format.c: init net_ns = 0 and remove its corresponding vec_add and
vec_free
2. netlink.c (reported in tap.c before the code was removed): resource leaked
due to fd is not close
3. tap.c: subtract 1 for size when calling strncpy to accommodate the
terminated NULL character
Change-Id: Iff4e66604862f0c06dac227b8cfd48d3979e41a5
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: Ib091875f77ea99421aec0947fd17833c4e6d2ec2
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ided667356d5c6fb9648eb34685aabd6b16a598b7
Signed-off-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Steven Luong <sluong@cisco.com>
|
|
With heavy traffic, tx code path may crash due to memory corruption
Thread 5 "vpp_wk_2" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff3995c700 (LWP 2505)]
0x00007ffff73675e8 in vhost_user_if_input (vm=0x7fffb5f5bf9c,
vum=0x7ffff7882a40 <vhost_user_main>, vui=0x7fffb65570c4, qid=0,
node=0x7fffb6577dac, mode=VNET_HW_INTERFACE_RX_MODE_POLLING)
at /home/sluong/vpp-master/vpp/build-data/../src/vnet/devices/virtio/vhost-user.c:1610
1610 bi_current = (vum->cpus[thread_index].rx_buffers)
[vum->cpus[thread_index].rx_buffers_len];
(gdb) p vum->cpus[thread_index].rx_buffers_len
$2 = 793212607
(gdb)
Apparently, some code accidentally wrote the bad value in rx_buffers_len.
rx_buffers_len should never be greater than 1024 since that is how many buffers
we request each time.
After debugging many hours, I discovered that the memory corruption happens
in the tx code path right here on line 2176.
{
vhost_copy_t *cpy = &vum->cpus[thread_index].copy[copy_len];
copy_len++;
cpy->len = bytes_left;
cpy->len = (cpy->len > buffer_len) ? buffer_len : cpy->len;
cpy->dst = buffer_map_addr;
cpy->src = (uword) vlib_buffer_get_current (current_b0) +
current_b0->current_length - bytes_left;
(gdb) p cpy
$3 = (vhost_copy_t *) 0x7fffb554077c
(gdb) p copy_len
$4 = 1025
(gdb) p &vum->cpus[3].rx_buffers_len
$8 = (u32 *) 0x7fffb5540784
copy_len is picking up the index entry 1024 before it was incremented. copy array has only
1024 members (0 - 1023 are valid).
The assignment here in cpy surely causes memory corruption. It is only discovered later
when the memory location that it corrupted is used.
The condition for the crash is to transmit jumbo frames under heavy volume. Since ring
size is 1024, with one packet taking up one index for frame size (less 2048), it does
not cause overflow. With jumbo frames, it requires multiple indices for one packet,
it can cause the overflow under heavy traffic.
The fix is to do copy out when we have 1000 entries in the array to avoid
overflow.
Change-Id: Iefbc739b8e80470f1cf13123113f8331ffcd0eb2
Signed-off-by: Steven <sluong@cisco.com>
|
|
Add one of these statements to foo.api:
vl_api_version 1.2.3
to generate a version tuple stanza in foo.api.h:
/****** Version tuple *****/
vl_api_version_tuple(foo, 1, 2, 3)
Change-Id: Ic514439e4677999daa8463a94f948f76b132ff15
Signed-off-by: Dave Barach <dave@barachs.net>
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
A bug was reported where a jumbo packet would stay in vhost
queue forever or until a large enough number of other packets
arrived in the queue too.
This is due to a bug in vhost input node buffer allocation.
The fix is to make sure that vhost always allocates at least
enough buffers for one single big packet. '40' is used to
account for 65kB frames.
Change-Id: I1d293028854165083e30cd798fab9d4140230b78
Signed-off-by: Pierre Pfister <ppfister@cisco.com>
(cherry picked from commit 67700d41169ac37d21c400949a316750eabad969)
|
|
- always use 'va_args' as pointer in all format_* functions
- u32 for all 'indent' params as it's declaration was inconsistent
Change-Id: Ic5799309a6b104c9b50fec309cba789c8da99e79
Signed-off-by: Christophe Fontaine <christophe.fontaine@enea.com>
|
|
When changing the admin state of a vhost-user interface, do not put it
in link-up mode if the interface is not actually ready.
Change-Id: Idbc631a7126efa79d199909f9e7656d21bd412ca
Signed-off-by: Yoann Desmouceaux <ydesmouc@cisco.com>
|
|
This will allow us to use this code in client libraries without vlib.
Change-Id: I8557b752496841ba588aa36b6082cbe2cd1867fe
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
According to the spec, supporting interrupt mode from the driver is optional,
not a must. When interrupt mode is configured on the interface, we should
check to make sure that the driver didn't opt out for the kickfd support and
reject the configuration if it did.
Change-Id: I7d3dbaddde65458e1a6a802754a3768ae8685a0e
Signed-off-by: Steven <sluong@cisco.com>
|
|
In the data path, we grab qsz from vhost_user_vring_t to compute
qsz_mask and store it in a stack variable to use on many occasions.
We never use qsz for any meaningful purpose. It is more useful to
cache qsz_mask in vhost_user_vring_t to avoid the needless computation
in the data path.
Change-Id: Idf4d94a9754d5c75c899f1f4f59602275b9904a6
Signed-off-by: Steven <sluong@cisco.com>
|