|
The main thread squirrels away vlib_time_now (&vlib_global_main);
worker threads use it to calculate an offset in f64 seconds from their
own vlib_time_now(vm) value. We use that offset until the next barrier
sync.
Thanks to Damjan for the suggestion.
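A minimal sketch of the offset scheme described above. All names here (worker_clock_t, worker_clock_sync, worker_clock_now) are hypothetical and are not taken from the patch; the sketch only assumes that the worker can read both the value stored by the main thread and its own clock at barrier sync.

```c
/* Hypothetical names throughout; this is not the VPP implementation. */
typedef struct
{
  double time_offset;		/* f64 seconds to add to the worker's clock */
} worker_clock_t;

/* Called by a worker at barrier sync. main_time_at_barrier is the value
   the main thread stored; worker_now is the worker's own clock reading. */
void
worker_clock_sync (worker_clock_t * wc, double main_time_at_barrier,
		   double worker_now)
{
  wc->time_offset = main_time_at_barrier - worker_now;
}

/* Between barriers: translate a local reading into main-thread time
   using the offset computed at the most recent barrier sync. */
double
worker_clock_now (const worker_clock_t * wc, double worker_now)
{
  return worker_now + wc->time_offset;
}
```

Any drift between the two clocks accumulates only until the next barrier sync, when the offset is recomputed.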
Change-Id: If56cdfe68e5ad8ac3b0d0fc885dc3ba556cd1215
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I195c8eabc0ee67880f1e85fc7594b00be6b563e3
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
This patch introduces the following changes:
- deprecates free lists, which are not used and not compatible
with external buffer managers (e.g. DPDK)
- introduces native support for per-numa buffer pools
- significantly improves performance of buffer alloc and free
Change-Id: I4a8e723ae47056717afd6cac0efe87cb731b5be7
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I79b213b34c6071d14acf1922f89037a4a5a36c45
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
As a FUD reduction measure, this patch implements 2-way parallel
counter collection. Synthetic stat component counter pairs run at the
same time. Running two counters (of any kind) at the same time
naturally reduces the aggregate time required by an approximate
factor-of-2, depending on whether an even or odd number of stats have
been requested.
I don't completely buy the argument that computing synthetic stats
such as instructions-per-clock will be inaccurate if component counter
values are collected sequentially. Given a uniform traffic pattern, it
must make no difference.
As the collection interval increases, the difference between serial
and parallel component counter collection will approach zero; see also
the Central Limit Theorem.
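To make the even/odd remark concrete, here is a small sketch that counts collection passes. The function name and the "width" parameter are illustrative assumptions, not part of the patch.

```c
#include <stdint.h>

/* Number of collection passes needed for n_counters when `width`
   counters are sampled per pass (width = 1 serial, width = 2 pairwise).
   Hypothetical helper, for illustration only. */
uint32_t
collection_passes (uint32_t n_counters, uint32_t width)
{
  if (width == 0)
    return 0;
  /* Round up: a final partial pass still costs one full interval. */
  return n_counters / width + (n_counters % width != 0);
}
```

With 4 counters, pairwise collection needs 2 passes instead of 4, an exact factor of 2. With 5 counters it needs 3 passes instead of 5, so the gain is somewhat less than 2.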
Change-Id: I36ebdcf125e8882cca8a1929ec58f17fba1ad8f1
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I3bb1d9f83dd08f4b93acd4a281bfec0674e39c2e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: If88ccd965122b9318a39a8d71b53334cd1fd81e4
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
VPP graph dispatch trace record description:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Major Version | Minor Version | NStrings      | ProtoHint     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer index (big endian)                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VPP graph node name ...     ...               | NULL octet    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Metadata ...     ...                   | NULL octet    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Opaque ...     ...                     | NULL octet    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Opaque 2 ...     ...                   | NULL octet    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VPP ASCII packet trace (if NStrings > 4)      | NULL octet    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Packet data (up to 16K)                                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Graph dispatch records comprise a version stamp, an indication of how
many NULL-terminated strings will follow the record header, and a
protocol hint.
The buffer index allows downstream consumers of these data to easily
filter/track single packets as they traverse the forwarding
graph. FWIW, the 32-bit buffer index is stored in big endian format.
As of this writing, major version = 1, minor version = 0. NStrings
will be either 4 or 5.
Here is the current set of protocol hints:
typedef enum
{
  VLIB_NODE_PROTO_HINT_NONE = 0,
  VLIB_NODE_PROTO_HINT_ETHERNET,
  VLIB_NODE_PROTO_HINT_IP4,
  VLIB_NODE_PROTO_HINT_IP6,
  VLIB_NODE_PROTO_HINT_TCP,
  VLIB_NODE_PROTO_HINT_UDP,
  VLIB_NODE_N_PROTO_HINTS,
} vlib_node_proto_hint_t;
Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet
data SHOULD be 0x60, and should begin an IPv6 packet header.
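For downstream consumers, decoding the fixed 8-octet header might look like the sketch below. The struct and function names are hypothetical; only the field order and the big endian buffer index come from the record description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of the record header described above. */
typedef struct
{
  uint8_t major_version;
  uint8_t minor_version;
  uint8_t n_strings;		/* NULL-terminated strings that follow */
  uint8_t proto_hint;		/* a vlib_node_proto_hint_t value */
  uint32_t buffer_index;	/* converted to host byte order */
} dispatch_record_header_t;

#define DISPATCH_RECORD_HEADER_BYTES 8

/* Returns 0 on success, -1 if fewer than 8 octets are available. */
int
dispatch_record_header_parse (const uint8_t * p, size_t len,
			      dispatch_record_header_t * h)
{
  if (len < DISPATCH_RECORD_HEADER_BYTES)
    return -1;
  h->major_version = p[0];
  h->minor_version = p[1];
  h->n_strings = p[2];
  h->proto_hint = p[3];
  /* The buffer index is stored big endian: most significant octet first. */
  h->buffer_index = ((uint32_t) p[4] << 24) | ((uint32_t) p[5] << 16)
    | ((uint32_t) p[6] << 8) | (uint32_t) p[7];
  return 0;
}
```

After the header, a consumer would skip n_strings NULL-terminated strings to reach the packet data.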
Change-Id: Idf310bad80cc0e4207394c80f18db5f77c378741
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I56f25d653b71a25c70e6c5c1a93dd9c5158f2079
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
To facilitate dispatch trajectory tracing, vlib_buffer_t decoding, etc.
through Wireshark
Change-Id: I31356b9fa1f40cba8830aaf10a86a9fbb7546438
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I2476e3e916a42b41d1e66bfc1ec4f8c4264c1720
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
This reverts commit 71615399e194847d7833b744caedab9b841733e5.
There seems to be an issue with ARPs when running with multiple workers.
Change-Id: Iaa68081512362945a9caf24dcb8d70fc7c5b75df
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ib5c346641463768cf33eaf8cb5fab5b63171398d
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Ic4c46bc733afae8bf0d8146623ed15633928de30
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ie5a00c15ee9536cc61afab57f6cadc1aa1972f3c
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Add an "elog trace [api][cli][barrier]" debug CLI command. Remove the
barrier elog test command. Remove unused reliable multicast code.
Change-Id: Ib3ecde901b7c49fe92b313d0087cd7e776adcdce
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I1f54b994425c58776e1445c8d9fe142e7a644d3d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Configure w/ --enable-dlmalloc, see .../build-data/platforms/vpp.mk
src/vppinfra/dlmalloc.[ch] are slightly modified versions of the
well-known Doug Lea malloc. Main advantage: dlmalloc mspaces have no
inherent size limit.
Change-Id: I19b3f43f3c65bcfb82c1a265a97922d01912446e
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
This is a ~50% improvement in buffer alloc performance.
For a 256 buffer allocation, it was ~10 clocks/buffer; it is now < 5 clocks.
Change-Id: I97590e240a79a42bcab5eb26587fc2d11e6eb163
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This addresses a crash with gcc-7 observed when -O3 is used.
Change-Id: I10e87da8e5037ad480eba7fb0aaa9a657d3bf48d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- buffer_main is no longer part of vlib_main_t
- pool of free lists is still part of vlib_main_t
- mheap is not used anymore for buffer allocation
- simple bitmap based buffer alloc scheme is introduced
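As a rough illustration of what a bitmap based alloc scheme can look like (this is a generic sketch, not the code from this patch; the pool size and all names are assumptions):

```c
#include <stdint.h>

#define N_BUFFERS 256		/* assumed pool size, for illustration only */

typedef struct
{
  uint64_t free_bitmap[N_BUFFERS / 64];	/* bit set => buffer is free */
} buffer_bitmap_t;

/* Mark every buffer in the pool as free. */
void
buffer_bitmap_init (buffer_bitmap_t * bm)
{
  for (int i = 0; i < N_BUFFERS / 64; i++)
    bm->free_bitmap[i] = ~0ULL;
}

/* Returns the lowest free buffer index and marks it used, or -1 if none. */
int
buffer_bitmap_alloc (buffer_bitmap_t * bm)
{
  for (int i = 0; i < N_BUFFERS / 64; i++)
    {
      uint64_t w = bm->free_bitmap[i];
      int bit = 0;
      if (w == 0)
	continue;		/* all 64 buffers in this word are in use */
      while (((w >> bit) & 1) == 0)
	bit++;
      bm->free_bitmap[i] = w & ~(1ULL << bit);
      return i * 64 + bit;
    }
  return -1;
}

/* Return a buffer to the pool by setting its bit again. */
void
buffer_bitmap_free (buffer_bitmap_t * bm, int index)
{
  bm->free_bitmap[index / 64] |= 1ULL << (index % 64);
}
```

A real implementation would likely use a count-trailing-zeros instruction instead of the bit loop; the loop is kept here for portability.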
Change-Id: I3e1e6d00e2c8122293ed0a741245eb841315a1ff
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
We don't need a per-vlib_main physmem_main, so keep it separate instead
of trying to keep them in sync.
Change-Id: I0fbeecf4d9672d31af7a43c640a7d8f05dd6e46f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Queue RPC calls and send them from the main dispatch loop. As things stood,
if the vpp main input queue filled, worker threads could enter a
barrier-sync spin-wait in the middle of processing a frame. If thread
0 decided to recreate worker thread data structures, the worker thread(s)
could easily crash.
Legislate the problem out of existence by enqueueing RPC messages only
from the main dispatch loop. At that point, doing a barrier-sync wait
is perfectly OK.
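The idea can be sketched as a per-thread pending list that is only drained from the dispatch loop. Everything below is hypothetical (names, fixed capacity, callback type); the callback stands in for whatever actually places the message on the vpp main input queue.

```c
#include <stddef.h>

#define RPC_QUEUE_SIZE 64	/* assumed capacity, for illustration only */

typedef void (*rpc_fn_t) (void *arg);

/* Per-thread list of RPCs recorded while processing frames. */
typedef struct
{
  rpc_fn_t fn[RPC_QUEUE_SIZE];
  void *arg[RPC_QUEUE_SIZE];
  size_t n_pending;
} pending_rpc_queue_t;

/* Called mid-frame: only records the request, never blocks or spins.
   Returns 0 on success, -1 if the pending list is full. */
int
rpc_defer (pending_rpc_queue_t * q, rpc_fn_t fn, void *arg)
{
  if (q->n_pending == RPC_QUEUE_SIZE)
    return -1;
  q->fn[q->n_pending] = fn;
  q->arg[q->n_pending] = arg;
  q->n_pending++;
  return 0;
}

/* Called from the main dispatch loop, between frames, where waiting on a
   barrier sync is safe. Returns the number of RPCs sent. */
size_t
rpc_flush (pending_rpc_queue_t * q)
{
  size_t n = q->n_pending;
  for (size_t i = 0; i < n; i++)
    q->fn[i] (q->arg[i]);
  q->n_pending = 0;
  return n;
}

/* Stand-in "send" callback that just counts calls. */
void
rpc_count_send (void *arg)
{
  (*(int *) arg)++;
}
```

Because each thread owns its own pending list in this sketch, no locking is needed between rpc_defer and rpc_flush.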
Change-Id: I18da3e44bb1f29a63fe5f30cf11de732ecfd5bf7
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
It's way too easy to imagine leaving a mutex or a spin-lock held in
the /vpe-api shared-memory segment, or elsewhere. Set a volatile
variable and check it in a safe place...
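A minimal sketch of the flag-and-check pattern, assuming the asynchronous event is a signal such as SIGTERM (the message above does not say so explicitly). The function and variable names are hypothetical.

```c
#include <signal.h>

/* Set asynchronously by the handler; read from the main loop. */
static volatile sig_atomic_t exit_requested;

/* Handler: records the request and returns. It takes no locks and does
   no cleanup, so it cannot leave a mutex or spin-lock held. */
void
on_terminate (int signum)
{
  (void) signum;
  exit_requested = 1;
}

/* Called from the main loop at a point where no locks are held.
   Returns 1 if the caller should begin an orderly shutdown. */
int
check_exit_requested (void)
{
  return exit_requested != 0;
}
```

The handler would be installed with something like signal (SIGTERM, on_terminate), and the main loop would call check_exit_requested () once per iteration.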
Change-Id: I9d91c38cffeb921143c272162d055c9c24a6c312
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Support logging to both syslog and elog
Also include DaveB's is_mp_safe fix, which had been lost
Change-Id: If82f7969e2f43c63c3fed5b1a0c7434c90c1f380
Signed-off-by: Colin Tregenza Dancer <ctd@metaswitch.com>
|
|
This patch adds support for multiple numa-aware physmem regions.
Change-Id: I5c69a6f4da33c8ee21bdb8604d52fd2886f2327e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change the rebuilding of worker thread clone data structures
to run in parallel on the workers, instead of serially
on main.
Change-Id: Ib76bcfbef1e51f2399972090f4057be7aaa84e08
Signed-off-by: Colin Tregenza Dancer <ctd@metaswitch.com>
|
|
Off by default. Enable via cmdline "... vlib { elog-post-mortem-dump }
..."
Change-Id: I2056b9de9b37475f2bfeeb5404da838f1b42645a
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I82c663bc0866c6c68ba354104b0bb059387f4b9d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This patch deprecates stack-based thread identification.
It also removes the requirement that thread stacks be adjacent.
Finally, possibly annoying for some folks, it renames
all occurrences of cpu_index and cpu_number to thread
index. Using the word "cpu" is misleading here, as a thread can
be migrated to a different CPU, and it is also not related
to the Linux CPU index.
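The message does not say which mechanism replaces stack-based identification. One common alternative, shown here purely as an assumption, is to keep the index in thread-local storage; the names below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-thread index, assigned once when the thread starts.
   Uses C11 thread-local storage; each thread sees its own copy. */
static _Thread_local uint32_t my_thread_index;

void
set_thread_index (uint32_t index)
{
  my_thread_index = index;
}

/* No dependence on where the thread's stack lives, so stacks do not
   need to be adjacent or laid out in any particular order. */
uint32_t
get_thread_index (void)
{
  return my_thread_index;
}
```

A thread would call set_thread_index () once at startup; the value is a logical thread index and says nothing about which CPU the thread is currently running on.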
Change-Id: I68cdaf661e701d2336fc953dcb9978d10a70f7c1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I6ff7b65a400734a47bc0a7d03faf86ef1cf4f8c8
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I4aa3e7e42fb81211de1aed07dc7befee87a1e18b
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Id18d59c9442602633a6310b2001a95bce8b6b232
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I7b51f88292e057c6443b12224486f2d0c9f8ae23
Signed-off-by: Damjan Marion <damarion@cisco.com>
|