Age | Commit message (Collapse) | Author | Files | Lines |
|
Free node frames in worker mains on refork. Otherwise these frames are
never returned to free pool and it causes massive memory leaks if
performed under traffic load
Type: fix
Signed-off-by: Dmitry Valter <d-valter@yandex-team.ru>
Change-Id: I15cbf024a3f4b4082445fd5e5aaa10bfcf77f363
|
|
This allows specifying both c string and vector for node name
and removes need for crafting temporary string.
Type: improvement
Change-Id: I0b016cd70aeda0f68eb6f9171c5152f303be7369
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: Iaa1e43c87c5725ab33ea8489bff2a7bda18b9c79
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Change-Id: I53890a13210cfb0d2b2d9d8cfd9b15118d3bb273
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: I8b273bc3bf16aa360f031f1b2692f766e5fc4613
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Change-Id: I73383eb15186021cd6527d112da8443a0082f129
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Change-Id: If3da7d4338470912f37ff1794620418d928fb77f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
It allows default variant selection from startup.conf
Type: improvement
Change-Id: Idff95e12dd0c105dab7c905089548b05a6e974e0
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I03e164d8d5a329497f422e99f8b0058135241b4e
Signed-off-by: Mohammed Hawari <mohammed@hawari.fr>
Type: fix
|
|
Type: improvement
Change-Id: I4008cadfd5141f921afbdc09a3ebcd1dcf88eb29
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: I077110e1a422722e20aa546a6f3224c06ab0cde5
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: Ie67dc579e88132ddb1ee4a34cb69f96920101772
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This adds a new data model for counters.
Specifying the errors severity and unit.
A later patch will update vpp_get_stats to take advantage of this.
Only the map plugin is updates as an example.
New .api language:
A new "counters" keyword to define counter sets.
counters map {
none {
severity info;
type counter64;
units "packets";
description "valid MAP packets";
};
bad_protocol {
severity error;
type counter64;
units "packets";
description "bad protocol";
};
};
Each counter has 4 keywords. severity, which is one of error, info or warn.
A type, which is one of counter64 or gauge64.
units, which is a text field using units from YANG.
paths {
"/err/ip4-map" "map";
"/err/ip6-map" "map";
"/err/ip4-t-map" "map";
"/err/ip6-t-map" "map";
};
A new paths keyword that maps the counter-set to a path in the stats segment KV store.
Updated VPP CLI to include severity so user can see error counter severity.
DBGvpp# show errors
Count Node Reason Severity
13 ethernet-input no error error
Type: feature
Signed-off-by: Ole Troan <ot@cisco.com>
Change-Id: Ib2177543f49d4c3aef4d7fa72476cff2068f7771
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Heap may use different page sizes so we will not be able to create
stack protection page.
Type: improvement
Change-Id: Ibb35c9f0a151c464ee0167d17f2bd773ef6f530b
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: fix
Change-Id: I1b4f52e186165b04db5bd5f11058dc77b647bc94
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: fix
Calling vlib_get_node_by_name via the VPE api
doesn't work due to hash weirdness. Haven't
gotten around the real cause of this. But this
fixes it.
Change-Id: I89f95dba2bcd9573b8f1f435e063e9dd57f9ca93
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
|
|
When recycling a graph node vnet_register_interface, it is missing an
explicit call to vlib_worker_thread_node_runtime_update(). However,
there is an implicit call to vlib_worker_thread_node_runtime_update()
via vnet_sw_interface_set_flags_helper() if it enables a new feature on
the interface for the first time. But that implicit call is not
guaranteed. For example, if an interface is created, deleted, and
created, then it may skip the implicit call to
vlib_worker_thread_node_runtime_update(). When that happens, the graph
nodes on thread 0 are not sync'ed to the worker threads. So the worker
thread's graph nodes are out of sync momentarily with the main thread's
graph nodes until some other event happens which calls for a sync is
needed. During this window, the worker thread's graph node is
vulnerable and may experience a crash.
When deleting a graph node, we never trigger a sync to the worker
thread. A patch was committed 3 years ago via
https://gerrit.fd.io/r/c/vpp/+/7523 to fix a show run crash. In
hindsight, the approach taken by 7523 is not orthogonal. While at it,
let's fix it right for both issues with a call to
vlib_worker_thread_node_runtime_update() in the appropriate place and
remove 7523.
Type: fix
Ticket: VPPSUPP-86
Fixes: gerrit 7523 / 19e9d954bd9eb4f04d48640d6540198e84ef65d7
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: Ic9472bd2d3a212dbfeceb526506ed0400983a142
|
|
Fix a nit warning: we're not likely to create a vlib process with more
than 4gb of stack.
Type: fix
Ticket: VPP-1888
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I8bc7f64287c2802b0c286ce3d04443ac723a9a33
|
|
Instead of allocating stack from the main heap, this patch mmaps stack
memory together with guard page.
This aproach reduces main heap usage, and stack memory is prefaulted
on demand, so bigger process stacks will have zero impact on memory
usage as long as stack memory is not needed for real.
In addition, it fixes issue with systems which have bigger default page
size (observed with 65536).
Type: improvement
Change-Id: I593365c603d4702e428967d80fd425fdee2c4a21
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- vlib_node_add_next_with_slot was not cleaning the old next node
references to the given slot when replacing it with new next node. This mostly
worked until one tried to set the slot to a previously (but not currently) used
next node for that slot.
Type: fix
Signed-off-by: Christian Hopps <chopps@labn.net>
Change-Id: I7ee607625da874e320158b80f12ddc16e377f8e9
|
|
The old code modified the node next array prior to obtaining the thread
barrier. Then it updated the runtime node data, and upon barrier release
caused reforking of each worker thread. The reforking clones the main
thread nodes and reconstructs the runtime node structure. This cloning
is not 100% "deep" in the sense that the node next array is
shared (i.e., only the pointer is copied). So prior to the barrier being
obtained the node's next array is being changed while workers are
actively using it (bad). Treating the node next array as read-only in
the workers and sharing it is a decent optimization so instead of trying
to fix that just move the barrier a little earlier in the process to
protect the node next array as well.
This was tripping an assert in next frame ownership change by way of the
ip4-arp node. The assert verifies that the node's next array length is
equal to the runtime next node count. The race above was lost and the
node next array data was updated in the main thread while the arp code
was still executing in a worker.
This was being hit when many arp requests were being sent from both ends
of a tunnel during which the add next node function was called, which
often led to an assert b/c the next node array was out of sync with the
runtime next node count.
- PS#2 update - move barrier sync to just above code that modifies state.
Ticket: VPP-1783
Type: fix
Signed-off-by: Christian E. Hopps <chopps@chopps.org>
Change-Id: I868784e28f994ee0922aaaae11c4894a3f4f1fe7
Signed-off-by: Christian E. Hopps <chopps@chopps.org>
|
|
Encoding the vpp node index into the vlib_error_t as a 10-bit quantity
limits us to 1K graph nodes. Unfortunately, a few nodes need 6 bit
per-node error codes. Only a very few nodes have so many counters.
It turns out that there are about 2K total error counters in the system,
which is (approximately) the maximum error heap index.
The current (index,code) encoding limits the number of interfaces to
around 250, since each interface has two associated graph nodes and we
have about 500 "normal, interior" graph node
This patch adds an error-index to node-index map, so we can store
error heap indices directly in the vlib_buffer_t.
Type: refactor
Change-Id: I28101cad3d8750819e27b8785fc0cf71ff54f79a
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Maintain the MAINTAINERS file. Removed src/plugins/*.am listings. Added
a couple of plugins.
Add vlib_process_create (vlib_main_t *vm, char *name,
vlib_node_function_t *f, u32 log2_n_stack_bytes);
/** @brief Create a vlib process
* @param vm &vlib_global_main
* @param f the process node function
* @param log2_n_stack_bytes size of the process stack, defaults to 16K
* @return newly-create node index
* @warning call only on the main thread. Barrier sync required.
*/
This function makes it easy to spin up periodic processes when features
are enabled for the first time. That coding pattern is highly recommended.
Update the emacs-lisp plugin generator to use vlib_process_create,
instead of generating static periodic process nodes.
Change-Id: Icda33e93b9034779d3a3e228cd1110af14b058a5
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
see register_node, node-name might be a vector
Change-Id: I883ec51c1fa9aa4da4ba6cba415a39bb6a4331e1
Signed-off-by: Kingwel Xie <kingwel.xie@ericsson.com>
|
|
It turns out that for scalar sizes 0..24, frames are always the same
size. That range includes all current use-cases - and then some - so
get rid of the hash table. Old code preserved under #ifdef
VLIB_SUPPORTS_ARBITRARY_SCALAR_SIZES.
Change-Id: Ic005c7143c9639f77d1a0fadd2fc0e90dccb68c1
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
VPP graph dispatch trace record description:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Major Version | Minor Version | NStrings | ProtoHint |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer index (big endian) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ VPP graph node name ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Metadata ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Opaque ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Opaque 2 ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VPP ASCII packet trace (if NStrings > 4) | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Packet data (up to 16K) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Graph dispatch records comprise a version stamp, an indication of how
many NULL-terminated strings will follow the record header, and a
protocol hint.
The buffer index allows downstream consumers of these data to easily
filter/track single packets as they traverse the forwarding
graph. FWIW, the 32-bit buffer index is stored in big endian format.
As of this writing, major version = 1, minor version = 0. Nstrings
will be either 4 or 5.
Here is the current set of protocol hints:
typedef enum
{
VLIB_NODE_PROTO_HINT_NONE = 0,
VLIB_NODE_PROTO_HINT_ETHERNET,
VLIB_NODE_PROTO_HINT_IP4,
VLIB_NODE_PROTO_HINT_IP6,
VLIB_NODE_PROTO_HINT_TCP,
VLIB_NODE_PROTO_HINT_UDP,
VLIB_NODE_N_PROTO_HINTS,
} vlib_node_proto_hint_t;
Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet
data SHOULD be 0x60, and should begin an ipv6 packet header.
Change-Id: Idf310bad80cc0e4207394c80f18db5f77c378741
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Typically we have scalar_size == 0, so it doesn't matter
but vlib_frame_args was providing pointer to scalar frame
data, not vector data. To avoid future confusion function
is renamed to vlib_frame_scalar_args(...)
Change-Id: I48b75523b46d487feea24f3f3cb10c528dde516f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I084d7c9e34329f10b5fe45e0b157c4defe0f2811
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Seems to have minimal-to-zero performance consequences. Data appears
accurate: result match the debug CLI output. Checked at low rates, 27
MPPS sprayed across two worker threads.
Change-Id: I09ede5150b88a91547feeee448a2854997613004
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Ibab5e27277f618ceb2d543b9d6a1a5f191e7d1db
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- separate client/server code for both memory and socket apis
- separate memory api code from generic vlib api code
- move unix_shared_memory_fifo to svm and rename to svm_fifo_t
- overall declutter
Change-Id: I90cdd98ff74d0787d58825b914b0f1eafcfa4dc2
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
In multi-threaded model (e.g. 1 main and 1 worker threads),
after an ethernet interface is deleted (e.g. vhost-user interface),
'show runtime' command produces garbled output and sometimes
leads to vpp crash.
The reason is because vlib_node_rename() frees and reallocates node's
'n->name' vector, however the change is not propagated into copies
of the node on worker threads.
Change-Id: Ibf22422913b7f2df22f70f3b2fe8dafd34c1dd06
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
(cherry picked from commit 02989064e4c26a4940a5292ba6c47023e6dd3131)
|
|
traffic (VPP-892)
When stacking DPOs the VLIB graph is also updated to add the edge between the nodes, if this edge does not yet exist. This addition should be done with the workers stopped.
Change-Id: I327e4d7d26f0b23eb280f17e4619ff2093ff7940
Signed-off-by: Neale Ranns <nranns@cisco.com>
(cherry picked from commit c02bd03ddf5eec9e9c79811360685f13e4ba8ee1)
|
|
- refactor existing congestion control code (RFC 6582/5681). Handling of ack
feedback now consists of: ack parsing, cc event detection, event handling,
congestion control update
- extend sack scoreboard to support sack based retransmissions
- basic implementation of Eifel detection algorithm (RFC 3522) for
detecting spurious retransmissions
- actually initialize the per-thread frame freelist hash tables
- increase worker stack size to 2mb
- fix session queue node out-of-buffer handling
- ensure that the local buffer cache vec_len matches reality
- avoid 2x spurious event requeues when short of buffers
- count out-of-buffer events
- make the builtin server thread-safe
- fix bihash template threading issue: need to paint -1 across uninitialized
working_copy_length vector elements (via rebase from master)
Change-Id: I646cb9f1add9a67d08f4a87badbcb117980ebfc4
Signed-off-by: Florin Coras <fcoras@cisco.com>
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
This patch deprecates stack-based thread identification,
Also removes requirement that thread stacks are adjacent.
Finally, possibly annoying for some folks, it renames
all occurences of cpu_index and cpu_number with thread
index. Using word "cpu" is misleading here as thread can
be migrated ti different CPU, and also it is not related
to linux cpu index.
Change-Id: I68cdaf661e701d2336fc953dcb9978d10a70f7c1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I4aa3e7e42fb81211de1aed07dc7befee87a1e18b
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I7b51f88292e057c6443b12224486f2d0c9f8ae23
Signed-off-by: Damjan Marion <damarion@cisco.com>
|