Age | Commit message (Collapse) | Author | Files | Lines |
|
The old code modified the node next array prior to obtaining the thread
barrier. Then it updated the runtime node data, and upon barrier release
caused reforking of each worker thread. The reforking clones the main
thread nodes and reconstructs the runtime node structure. This cloning
is not 100% "deep" in the sense that the node next array is
shared (i.e., only the pointer is copied). So prior to the barrier being
obtained the node's next array is being changed while workers are
actively using it (bad). Treating the node next array as read-only in
the workers and sharing it is a decent optimization so instead of trying
to fix that just move the barrier a little earlier in the process to
protect the node next array as well.
This was tripping an assert in next frame ownership change by way of the
ip4-arp node. The assert verifies that the node's next array length is
equal to the runtime next node count. The race above was lost and the
node next array data was updated in the main thread while the arp code
was still executing in a worker.
This was being hit when many arp requests were being sent from both ends
of a tunnel during which the add next node function was called, which
often led to an assert b/c the next node array was out of sync with the
runtime next node count.
- PS#2 update - move barrier sync to just above code that modifies state.
Ticket: VPP-1783
Type: fix
Signed-off-by: Christian E. Hopps <chopps@chopps.org>
Change-Id: I868784e28f994ee0922aaaae11c4894a3f4f1fe7
Signed-off-by: Christian E. Hopps <chopps@chopps.org>
|
|
Prints the interior node vector rate, rx / tx / drop rates
Type: feature
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I57130db0f99e852a8498aa90d01e52f7ac33dcc9
|
|
If no Linux PCI driver module is loaded, then the driver_name in the PCI
info struct is NULL. This can triggers crash when checking driver name
eg. in vlib_pci_device_open().
Default to "<NONE>" as driver name, which should never match.
Type: fix
Change-Id: I9e69889a7566467bd8220b92bbbaa72ada957257
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: refactor
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I4b77879b0a84fdec3c1518a972cf003d5135222d
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Type: fix
Change-Id: I81c4cf0ce87288bb2d3c7b9f31e9419290d588b4
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: feature
Change-Id: I913f08383ee1c24d610c3d2aac07cef402570e2c
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Use a single vnet_pcap_t in vlib_global_main, specifically to support
unified tracing
Update sphinx docs, doxygen tags
Type: refactor
Ticket: VPP-1776
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Id15d41a596712968c0714cef1bd2cd5bc9cbdd55
|
|
The 9af7e2e87e used a comparison that fd is >= 0 to check that
the pcap needs closing. While the pcap_close() function does
reset the file descriptor to -1, the freshly initialized structure
has it equal to 0.
This causes the VPP to close stdin if the packets are being seen
on pg interface without the capture file being opened.
This triggers the vpp attempting to read from STDIN
(another bug), which results in running out of memory.
Change-Id: I11d61422701500a9b3e0dd52d59383f297d57f54
Type: fix
Fixes: 9af7e2e87e
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
See .../src/vnet/classify/trace_classify.h for the business end
of the scheme.
It would be best to hash pkts, prefetch buckets, and do the primary
table lookups two at a time. The inline as given works, but perf
tuning will be required. "At least it works..."
Add "classify filter" debug cli, for example:
classify filter mask l3 ip4 src dst \
match l3 ip4 dst 192.168.2.10 src 192.168.1.10
Add "pcap rx | tx trace ... filter" to use the current classify filter chain
Patch includes sphinx documentation and doxygen tags.
Next step: device-driver integration
Type: feature
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I05b1358a769f61e6d32470e0c87058f640486b26
|
|
Some cli processes, including bringing up an i40e interface with dpdk,
consume more than the currently available stack space.
Type: fix
Fixes: VPP-1774
Signed-off-by: Aloys Augustin <aloaugus@cisco.com>
Change-Id: I86ceb9e6e07523d5e0f760b5922467f09a8d4006
|
|
Type: fix
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Change-Id: I3ae7dc3858d0353764d629d6a9eff2bdab5f8768
|
|
Separate debug CLI arg parsing from the underlying action
function. Fixes a number of subtle ordering dependencies, and will
allow us to add a binary API to control the feature at some point in
the future.
Type: refactor
Ticket: VPP-1770
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Id0dbeda06dad20e756c941c691e2088ce3c50ec7
|
|
Separate debug CLI arg parsing from the underlying action
function. Fixes a number of subtle ordering dependencies, and will
allow us to add a binary API to control the feature at some point in
the future.
Type: refactor
Ticket: VPP-1762
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I1240fe3f61a0acf5ee9faed60d6ad3386e72e569
|
|
When checksumming chained buffers with odd lengths: insert a
NULL byte, or the calculation fails.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Signed-off-by: John Lo <loj@cisco.com>
Change-Id: I380f7c42897bdb28c8c29aa1c4cdaaa849cc9ecc
|
|
The VPP code tries to set all userspace memory in the table via IOCTL
to VHOST_SET_MEM_TABLE. But on aarch64, the userspace address range is
larger (48 bits) than that on x86 (47 bits). Below is an segment from
/proc/[vpp]/maps.
fffb41200000-fffb43a00000 rw-s 00000000 00:0e 532232
/anon_hugepage (deleted)
Instead of setting all userspace memory space to vhost-net, will only set
the address space reserved by pmalloc module during initialization.
Type: fix
Change-Id: I91cb35e990869b42094cf2cd0512593733d33677
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
Reviewed-by: Steve Capper <Steve.Capper@arm.com>
|
|
Credits to ray.kinsella@intel.com who spotted the issue and identified
root cause.
Type: fix
Change-Id: I4afe74c47769484309f6aebca2de56ad32c8041f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff4b71de0 in __strncmp_sse42 () from /lib64/libc.so.6
(gdb) up
up
vm=0x7ffff6664d40 <vlib_global_main>, addr=0x7fffb4bec6d0,
ids=0x7fffb31675f0 <avf_pci_device_ids>, handle=0x7fffb4bec594)
at /usr/src/debug/vpp-20.01/src/vlib/linux/pci.c:1250
1250 if (strncmp ("vfio-pci", (char *) di->driver_name, 8) == 0)
(gdb) p di
p di
$1 = (vlib_pci_device_info_t *) 0x7fffb6446164
(gdb) p di->driver_name
p di->driver_name
$2 = (u8 *) 0x0
(gdb)
driver_name may be null. strncmp is not forgiving. Change to use C11 safeC
version.
Type: fix
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: I1777a5966ceee7409d7bde86c30b14dc75534a5a
|
|
Ensure the runtime directory is created at startup.
Default /run/vpp
Type: fix
Fixes: I53d70939c8125d04a365ac51a6cbf8926dc52adf
Change-Id: I6d70364ea756b86768c4dd1f6a9383238ed275c8
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
when use pcap cli to capture pcakets into two files rx01.pcap && rx02.pcap,
the first time:
1)pcap rx trace on max 100 intfc any file rx01.pcap
2)......the process of capture data to buffer......
3)pcap rx trace off
the second time:
4)pcap rx trace on max 100 intfc any file rx02.pcap
5)......the process of capture data to buffer......
6)pcap rx trace off
the pcap_write function bug in this two lines
pm->n_packets_captured = 0;
if (pm->n_packets_captured >= pm->n_packets_to_capture) referring to calling pcap_close()
will result in that the twice pcap cli both writes the packets
into rx01.pcap, but nothing into rx02.pcap. Beside, the rx02.pcap
file will not be created.
solution: separate the pcap_close() out of pcap_write()
Change-Id: Iedeb46f9cf0a4cb12449fd75a4014f95f3bb3fa8
Signed-off-by: Jack Xu <jack.c.xu@ericsson.com>
|
|
Type: fix
Signed-off-by: Guanghua Zhang <ghzhang@fiberhome.com>
Change-Id: I8252ed2555f5af6db2f12dc7c30e41cc1ec7dde0
|
|
Since vlib_buffer_copy() and vlib_buffer_clone() both preserve
VLIB_BUFFER_IS_TRACED bit in flags field, it should also copy
trace_handle which would add minimal overhead. Thus, callers of
these functions do not have to call vlib_buffer_copy_trace_flags()
to copy trace_handle.
Type: refactor
Signed-off-by: John Lo <loj@cisco.com>
Change-Id: Iff6a3f81660dd62b36a2966033eb380305340310
|
|
Make vlib_buffer_copy() preserve buffer flags bit the same way as
that of vlib_buffer_clone() so both are consistent.
Type: fix
Signed-off-by: John Lo <loj@cisco.com>
Change-Id: I6c32aa1e88724b482ce2439d82019e690311b664
|
|
When VPP tries to bind to stats.sock it will complain about non-existing
/run/vpp directory.
/run/vpp is created before cli socket operations are performed.
The same should be done for stat socket.
Ticket: VPP-1708
Type: fix
Change-Id: I53d70939c8125d04a365ac51a6cbf8926dc52adf
Signed-off-by: YohanPipereau <ypiperea@cisco.com>
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
'show node foo' causes infinite loop resulting in out of memory.
This patch fixes the issue by breaking the loop on invalid input.
Ticket: VPP-1538
Type: fix
Fixes: 98afc711c5
Change-Id: Icf2be92e277a7f820d4e08bea9ef22ffbbb116f6
Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
|
|
Provide default packet_to_capture value. Display interface name
correctly for "pcap tx/rx trace status".
Type: fix
Signed-off-by: John Lo <loj@cisco.com>
Change-Id: I7064d0dbea236a9aff68bba7fbaf2c4a73b16c6f
Signed-off-by: John Lo <loj@cisco.com>
|
|
Error index calculation is error_code + error_node->error_heap_index.
Type: fix
Fixes: gerrit 20802
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I66cf05a29b3cfd9ef9c5468e399290e862b784af
|
|
Type: fix
Fixes: 99536f4
Change-Id: Ica230ec9fa7f6fd36e2754e8b0b9db555460ca55
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Type: feature
this means DHCP packets are subject to the IP features configured on the interface
- the unicast packets already were sent throught the adj
- added UT for DHCP client sending a unicast renewal
Change-Id: Id50db0b71822f44bf7cb639a524195cdc9873526
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Encoding the vpp node index into the vlib_error_t as a 10-bit quantity
limits us to 1K graph nodes. Unfortunately, a few nodes need 6 bit
per-node error codes. Only a very few nodes have so many counters.
It turns out that there are about 2K total error counters in the system,
which is (approximately) the maximum error heap index.
The current (index,code) encoding limits the number of interfaces to
around 250, since each interface has two associated graph nodes and we
have about 500 "normal, interior" graph node
This patch adds an error-index to node-index map, so we can store
error heap indices directly in the vlib_buffer_t.
Type: refactor
Change-Id: I28101cad3d8750819e27b8785fc0cf71ff54f79a
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Hash keys are not copied by the hash infrastructure, instead the pointer
is used directly. stat_segment_register_gauge() does not allocate a
private object for the key, causing issues when it is freed or reused.
Allocate a private object on insertion into the hashtable instead.
Type: fix
Fixes: 92e3082199d10add866894e86a9762d79a3536c4
Change-Id: Ifb6addfcaec81bdb7ea3512050ce55f06ef09a4c
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
The fast path almost always has to deal with the real
pointers. Deriving the frame pointer from a frame_index requires a
load of the 32bit frame_index from memory, another 64bit load of the
heap base pointer and some calculations.
Lets store the full pointer instead and do a single 64bit load only.
This helps avoiding problems when the heap is grown and frames are
allocated below vm->heap_aligned_base.
Type: refactor
Change-Id: Ifa6e6e984aafe1e2755bff80f0a4dfcddee3623c
Signed-off-by: Andreas Schultz <andreas.schultz@travelping.com>
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Cleaned up a few instances of side-bet elog_string hash table
usage. Elog_string handles that problem itself.
Add cli commands to vat to initialize, enable/disable, and save an
event log.
Event logging at the same time in both vpp and vat yields a pair
of event logs which can be merged by the "test_elog" tool.
Type: refactor
Change-Id: I8d6a72206f2309c967ea1630077fba31aef47f93
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
The CLI code, when it accepts a socket connection, ran a timer
for each session that would ensure the CLI session was started
should the TELNET negotiation stage fail to complete.
It has since transpired that this is unsafe; the timer is capable
of firing in critical sections, during a spinlock, and since we
peform non-trivial things in the handler it can cause a deadlock.
This was reported recently in VPP-1711 but a search of history
suggests this may also be (one of) the causes in VPP-1413.
This change replaces that method with an event-driven process.
The process is created when the first socket connection is
accepted.
When new connections are created the process is sent an event
to register the new session in a list. That event process has
a loop that evaluates the list of oustanding sessions and if
a deadline expires, their session is started if it has not been
already, and then removed from the list.
If we have pending sessions then the loop waits on a timer or an
event; if there are no sessions it waits on events only.
Type: fix
Ticket: VPP-1711
Change-Id: I8c6093b7d0fc1bea0eb790032ed282a0ca169194
Signed-off-by: Chris Luke <chrisy@flirble.org>
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
- Replaces the need to screen scrape "show log".
- Adds an api to return the system time. When running over a socket, the
api client may have different time than the vpp host.
expected use:
vpp_time_before_command = self.vapi.show_vpe_system_time_ticks().vpe_system_time_ticks
<run some commands>
log_output = self.vapi.log_dump(start_timestamp=vpp_time_before_command)
Depends-on: https://gerrit.fd.io/r/20484
Depends-on: https://gerrit.fd.io/r/#/c/19581/
==============================================================================
TestVpeApi
==============================================================================
log_details(_0=838, context=3, timestamp_ticks=2.4954863503546518e+48, level=<vl_api_log_level_t.VPE_API_LOG_LEVEL_WARNING: 4>, timestamp=u'2019/07/04 20:35:41:281', msg_class=u'buffer', message=u'vlib_physmem_shared_map_create: clib_mem_create_hugetlb_fd: open: No such file or directory\n\n')
log_details(_0=838, context=3, timestamp_ticks=1.6101902879480125e+159, level=<vl_api_log_level_t.VPE_API_LOG_LEVEL_WARNING: 4>, timestamp=u'2019/07/04 20:35:41:281', msg_class=u'buffer', message=u'falling back to non-hugepage backed buffer pool')
test_log_dump_default (test_vpe_api.TestVpeApi) OK
log_details(_0=838, context=13, timestamp_ticks=2.4954863503546518e+48, level=<vl_api_log_level_t.VPE_API_LOG_LEVEL_WARNING: 4>, timestamp=u'2019/07/04 20:35:41:281', msg_class=u'buffer', message=u'vlib_physmem_shared_map_create: clib_mem_create_hugetlb_fd: open: No such file or directory\n\n')
log_details(_0=838, context=13, timestamp_ticks=1.6101902879480125e+159, level=<vl_api_log_level_t.VPE_API_LOG_LEVEL_WARNING: 4>, timestamp=u'2019/07/04 20:35:41:281', msg_class=u'buffer', message=u'falling back to non-hugepage backed buffer pool')
test_log_dump_timestamp_0 (test_vpe_api.TestVpeApi) OK
test_log_dump_timestamp_future (test_vpe_api.TestVpeApi) SKIP
test_show_vpe_system_time_ticks (test_vpe_api.TestVpeApi) SKIP
==============================================================================
TEST RESULTS:
Scheduled tests: 4
Executed tests: 4
Passed tests: 2
Skipped tests: 2
==============================================================================
Test run was successful
Type: feature
Change-Id: I893fc0a65f39749d2091093c2c604659aadd8447
Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
|
|
Type: feature
Change-Id: Ia3d9a47679202c2a47cd3746b50e86c6b8627ef6
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
this change is being made to fix a crash when current_data < 0 in buffer
linearization function
Ticket: N/A
Type: fix
Fixes: f883f6a1132ad4bb7aa9d9a79d420274fbcf3b64
Change-Id: Ia4ede823f673780e0c30d075b091db42e183650d
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
This fixes two leaks in registering errors in the stats segment.
- The error name created by vlib_register_errors() was not freed.
- Duplicate error names (when interface readded) was added to the vector.
This fix also adds memory usage statistics for the statistics segment
as /mem/statseg/{used, total}
Change-Id: Ife98d5fc5baef5bdae426a5a1eef428af2b9ab8a
Type: fix
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Type: feature
Change-Id: Ie020fd7e2618284a63efbeb9895068f27c0fb9ab
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Type: fix
Fixes: 910d369
Change-Id: I0e8380cd2b0dc038a028d9cf2568741059de460f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type:fix
Change-Id: Id7f4f6e2a2f844085f511a33aa1db3968f5d97bb
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
FRAME_QUEUE_NELTS is 64 in thread.c
Change-Id: Ie7e5962afe05dfc7f38e3d597dabc74dcc2dab8d
Signed-off-by: dongjuan <dong.juan1@zte.com.cn>
|
|
Otherwise, all N worker threads try to sort the list at the same time:
a good way to have a bad day.
This approach performs *far* better than maintaing order by adding a
spin-lock. By direct measurement w/ elog + g2: 11 threads execute the
per-thread init function list in 22us, vs. 50ms with a CLIB_PAUSE()
enabled spin-lock.
Change-Id: I1745f2a213c0561260139a60114dcb981e0c64e5
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Maintain the MAINTAINERS file. Removed src/plugins/*.am listings. Added
a couple of plugins.
Add vlib_process_create (vlib_main_t *vm, char *name,
vlib_node_function_t *f, u32 log2_n_stack_bytes);
/** @brief Create a vlib process
* @param vm &vlib_global_main
* @param f the process node function
* @param log2_n_stack_bytes size of the process stack, defaults to 16K
* @return newly-create node index
* @warning call only on the main thread. Barrier sync required.
*/
This function makes it easy to spin up periodic processes when features
are enabled for the first time. That coding pattern is highly recommended.
Update the emacs-lisp plugin generator to use vlib_process_create,
instead of generating static periodic process nodes.
Change-Id: Icda33e93b9034779d3a3e228cd1110af14b058a5
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
- add to the Punt API to allow different descriptions of the desired packets: UDP or exceptions
- move the punt nodes into punt_node.c
- improve tests (test that the correct packets are punted to the registered socket)
Change-Id: I1a133dec88106874993cba1f5a439cd26b2fef72
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: Iddeb3a1b0e20706e72ec8f74dabc60b342f003ba
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
The current code only allowed access to the main thread error counters.
That is not so useful for a multi worker instance.
No return a vector indexed by thread of counter_t values.
Type: fix
Change-Id: Ie322c8889c0c8175e1116e71de04a2cf453b9ed7
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
leak-check { <any-debug-cli-command-and-args> }
Hint: "set term history off" or you'll have to sort through a bunch of
bogus leaks related to the debug cli history mechanism.
Cleaned up a set of reported leaks in the "show interface" command. At
some point, we thought about making a per-thread vlib_mains vector,
but we never did that. Several interface-related CLI's maintained
local static cache vectors. Not a bad idea, but not useful as things
shook out. Removed the static vectors.
Change-Id: I756bf2721a0d91993ecfded34c79da406f30a548
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I1455cb507f6ecffbb053b0e3e2de833dd40fa5f5
Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
|
|
- 'set terminal history off' or '... limit 0' has an incorrect
terminal condition and tries to vec_delete one-too-many times
causing a crash.
- Changing >= to > fixes this.
- In any case, a single vec_delete is more efficient, so do that
instead.
Change-Id: Ia0db63b6c5c7891d75b302e793b4e4985dd86ebb
Signed-off-by: Chris Luke <chrisy@flirble.org>
|
|
The vlib init function subsystem now supports a mix of procedural and
formally-specified ordering constraints. We should eliminate procedural
knowledge wherever possible.
The following schemes are *roughly* equivalent:
static clib_error_t *init_runs_first (vlib_main_t *vm)
{
clib_error_t *error;
... do some stuff...
if ((error = vlib_call_init_function (init_runs_next)))
return error;
...
}
VLIB_INIT_FUNCTION (init_runs_first);
and
static clib_error_t *init_runs_first (vlib_main_t *vm)
{
... do some stuff...
}
VLIB_INIT_FUNCTION (init_runs_first) =
{
.runs_before = VLIB_INITS("init_runs_next"),
};
The first form will [most likely] call "init_runs_next" on the
spot. The second form means that "init_runs_first" runs before
"init_runs_next," possibly much earlier in the sequence.
Please DO NOT construct sets of init functions where A before B
actually means A *right before* B. It's not necessary - simply combine
A and B - and it leads to hugely annoying debugging exercises when
trying to switch from ad-hoc procedural ordering constraints to formal
ordering constraints.
Change-Id: I5e4353503bf43b4acb11a45fb33c79a5ade8426c
Signed-off-by: Dave Barach <dave@barachs.net>
|