|
Included statistic bundles (all NODE type):
- Instructions and CPU cycles, including IPC
- Data cache access/refills/%
- Data TLB cache access/refills/%
- Instruction cache access/refills/%
- Instruction TLB cache access/refills/%
- Memory/Bus accesses, memory errors
- Branch (mis)predictions, architecturally & speculatively executed
- Processor frontend/backend stalls (stalled cycles)
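The IPC and "%" columns in these bundles are ratios of raw counter values. A minimal sketch of that arithmetic follows; the function names and counter readings are hypothetical and not taken from the plugin's code:

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle; 0.0 when no cycles were counted."""
    return instructions / cycles if cycles else 0.0


def refill_percent(refills: int, accesses: int) -> float:
    """Refills as a percentage of accesses; 0.0 when there were no accesses."""
    return 100.0 * refills / accesses if accesses else 0.0


# Made-up readings for a single node, for illustration only.
print(f"IPC: {ipc(1_200_000, 800_000):.2f}")
print(f"cache refill %: {refill_percent(5_000, 250_000):.2f}")
```

The zero checks keep a node that never ran from causing a division by zero.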
Type: feature
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I7ea4a27c8df8fc7222b743a98bdceaff727e4112
|
|
This patch enables statistics from the Arm PMUv3 through the perfmon
plugin.
In comparison to using the Linux "perf" tool, it allows obtaining
direct, per node level statistics (rather than per thread). By accessing
the PMU counter registers directly from userspace, we can avoid the
overhead of using a read() system call and get more accurate and fine
grained statistics about the running of individual nodes.
A demo of perfmon on Arm can be found at:
https://asciinema.org/a/egVNN1OF7JEKHYmfl5bpDYxfF
*Important Note*
Perfmon on Arm is dependent on and works only on Linux kernel versions
of v5.17+ as this is when userspace access to Arm perf counters was
included.
On most Arm systems, a maximum of 7 PMU events can be configured at once
- (6x PMU events + 1x CPU_CYCLE counter). If some perf counters are in
use elsewhere by other applications, and there are insufficient counters
remaining to open the bundle, the perf_event_open call will fail
(provided the events are grouped with the group_fd param, which perfmon
currently utilises).
See arm/events.h for a list of PMUv3 events available, although it is
implementation defined whether most events are implemented or not. Only
a small set of 7 events is required to be implemented in Armv8.0, with
some additional events required in later versions. As such, depending on
the implementation, some statistics may not be available. See Arm
Architecture Reference Manual for Armv8-A, D7.10.2 "The PMU event number
space and common events" for more information.
arm/events.c:arm_init() gets information from the sysfs about what
events are implemented on a particular CPU at runtime. Arm's
implementation of the perfmon source callback .bundle_support uses this
information to disable unsupported events in a bundle, or in the case
no events are supported, disable the entire bundle.
Where a particular event in a bundle is not implemented, the statistic
for that event is shown as '-' in the 'show perfmon statistics' cli
output, by disabling the column.
There is additional code in perfmon.c to only open events which are
marked as implemented. Since we're only opening and reading events that
are implemented, some extra logic is required in cli.c to re-align
either perfmon_node_stats_t or perfmon_reading_t with the column
headings configured in each bundle, taking into account disabled
columns.
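The re-alignment described above can be sketched as follows. This is an illustration only, not the code in cli.c: align_row and its arguments are hypothetical names, and readings is assumed to hold one value per enabled column, in column order.

```python
def align_row(columns, enabled, readings, placeholder="-"):
    """Spread readings for enabled events across all column headings.

    Disabled columns receive the placeholder, so every value stays
    under the heading it belongs to.
    """
    if sum(1 for e in enabled if e) != len(readings):
        raise ValueError("one reading is required per enabled column")
    values = iter(readings)
    return [next(values) if is_on else placeholder
            for _, is_on in zip(columns, enabled)]


# Hypothetical bundle with the middle event not implemented.
print(align_row(["access", "refill", "stall"], [True, False, True], [10, 30]))
```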
Userspace access to perf counters is disabled by default, and needs to
be enabled with 'sudo sysctl kernel/perf_user_access=1'.
There is a check built into the Arm event source init function
(arm/events.c:arm_init) to check that userspace reading of perf counters
is enabled in the /proc/sys/kernel/perf_user_access file.
If the above file does not exist, it means the kernel version is
unsupported. Users without a supported kernel will see a warning
message, and no Arm bundles will be registered to use in perfmon.
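The same check can be made from a script before starting VPP. The sketch below is not the arm_init() code; the function names are hypothetical, and only the file path and the meaning of a missing file come from the text above.

```python
from pathlib import Path

# Path named in the commit message; the function names are hypothetical.
PERF_USER_ACCESS = "/proc/sys/kernel/perf_user_access"


def interpret_perf_user_access(text: str) -> str:
    """Classify the sysctl value: '1' means userspace access is enabled."""
    return "enabled" if text.strip() == "1" else "disabled"


def perf_user_access_state(path: str = PERF_USER_ACCESS) -> str:
    """Return 'unsupported-kernel', 'enabled', or 'disabled'."""
    p = Path(path)
    if not p.exists():
        # A missing file means the kernel lacks userspace counter access.
        return "unsupported-kernel"
    return interpret_perf_user_access(p.read_text())
```

Calling perf_user_access_state() with no arguments checks the running system; passing a path makes the function easy to exercise against a test file.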
Enabling/using plugin:
- include the following in startup.conf:
- plugins { plugin perfmon_plugin.so { enable } }
- 'show perfmon bundle [verbose]' - show available statistics bundles
- 'perfmon start bundle <bundle-name>' - enable and start logging
- 'perfmon stop' - stop logging
- 'show perfmon statistics' - show output
For a general guide on using and understanding Arm PMUv3 events, see
https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/arm-neoverse-n1-performance-analysis-methodology
Type: feature
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I0620fe5b1bbe78842dfb1d0b6a060bb99e777651
|
|
In preparation for enabling perfmon on Arm platforms, move some Intel
/arch specific logic into the /intel directory and update the CMake to
split the common code from arch specific files.
Since the dispatch_wrapper code is very different on Arm/Intel,
each arch can provide their own implementation + conduct any additional
arch specific config e.g. on Intel, all indexes from the mmap pages are
cached. The new method intel_config_dispatch_wrapper conducts this
config and returns a pointer to the dispatch wrapper to use.
Similarly, is_bundle_supported() looks very different on Arm/Intel, so
each implementation is to provide their own arch specific checks.
Two new callbacks/function ptrs are added in PERFMON_REGISTER_SOURCE to
support this - .bundle_support and .config_dispatch_wrapper.
Type: refactor
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Change-Id: Idd121ddcfd1cc80a57c949cecd64eb2db0ac8be3
|
|
Type: fix
Signed-off-by: Artem Glazychev <artem.glazychev@xored.com>
Change-Id: I62f13ee8cb9b86f8106505fd32a03d66c1a73bce
|
|
Type: improvement
Enable use of 4th gen QAT devices. Will be available on Sapphire Rapids.
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
Change-Id: I89e7d29e10ecb4c36c700ff5e017796161ec6c5e
|
|
Type: fix
Signed-off-by: Ivan Shvedunov <ivan4th@gmail.com>
Change-Id: I5ecfb242e5905c9bd8ce19cd9ab6efd657ee14d4
|
|
Type: fix
Signed-off-by: Ivan Shvedunov <ivan4th@gmail.com>
Signed-off-by: Sergey Matov <sergey.matov@travelping.com>
Change-Id: I4ec1a68b7266f05ab7c543cd8207afb29e740743
|
|
0 is not NULL (at least not in all cases). Passing 0 into a variadic
function in a place where the consumer reads it as a pointer might
leave parts of the pointer uninitialized and hence filled with random
data.
It seems that this used to work with gcc, but clang seems to treat the
0 in those places as a 32bit integer.
Type: fix
Signed-off-by: Ivan Shvedunov <ivan4th@gmail.com>
Signed-off-by: Andreas Schultz <andreas.schultz@travelping.com>
Change-Id: I37d975eef5a1ad98fbfb65ebe47d73458aafea00
|
|
There is a very rare bug in NAT processing that yields a thread
index of ~0. When this happens, vlib_get_frame_queue_elt()
suffers a segfault and VPP quits. Prevent an outright fault
by dropping the packet instead.
Type: fix
Signed-off-by: Jon Loeliger <jdl@netgate.com>
Change-Id: I48c7a268925bb821ea15e58db5d4bfb211c40c09
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: Ie057d0d5a51d3226a1a188cf9d48a5d82dc4a3c7
|
|
Here is an example of the bug:
vpp# create loopback interface
loop0
vpp# vrrp vr add loop0 vr_id 1 priority 100 192.168.1.1 192.168.1.2
vpp# vrrp vr del loop0 vr_id 1
vpp# vrrp vr add loop0 vr_id 1 priority 100 192.168.1.1 192.168.1.2
vrrp vr add: vrrp_vr_add_del returned -105
Type: fix
Signed-off-by: GaoChX <chiso.gao@gmail.com>
Change-Id: I3e0d086ac8fb52756339cff19b9a83911ec9748b
|
|
Type: feature
Signed-off-by: Ahmed Abdelsalam <ahabdels@cisco.com>
Change-Id: I2d3a0211abfee3501d3d77c80da20e67e1e9e133
|
|
Change-Id: I0e1bb39d765ec3efa7b28ca02fb7beeb23607e51
Type: improvement
Signed-off-by: Mohammed Hawari <mohammed@hawari.fr>
|
|
The classify hash used to be stored as a u64 in buffer metadata; use 32
bits instead:
- on almost all our supported architectures (x86 and arm64) we use crc32c
intrinsics to compute the final hash: we really get a 32-bit hash
- the hash itself is used to compute a 32-bit bucket index by masking
upper bits: we always discard the higher 32 bits
- this allows increasing the l2 classify buffer metadata padding so
that it no longer overlaps with the ip fib_index metadata. This
overlap is an issue when using the 'set metadata' action in the ip
ACL node, which updates both fields
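To illustrate why the upper 32 bits carry no information for the lookup, here is a small sketch. It assumes a power-of-two bucket count of at most 2^32; the function name and sample hash are made up.

```python
def bucket_index(hash_value: int, log2_buckets: int) -> int:
    """Select a bucket by keeping only the low log2_buckets bits."""
    return hash_value & ((1 << log2_buckets) - 1)


h64 = 0xDEADBEEF12345678      # made-up 64-bit value
h32 = h64 & 0xFFFFFFFF        # what remains when stored in 32 bits

# For every table size up to 2**32 buckets the chosen bucket is identical.
for log2_buckets in range(1, 33):
    assert bucket_index(h64, log2_buckets) == bucket_index(h32, log2_buckets)
```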
Type: fix
Change-Id: I5d35bdae97b96c3cae534e859b63950fb500ff50
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: refactor
Signed-off-by: Ahmed Abdelsalam <ahabdels@cisco.com>
Change-Id: Iff5e85952273526d5c9d9e7e73bd2b6c15bcd7f6
|
|
svm_msg_q_size_to_alloc must return a valid base address; if it fails,
pass the error up for handling
Type: fix
Change-Id: I408492f65f646862122acb9a187819b3bbf4f91c
Signed-off-by: Ofer Heifetz <oferh@marvell.com>
|
|
This patch adds support for the infrastructure
required to support SRv6 Path Tracing defined in
https://datatracker.ietf.org/doc/draft-filsfils-spring-path-tracing/
Type: feature
Change-Id: If3b09d6216490a60dd5a816577477b6399abc124
Signed-off-by: Ahmed Abdelsalam <ahabdels@cisco.com>
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I7afc6116ca9a609992f26d9e78084732bba1b2ea
|
|
This patch adds performance and functional tests for ip4
outbound traffic policy matching.
Test setup is configurable in startup.conf and through the test
parameters. Cache, fast path, and fast path burst mode can be enabled
and disabled, and performance for different lookup setups can be
measured.
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I1d04d196e412f47f43b7e5cbd46607bf6a9cc40e
|
|
This patch updates the "show ipsec spd" cli to display
policies maintained by the fast path bihash table.
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I58b9f92f3132dc9809b50786dc912e09c4b84d81
|
|
The parser can be configured from the startup.conf file:
fast path can be enabled and disabled.
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: Ifab83ddcb75bc44c8165e7fa87a1a56d047732a1
|
|
This patch adds matching functionality for spd fast path
policy matching. Fast path matching has been introduced
for outbound traffic only.
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I03d5edf7d7fbc03bf3e6edbe33cb15bc965f9d4e
|
|
This patch introduces the ipsec_output.h file and moves the matching
implementation there. The motivation is to make the matching mechanism
unit-testable: the functions of interest must be visible to the tests,
and since they are inline, their implementation needs to be moved to
the header file as well.
Type: improvement
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: Id7c605375d1f3be146abf96ef70d336a5d156444
|
|
This patch introduces functions to add and delete fast path
policies.
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I3f1f1323148080c9dac531fbe9fa33bad4efe814
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I0963bae4b56b08c0a9ab4ee1f2738013217e1fb7
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
Change-Id: Idc0fdebfea29c241d8a36128241ccec03eace5fd
|
|
This patch introduces basic types supporting fast path lookup.
Fast path performs policy matching using hash lookup (specifically,
bihash is used for that purpose). Fast path lookup addresses the
situation where a huge number of policies is created (~100k or more).
In such a scenario, adding/removing a policy and policy matching are
inefficient and scale poorly (for example, adding 500k policies takes
a few hours, and lookup time also increases significantly). With fast
path, adding and matching up to 1M flows scales linearly (adding 1M
policies takes about 150s on the test machine vs many hours with the
original implementation, and matching time is also significantly
improved). Fast path will not deal well with a huge number of policies
that span large ip/port ranges. A large range will be masked out almost
entirely, leaving only a few bits for calculating the hash key. Such
keys will tend to gather many more policies than other keys, and the
hash will match most of the packets, annihilating the advantages of
hashing. Having said that, we do not think this is a real-life
scenario.
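The effect of wide ranges on the key can be illustrated with a rough sketch. It assumes that only the leading bits shared by both ends of an address range survive masking; this is an illustration of the idea, not the key layout used by the patch, and the function name is hypothetical.

```python
import ipaddress


def shared_prefix_bits(lo: str, hi: str) -> int:
    """Count the leading bits common to both ends of an IPv4 range."""
    a = int(ipaddress.IPv4Address(lo))
    b = int(ipaddress.IPv4Address(hi))
    # The highest differing bit bounds how much of the address is usable.
    return 32 - (a ^ b).bit_length()


for lo, hi in [("10.0.0.1", "10.0.0.1"),
               ("10.0.0.0", "10.0.0.255"),
               ("0.0.0.0", "255.255.255.255")]:
    print(f"{lo} - {hi}: {shared_prefix_bits(lo, hi)} usable bits")
```

The wider the range, the fewer bits remain to tell keys apart, so more policies end up under the same key.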
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I600dae5111a37768ed4b23aa18426e66bbf7b529
|
|
Currently 0 is used as the wildcard representing ANY type of
protocol. However, 0 is a valid ip protocol value (HOPOPT) and therefore
should not be used as a wildcard. Instead, 255 is used, which is
guaranteed by IANA to be reserved and not used as a protocol id.
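A minimal sketch of the matching rule with the new wildcard; the constant and function names are hypothetical and not taken from the patch:

```python
PROTO_ANY = 255   # hypothetical name; the reserved value used as wildcard
HOPOPT = 0        # a real protocol number, so it must match only itself
TCP = 6


def protocol_matches(policy_proto: int, packet_proto: int) -> bool:
    """A policy matches if it is the wildcard or names the packet's protocol."""
    return policy_proto == PROTO_ANY or policy_proto == packet_proto


print(protocol_matches(PROTO_ANY, TCP))   # wildcard policy, TCP packet
print(protocol_matches(HOPOPT, TCP))      # HOPOPT-only policy, TCP packet
```

With 0 as the wildcard, a policy meant for HOPOPT alone could not be expressed.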
Type: improvement
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I2320bae6fe380cb999dc5a9187beb68fda2d31eb
|
|
If svm_fifo_enqueue() returns -4 when the rx_fifo grows, a type
conversion occurs in stream_data->app_rx_data_len += rlen and, as a
result, the stream->recvstate.data_off calculation is wrong.
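The failure mode can be sketched by emulating fixed-width arithmetic. This assumes the length field is an unsigned 32-bit integer, which the message does not state; the helper names are hypothetical, and the guard shown is one plausible fix rather than necessarily what the patch does.

```python
U32_MASK = 0xFFFFFFFF


def add_u32(length: int, delta: int) -> int:
    """Emulate 'length += delta' on an unsigned 32-bit field."""
    return (length + delta) & U32_MASK


def account_rx(length: int, rlen: int) -> int:
    """Only count bytes that were actually enqueued."""
    if rlen < 0:          # negative return value signals an error
        return length
    return add_u32(length, rlen)


print(add_u32(0, -4))     # error code folded into the length
print(account_rx(0, -4))  # error code ignored, length unchanged
```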
Type: fix
Signed-off-by: fanxb <fxb_mail@163.com>
Change-Id: Iae11f0c453f32d836f4148d70e3b121545a53a90
|
|
Type: fix
Currently, the prometheus exporter may crash because of memory exhaustion
when dumping metrics if the FIB contains a large number of routes.
With this fix, increase the memory size for the prometheus exporter so
that it can handle a large number of FIB entries.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: Ia2b9a665368883c87448deee9bcf8d2ac1168357
|
|
Type: fix
Added stats for success and failure cases
Fixed Custom app behaviors for the error / drop cases
Signed-off-by: Vijayabhaskar Katamreddy <vkatamre@cisco.com>
Change-Id: Id6e981c7be5c5b3cee5af2df505666d5558da470
|
|
Issue:
Run an iperf3 server via ldp and vcl on top of vpp's host stack. If an
iperf3 client connects to this iperf3 server with the tcp MSS option set,
the iperf3 server always crashes.
Root cause:
When the MSS option is specified by the iperf3 client, the iperf3 server
first recreates the listening socket, then immediately calls setsockopt()
to set the MSS. The relevant iperf3 code can be found here:
https://github.com/esnet/iperf/blob/58332f8154e2140e40a6e0ea060a418138291718/src/iperf_tcp.c#L186.
However, in the vcl layer the vpp_evt_q of this recreated session is not
allocated yet, so the iperf3 server crashes on a vpp_evt_q null pointer access.
Fix:
Add session vpp_evt_q null pointer check in vcl_session_transport_attr().
Add a vcl test case for this MSS option scenario.
Type: fix
Signed-off-by: Liangxing Wang <liangxing.wang@arm.com>
Change-Id: I2863bd0cffbe6e60108ab333f97c00530c006ba7
|
|
Type: fix
Change-Id: I1e8655baaf09b455f7f0052452402a372f738d0f
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I9c502a491ff56806a2e631f7a4c18903a2e93ab2
|
|
Type: improvement
Change-Id: I85c73cb940d81d0b249eda0d57de135bcd798418
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: improvement
Change-Id: I7489327d8b9c5f69b4ceb2159456f00f8a3612df
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Change-Id: Ic949e3136a7cf27011d098a50e91920f83226ea9
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: improvement
When packets were received and processed successfully, increment the
byte/packet counters for the tunnel interface.
Change-Id: I42855607ac6916de641be42aac86c9942cc97140
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
|
|
We were not allocating space for the
variable length payload in the response
message.
Type: fix
Change-Id: I345102f4555f66c5632ab0882ca1dd178e98eb7b
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
|
|
If ip4_neighbor_probe (or any other) sends a packet to a deleted interface,
an ASSERT trips and the dataplane crashes. Example:
create loopback interface instance 0
set interface ip address loop0 10.0.0.1/32
set interface state GigabitEthernet3/0/1 up
set interface state loop0 up
set interface state loop0 down
set interface ip address del loop0 10.0.0.1/32
delete loopback interface intfc loop0
set interface state GigabitEthernet3/0/1 down
set interface state GigabitEthernet3/0/1 up
comment { the following crashes VPP }
set interface state GigabitEthernet3/0/1 down
This sequence reliably crashes VPP:
(gdb)p n->name
$4 = (u8 *) 0x7fff82b47578 "interface-3-output-deleted"
If the interface doesn't exist, return ~0 and be tolerant of this in the
two call sites of counter_index()
Type: fix
Signed-off-by: Pim van Pelt <pim@ipng.nl>
Change-Id: I90ec58fc0d14b20c9822703fe914f2ce89acb18d
|
|
Adding support for the SRv6 TEF (Timestamp, Encapsulation and Forward) behavior defined in
draft-filsfils-spring-path-tracing (https://datatracker.ietf.org/doc/draft-filsfils-spring-path-tracing/).
Type: feature
Change-Id: I7f38b593147daf8d27af9c983448cf82947e5bed
Signed-off-by: Ahmed Abdelsalam <ahabdels@cisco.com>
|
|
Type: fix
Currently, neighbor adjacencies on a wg interface are converted into a
midchain only if one of the peers has a matching allowed prefix
configured. If a route is created that goes through a wg interface but
the next-hop address does not match any allowed prefix, VPP will try to
send an ARP/ND request via the wg interface to resolve the next-hop
address when matching traffic occurs. Sending an ARP request will cause
VPP to crash while copying the hardware address of the wg interface,
which is NULL. Sending an ND message will not cause VPP to crash, but
the error logged will be unclear (no source address).
With this fix, convert all neighbor adjacencies on a wg interface into a
midchain and update tests to cover the case. If there is no matching
allowed prefix configured, traffic going through such routes will be
dropped because of "Peer error". Nothing changes if a matching allowed
prefix is configured.
Also, fix getting peer by adjacency index.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: I15bc1e1f83de719e97edf3f7210a5359a35bddbd
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: Ia66c12e1da126d0d8d101b645e6dc8454c3826d6
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: Ic68627bbca676cc78b0be05bc1fa0f386f5d27fa
|
|
Type: fix
Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
Change-Id: I646ac946d0b07929dfdd1966a4f4a3b697768040
|
|
The flow_report_process_send() function always allocates a frame.
However, when no template_send is needed, template_bi is ~0.
When this happens, no vectors are placed in the frame. When
the frame is then "put", a check for n_vectors == 0 prevents
the frame from actually being placed back on the free list.
Fix that by using a direct call to vlib_frame_free() when
there are no frame vectors.
Type: fix
Signed-off-by: Jon Loeliger <jdl@netgate.com>
Change-Id: I936b5cea4cb3c358247c3d2e1a77d034a322ea76
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I3c573641bd95fe899823b66f6c59a2525a18d293
|
|
Type: fix
reported stats seem to have mixed up used and total counters
Signed-off-by: Leland Krych <leland.krych@gmail.com>
Change-Id: I221c7b114c0da2ed53171d7f047a4bda07ee6cb2
|
|
https://docs.python.org/3/library/stdtypes.html
"if concatenating bytes objects, you can similarly use bytes.join() or io.BytesIO, or you can do in-place concatenation with a bytearray object. bytearray objects are mutable and have an efficient overallocation mechanism"
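A short sketch of the two approaches mentioned in the quoted documentation; the helper names and sample chunks are made up for illustration:

```python
def build_with_bytearray(chunks):
    """Append in place to a mutable buffer, then freeze it as bytes."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk          # in-place extend, no new object per step
    return bytes(buf)


def build_with_join(chunks):
    """Concatenate all chunks in a single call."""
    return b"".join(chunks)


chunks = [b"\x01\x02", b"abc", b"", b"\xff"]
assert build_with_bytearray(chunks) == build_with_join(chunks)
```

Repeated `data = data + chunk` on an immutable bytes object creates a new object on every step, which is what both variants avoid.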
Type: improvement
Signed-off-by: Viktor Velichkin <avisom@yandex.ru>
Change-Id: Id20d337f909cce83fcd9e08e8049bb0bf5970fbc
|
|
Allows features to update their data structures after change in number
of worker threads.
Type: improvement
Change-Id: Icd4d197e28608f5bbb1edd13eb624cd98e33cafe
Signed-off-by: Damjan Marion <damarion@cisco.com>
|