This patch introduces DMA infrastructure into vlib. It is well known
that moving large amounts of memory drains core resources. More and
more hardware accelerators are now designed to free cores from this
burden. Some restrictions still remain when using hardware
accelerators, e.g. cross-NUMA throughput drops significantly compared
to same-node throughput. Normally the number of hardware accelerator
instances is smaller than the number of cores, and the number of
applications may even exceed the number of cores. Some hardware
supports sharing virtual addresses with cores, while other hardware
does not.
Here we introduce a new DMA infrastructure which fulfills the
requirements of vpp applications like session and memif while
dealing with hardware limitations.
Here is some design background:
A backend is the abstraction of a resource allocated from a DMA device.
It supports basic operations such as configuration, DMA copy and
result query.
A config is the abstraction of an application's DMA requirements. An
application needs to request a unique config index from the DMA
infrastructure. This unique config index is associated with a backend
resource. Two options, cpu fallback and barrier before last, can be
specified in a config. If the cpu fallback option is enabled, a DMA
transfer is performed by the CPU when the backend is busy. If the
barrier before last option is enabled, DMA transfer callbacks are
invoked in order.
Everything a DMA transfer request needs is packed into a DMA batch. It
contains the pattern of DMA descriptors and function pointers for
submission and callback. One DMA transfer request needs multiple batch
updates and one batch submission.
DMA backends are assigned equally to a config's worker threads. A lock
is used for thread safety if the same backend is assigned to multiple
threads. The backend node checks all pending requests in the worker
thread and invokes the callback with a pointer to the DMA batch once
the transfer has completed. Applications can use the cookie in the DMA
batch for their own purposes.
DMA architecture:
+----------+ +----------+ +----------+ +----------+
| Config1 | | Config2 | | Config1 | | Config2 |
+----------+ +----------+ +----------+ +----------+
|| || || ||
+-------------------------+ +-------------------------+
| DMA polling thread A | | DMA polling thread B |
+-------------------------+ +-------------------------+
|| ||
+----------+ +----------+
| Backend1 | | Backend2 |
+----------+ +----------+
Type: feature
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Change-Id: I1725e0c26687985aac29618c9abe4f5e0de08ebf
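The cpu fallback option described above can be sketched as follows. This is a minimal illustration, not the vlib DMA API: every name here (demo_dma_config_t, demo_dma_submit, the result codes) is hypothetical, the hardware path is only simulated, and rejecting the request when the backend is busy and no fallback is allowed is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not the vlib DMA API. */
typedef struct
{
  int cpu_fallback; /* non-zero: copy on the CPU when the backend is busy */
} demo_dma_config_t;

typedef enum
{
  DEMO_DMA_QUEUED,     /* handed to the (simulated) hardware backend */
  DEMO_DMA_CPU_COPIED, /* backend busy, copy performed by the CPU */
  DEMO_DMA_REJECTED    /* backend busy and fallback disabled (assumed) */
} demo_dma_result_t;

demo_dma_result_t
demo_dma_submit (const demo_dma_config_t *cfg, int backend_busy, void *dst,
                 const void *src, size_t len)
{
  assert (cfg != NULL);
  if (!backend_busy)
    return DEMO_DMA_QUEUED; /* a real backend would enqueue descriptors */
  if (cfg->cpu_fallback)
    {
      memcpy (dst, src, len); /* CPU does the transfer instead */
      return DEMO_DMA_CPU_COPIED;
    }
  return DEMO_DMA_REJECTED;
}
```

With the fallback enabled, the caller still gets its data moved when the accelerator is saturated, at the cost of core cycles.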
|
|
The ipv6 header length should not be counted in the ipv6 payload length.
This is similar to https://gerrit.fd.io/r/c/vpp/+/36945.
Type: fix
Change-Id: I22de0ff828175829102a85288513ee3f55709108
Signed-off-by: Aloys Augustin <aloaugus@cisco.com>
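A minimal sketch of the rule stated in this message: the payload length field covers everything after the fixed 40-byte IPv6 header. The helper name is hypothetical, and a real header stores the value in network byte order, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define IP6_HEADER_LEN 40 /* fixed IPv6 header size in bytes */

/* Value of the payload length field for a packet of total_len bytes,
   where total_len includes the IPv6 header. Hypothetical helper. */
uint16_t
ip6_payload_length (uint32_t total_len)
{
  assert (total_len >= IP6_HEADER_LEN);
  assert (total_len - IP6_HEADER_LEN <= UINT16_MAX);
  return (uint16_t) (total_len - IP6_HEADER_LEN);
}
```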
|
|
Type: improvement
VPP crashes when a linux-cp tap is added to a bridge on the host system
because rtnl_neigh_get_dst() returns NULL for the neighbor message that
is sent by the kernel.
Check for NULL before trying to use the address from a neighbor in a
netlink message.
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
Change-Id: I8a683d815a09620df9c0cc76e18df39828428e2c
|
|
Add error checks in parsing to keep the parser from walking past the
end of the packet when the data is garbage.
Type: fix
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
Change-Id: I9541b555a18baf63cb8081bcd7a4c2750f2ed012
|
|
flags is a u64; make sure we do not overflow when shifting.
Type: fix
Change-Id: Ieea34187c0b568dc4d24c9415b9cff36907a5a87
Signed-off-by: Benoît Ganne <bganne@cisco.com>
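The kind of overflow meant here can be sketched as below. The helper name is hypothetical; the point is only that the shifted constant must itself be 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Set bit 'pos' (0..63) in a 64-bit flags word. Hypothetical helper. */
uint64_t
flags_set_bit (uint64_t flags, unsigned pos)
{
  assert (pos < 64);
  /* The constant must be 64 bits wide: (1 << pos) is an int shift and is
     undefined for pos >= 31 on platforms with a 32-bit int. */
  return flags | ((uint64_t) 1 << pos);
}
```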
|
|
Rather than using obfuscated macro hackery, simplify
the per-protocol data management by directly using
an array of NAT protocol types.
Type: refactor
Signed-off-by: Jon Loeliger <jdl@netgate.com>
Change-Id: I6fe987556ac9f402f8d490da0740e2b91440304c
|
|
Type: improvement
If an SA protecting an IPv6 tunnel interface has UDP encapsulation
enabled, the code in esp_encrypt_inline() inserts a UDP header but does
not set the next protocol or the UDP payload length, so the peer that
receives the packet drops it. Set the next protocol field and the UDP
payload length correctly.
The port(s) for UDP encapsulation of IPsec was not registered for IPv6.
Add this registration for IPv6 SAs when UDP encapsulation is enabled.
Add punt handling for IPv6 IKE on NAT-T port.
Add registration of linux-cp for the new punt reason.
Add unit tests of IPv6 ESP w/ UDP encapsulation on tun protect
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
Change-Id: Ibb28e423ab8c7bcea2c1964782a788a0f4da5268
|
|
Free up the vapi context in case of failure.
Type: fix
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
Change-Id: I4f64e8718014d714f1b82877e69d2354b5fa44fb
|
|
Crypto backend errors should not be using the same error as missing
keypair.
Type: fix
Change-Id: I78c2b3df3f08a354463b7824349b08627f2b023c
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
IPv6 payload length should not include the size of the IPv6 header.
Type: fix
Change-Id: Iedcd17d0af8d72d9b5f8f9b605da7c99e151bc9d
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Previously, each address maintained an array of 32-bit
reference counts for each of 65K possible ports for each
of 4 NAT protocols. Totalling 1MB per address. Wow.
A close read of the code shows that an "is used" check
precedes each attempted reference count increment.
That means the refcount never actually gets above 1.
That in turn means algorithmically, a bit vector is
sufficient. And one need not be allocated for more
than the highest validated port referenced.
These changes introduce a dynamically sized bit vector
replacing the reference counts, for a maximum of 32K
if all 4 protocols use port 65535. In fact, protocol
OTHER is never used, so at most 24K will be used, and
none of it will be "statically" allocated per address.
Type: fix
Fixes: 85bee7548bc5a360851d92807dae6d4159b68314
Change-Id: I7fd70050e7bf4871692a862231f8f38cf0158132
Signed-off-by: Jon Loeliger <jdl@netgate.com>
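The arithmetic above: 65536 ports x 4 bytes x 4 protocols = 1 MB of reference counts per address, while a bit per port needs at most 65536 / 8 = 8 KB per protocol. A minimal sketch of a bit vector that grows only up to the highest port marked follows. All names are hypothetical and it uses plain realloc rather than VPP's own bitmap helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable bitmap of ports in use; one per protocol. */
typedef struct
{
  uint64_t *words; /* NULL until the first port is marked */
  size_t n_words;
} port_bitmap_t;

int
port_is_used (const port_bitmap_t *b, uint16_t port)
{
  size_t w = port / 64;
  return w < b->n_words && ((b->words[w] >> (port % 64)) & 1);
}

/* Mark a port as used, growing the bitmap only as far as needed.
   Returns 0 on success, -1 if already used or out of memory. */
int
port_mark_used (port_bitmap_t *b, uint16_t port)
{
  size_t w = port / 64;
  assert (b != NULL);
  if (port_is_used (b, port))
    return -1;
  if (w >= b->n_words)
    {
      uint64_t *nw = realloc (b->words, (w + 1) * sizeof (uint64_t));
      if (nw == NULL)
        return -1;
      /* Zero only the newly added words. */
      memset (nw + b->n_words, 0, (w + 1 - b->n_words) * sizeof (uint64_t));
      b->words = nw;
      b->n_words = w + 1;
    }
  b->words[w] |= (uint64_t) 1 << (port % 64);
  return 0;
}

void
port_mark_free (port_bitmap_t *b, uint16_t port)
{
  size_t w = port / 64;
  if (w < b->n_words)
    b->words[w] &= ~((uint64_t) 1 << (port % 64));
}
```

Marking port 65535 grows one bitmap to 1024 words of 8 bytes, i.e. the 8 KB per-protocol maximum behind the 32K figure.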
|
|
Zero-initialize the temporary struct on stack.
Type: fix
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
Change-Id: I89ced4cca8e832827fe054e2e60986de5910360c
|
|
Zero-initialize the temporary struct on stack.
Type: fix
Change-Id: I651f87deeb79c6c073d5c510435fa268893a3b0e
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
In RFC 7296, CREATE_CHILD_SA Exchange may contain the KE payload
to enable stronger guarantees of forward secrecy.
When the KEi payload is included in the CREATE_CHILD_SA request,
responder should reply with the KEr payload and complete the key
exchange, in accordance with the RFC.
Type: improvement
Signed-off-by: Atzm Watanabe <atzmism@gmail.com>
Change-Id: I13cf6cf24359c11c3366757e585195bb7e999638
|
|
Type: fix
Signed-off-by: Atzm Watanabe <atzmism@gmail.com>
Change-Id: Icbd452b43ecaafe46def1276c98f7e8cbf761e51
|
|
We validate each descriptor via memif_validate_desc_data and set
desc_status to non-zero for the corresponding descriptor when
the descriptor is bad. However, desc_status is not propagated back to
xor_status in memif_validate_desc_data which eventually sets
ptd->xor_status.
Not setting ptd->xor_status causes us to treat all descriptors as
"simple". In that case, when we try to copy the bad descriptors to the
buffers as well, it results in a crash since desc_data is not set to
point to the correct memory in the descriptor.
The fix is to set xor_status in memif_validate_desc_data such that if
there is a bad descriptor in the frame, "is_simple" is set to false and
we have to selectively copy only the good descriptors to the buffers.
Type: fix
Signed-off-by: Steven Luong <sluong@cisco.com>
Change-Id: I780f51a42aa0f8745edcddebbe02b2961c183598
|
|
Type: fix
After peer roaming support was added, FIB entry tracking stopped
working. For example, it can be observed when an adjacency is stacked on
a FIB entry by the plugin and the FIB entry hasn't got ARP resolution
yet. Once the FIB entry gets ARP resolution, the adjacency is not
re-stacked as it used to be. This results in endless ARP requests when
traffic is sent via the adjacency.
This is broken because the plugin stopped using "midchain delegate" when
peer roaming support was added. The reason is that "midchain delegate"
didn't support stacking on a different FIB entry, which is needed when a
peer's endpoint changes. Now it is supported there (added in 36892).
With this fix, start using "midchain delegate" again and thus fix FIB
entry tracking. Also, cover this in tests.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: Iea91f38739ab129e601fd6567b52565dbd649371
|
|
In several NAT submodules, the number of available ports (0xffff - 1024)
may not be divisible by the number of workers, so port_per_thread is
determined by integer division, which is the floor of the quotient.
Later, when a worker index is needed, dividing the port by
port_per_thread may yield an out-of-bounds index into the workers array.
As an example, assume 2 workers are configured, then port_per_thread
will be (0xffff - 1024) / 2, which is 32255. When we compute a worker
index with port 0xffff, we get (0xffff - 1024) / 32255, which is 2,
but since we only have 2 workers, only 0 and 1 are valid indices.
This patch fixes the problem by adding a modulo at the end of the division.
Type: fix
Signed-off-by: Jing Peng <pj.hades@gmail.com>
Change-Id: Ieae3d5faf716410422610484a68222f1c957f3f8
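The worked example above, written out. The function names and the FIRST_DYNAMIC_PORT constant are hypothetical; the arithmetic follows the message.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_DYNAMIC_PORT 1024

/* Worker index for a port without the fix: can return num_workers. */
uint32_t
worker_index_unbounded (uint16_t port, uint32_t num_workers)
{
  uint32_t port_per_thread;
  assert (num_workers > 0 && num_workers <= 0xffff - FIRST_DYNAMIC_PORT);
  assert (port >= FIRST_DYNAMIC_PORT);
  /* Integer division: the floor of the quotient. */
  port_per_thread = (0xffff - FIRST_DYNAMIC_PORT) / num_workers;
  return (port - FIRST_DYNAMIC_PORT) / port_per_thread;
}

/* With the fix: the modulo keeps the result inside 0..num_workers-1. */
uint32_t
worker_index (uint16_t port, uint32_t num_workers)
{
  return worker_index_unbounded (port, num_workers) % num_workers;
}
```

With 2 workers, port 0xffff maps to index 2 without the modulo and to index 0 with it.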
|
|
We need to cancel vrrp_vr_timer when deleting a vrrp vr.
Type: fix
Signed-off-by: luoyaozu <luoyaozu@chinatelecom.cn>
Change-Id: I8ea01f1943d6e3e60c4990c5be945de613bc8b53
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I18b9d0d67f5fe4c1714427259df29026153d8dd1
|
|
Type: improvement
If a tun/L3 interface is paired with a multipoint tunnel interface,
pass packets arriving from the host to ip[46]-lookup instead of
cross-connecting them to the tunnel interface. Adjacencies are used
to drive the rewrite for Multipoint tunnel interfaces, so the generic
adjacency used with a P2P tunnel will not work correctly.
Change-Id: I2d8be56dc5029760978c05bc4953f84c8924a412
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
|
|
Type: fix
Signed-off-by: Atzm Watanabe <atzmism@gmail.com>
Change-Id: I11b6107492004a45104857dc2dae01b9a5a01e3b
|
|
Type: feature
With this change, peers are able to roam between different external
endpoints. A successfully authenticated handshake or data packet
received from a new endpoint causes the peer's endpoint to be updated
accordingly.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: Ib4eb7dfa3403f3fb9e8bbe19ba6237c4960c764c
|
|
Type: feature
With this change, if a handshake message with both valid mac1 and mac2
is received while under load, the peer will be rate limited. Cover
this with tests.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: Id8d58bb293a7975c3d922c48b4948fd25e20af4b
|
|
Type: feature
Stats like those from:
https://datatracker.ietf.org/doc/html/draft-ietf-rtgwg-arp-yang-model-03#section-4
Signed-off-by: Neale Ranns <neale@graphiant.com>
Change-Id: Icb1bf4f6f7e6ccc2f44b0008d4774b61cae96184
|
|
Type: feature
With this change:
- if the number of received handshake messages exceeds the limit
calculated from the number of peers, the under-load state activates;
- if, while under load, a handshake message is received with a valid
mac1 but an invalid mac2, a cookie reply is sent.
Also, cover these with tests.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: I3003570a9cf807cfb0b5145b89a085455c30e717
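The second rule above can be sketched as a small decision function. The enum and function names are hypothetical, and dropping messages whose mac1 is invalid is an assumption that this message does not state. How the under-load limit is computed is not shown here.

```c
#include <assert.h>

/* Hypothetical action codes for an incoming handshake message. */
typedef enum
{
  HS_PROCESS,           /* continue with normal handshake processing */
  HS_SEND_COOKIE_REPLY, /* under load, valid mac1 but invalid mac2 */
  HS_DROP               /* invalid mac1 (assumed behaviour) */
} hs_action_t;

hs_action_t
handshake_action (int under_load, int mac1_valid, int mac2_valid)
{
  /* Flags are expected to be plain 0/1 booleans. */
  assert ((under_load | mac1_valid | mac2_valid) >> 1 == 0);
  if (!mac1_valid)
    return HS_DROP;
  if (under_load && !mac2_valid)
    return HS_SEND_COOKIE_REPLY;
  return HS_PROCESS;
}
```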
|
|
Type: fix
Signed-off-by: Atzm Watanabe <atzmism@gmail.com>
Change-Id: I065bd5c26055d863d786023970e7deeed261b31c
|
|
Type: feature
Change-Id: I0abbe925d6b9d3dd7196cd8beaf4f471beb45bd6
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Pointers may point to unexpected non-zero addresses if the vec is not
validated.
Type: fix
Change-Id: Ie4d3343d6734125b98e0dc962e33e0c7514da829
Signed-off-by: GaoChX <chiso.gao@gmail.com>
|
|
Type: feature
Currently, if a handshake message is sent and a cookie message is
received in reply, the cookie message is ignored. Thus, further
handshake messages will not have a valid mac2 and the handshake cannot
be completed.
With this change, process received cookie messages so that mac2 can be
calculated for further handshake messages sent. Cover this with
tests.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: I6d51459778b7145be7077badec479b2aa85960b9
|
|
If an API method is specified as "autoendian", it should use macros with
_END at the end.
Type: fix
Change-Id: I73b7b4f6996b30631c4355ace156ed0665c4b8ad
Signed-off-by: Stanislav Zaikin <zstaseg@gmail.com>
|
|
namespace is a keyword for c++ compilers
Type: fix
Change-Id: Ia8fc9ef1cc15fe9d0e40b3f543f9e8f411203b89
Signed-off-by: Stanislav Zaikin <zstaseg@gmail.com>
|
|
Type: fix
A user had trouble compiling C++ code to work with the linux-cp APIs
because some messages contain a field called namespace, which is a
reserved word in C++. We wish to rename those fields, so the affected
messages are being set to in_progress.
Change-Id: I3bd1dc898c146a9980161a562b2b453313bb58fd
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
|
|
Build vpp with the MLX DPDK PMD:
make DPDK_MLX4_PMD=y DPDK_MLX5_PMD=y DPDK_MLX5_COMMON_PMD=y build-release
With no-multi-seg in startup.conf,
Mellanox NIC init fails with the following message:
rte_eth_rx_queue_setup[port:2, errno:-12]: Unknown error -12
mlx5_net: port 2 Rx queue 0: Scatter offload is not configured and
no enough mbuf space(2176) to contain the maximum RX packet length(2065)
with head-room(128)
In Mellanox NIC PMD driver, 'di.max_rx_pktlen' is returned as 65536,
and 'di.max_mtu' is returned as 65535, which makes
the driver_frame_overhead logic not suitable for Mellanox NICs.
So skip the logic code if MAX_MTU is returned as 65535.
Type: fix
Fixes: 1cd0e5dd533f ("vnet: distinguish between max_frame_size and MTU")
Signed-off-by: Tianyu Li <tianyu.li@arm.com>
Change-Id: I027b76b8d07fb453015b8eebb36d160b4bc8df9c
|
|
Type: fix
Fixes: 5b4b4c0
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: If4bd8f30cd23d862109cab665251ad89804b1734
|
|
Included statistic bundles (all NODE type):
- Instructions and CPU cycles, including IPC
- Data cache access/refills/%
- Data TLB cache access/refills/%
- Instruction cache access/refills/%
- Instruction TLB cache access/refills/%
- Memory/Bus accesses, memory errors
- Branch (mis)predictions, architecturally & speculatively executed
- Processor frontend/backend stalls (stalled cycles)
Type: feature
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I7ea4a27c8df8fc7222b743a98bdceaff727e4112
|
|
This patch enables statistics from the Arm PMUv3 through the perfmon
plugin.
In comparison to using the Linux "perf" tool, it allows obtaining
direct, per node level statistics (rather than per thread). By accessing
the PMU counter registers directly from userspace, we can avoid the
overhead of using a read() system call and get more accurate and fine
grained statistics about the running of individual nodes.
A demo of perfmon on Arm can be found at:
https://asciinema.org/a/egVNN1OF7JEKHYmfl5bpDYxfF
*Important Note*
Perfmon on Arm is dependent on and works only on Linux kernel versions
of v5.17+ as this is when userspace access to Arm perf counters was
included.
On most Arm systems, a maximum of 7 PMU events can be configured at once
(6x PMU events + 1x CPU_CYCLE counter). If some perf counters are in
use elsewhere by other applications, and there are insufficient counters
remaining to open the bundle, the perf_event_open call will fail
(provided the events are grouped with the group_fd param, which perfmon
currently utilises).
See arm/events.h for a list of PMUv3 events available, although it is
implementation defined whether most events are implemented or not. Only
a small set of 7 events is required to be implemented in Armv8.0, with
some additional events required in later versions. As such, depending on
the implementation, some statistics may not be available. See Arm
Architecture Reference Manual for Armv8-A, D7.10.2 "The PMU event number
space and common events" for more information.
arm/events.c:arm_init() gets information from the sysfs about what
events are implemented on a particular CPU at runtime. Arm's
implementation of the perfmon source callback .bundle_support uses this
information to disable unsupported events in a bundle, or in the case
no events are supported, disable the entire bundle.
Where a particular event in a bundle is not implemented, the statistic
for that event is shown as '-' in the 'show perfmon statistics' cli
output, by disabling the column.
There is additional code in perfmon.c to only open events which are
marked as implemented. Since we're only opening and reading events that
are implemented, some extra logic is required in cli.c to re-align
either perfmon_node_stats_t or perfmon_reading_t with the column
headings configured in each bundle, taking into account disabled
columns.
Userspace access to perf counters is disabled by default, and needs to
be enabled with 'sudo sysctl kernel/perf_user_access=1'.
There is a check built into the Arm event source init function
(arm/events.c:arm_init) to check that userspace reading of perf counters
is enabled in the /proc/sys/kernel/perf_user_access file.
If the above file does not exist, it means the kernel version is
unsupported. Users without a supported kernel will see a warning
message, and no Arm bundles will be registered to use in perfmon.
Enabling/using plugin:
- include the following in startup.conf:
- plugins { plugin perfmon_plugin.so { enable } }
- 'show perfmon bundle [verbose]' - show available statistics bundles
- 'perfmon start bundle <bundle-name>' - enable and start logging
- 'perfmon stop' - stop logging
- 'show perfmon statistics' - show output
For a general guide on using and understanding Arm PMUv3 events, see
https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/arm-neoverse-n1-performance-analysis-methodology
Type: feature
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I0620fe5b1bbe78842dfb1d0b6a060bb99e777651
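The perf_user_access check described in this message can be approximated in plain C as below. This is a sketch, not the code in arm/events.c: the function names are hypothetical, and only the file path and the meaning of the value 1 are taken from the message.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a sysctl-style file.
   Returns the value, or -1 if the file is missing or unparsable. */
int
read_int_file (const char *path)
{
  FILE *f;
  int value = -1;
  assert (path != NULL);
  f = fopen (path, "r");
  if (f == NULL)
    return -1;
  if (fscanf (f, "%d", &value) != 1)
    value = -1;
  fclose (f);
  return value;
}

/* 1 if userspace counter access is enabled, 0 if it is disabled,
   -1 if the kernel does not expose the setting (unsupported kernel). */
int
perf_user_access_enabled (void)
{
  int v = read_int_file ("/proc/sys/kernel/perf_user_access");
  return v < 0 ? -1 : (v == 1);
}
```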
|
|
In preparation for enabling perfmon on Arm platforms, move some Intel
/arch specific logic into the /intel directory and update the CMake to
split the common code from arch specific files.
Since the dispatch_wrapper code is very different on Arm/Intel,
each arch can provide their own implementation + conduct any additional
arch specific config e.g. on Intel, all indexes from the mmap pages are
cached. The new method intel_config_dispatch_wrapper conducts this
config and returns a pointer to the dispatch wrapper to use.
Similarly, is_bundle_supported() looks very different on Arm/Intel, so
each implementation is to provide their own arch specific checks.
Two new callbacks/function ptrs are added in PERFMON_REGISTER_SOURCE to
support this - .bundle_support and .config_dispatch_wrapper.
Type: refactor
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Change-Id: Idd121ddcfd1cc80a57c949cecd64eb2db0ac8be3
|
|
Type: fix
Signed-off-by: Artem Glazychev <artem.glazychev@xored.com>
Change-Id: I62f13ee8cb9b86f8106505fd32a03d66c1a73bce
|
|
Type: improvement
Enable use of 4th gen QAT devices. Will be available on Sapphire Rapids.
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
Change-Id: I89e7d29e10ecb4c36c700ff5e017796161ec6c5e
|
|
0 is not NULL (at least not in all cases). Passing 0 into a variadic
function in a place where the consumer reads it as a pointer might
leave parts of the pointer uninitialized and hence filled with random
data.
It seems that this used to work with gcc, but clang seems to treat the
0 in those places as a 32bit integer.
Type: fix
Signed-off-by: Ivan Shvedunov <ivan4th@gmail.com>
Signed-off-by: Andreas Schultz <andreas.schultz@travelping.com>
Change-Id: I37d975eef5a1ad98fbfb65ebe47d73458aafea00
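A minimal sketch of the safe calling pattern, using a hypothetical NULL-terminated variadic function. A bare 0 in the terminator position is passed as an int, so the callee's va_arg read of a pointer is not guaranteed to see a full-width null pointer. Casting the terminator avoids that.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Count the strings passed before the terminating null pointer.
   Hypothetical function for illustration. */
int
count_strings (const char *first, ...)
{
  va_list ap;
  int n = 0;
  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, const char *))
    n++;
  va_end (ap);
  return n;
}

/* Correct call form: the terminator is an explicit pointer, so a
   pointer-sized null value is passed to the variadic function. */
void
count_strings_selfcheck (void)
{
  assert (count_strings ("a", "b", (const char *) NULL) == 2);
}
```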
|
|
Here is an example of the bug:
vpp# create loopback interface
loop0
vpp# vrrp vr add loop0 vr_id 1 priority 100 192.168.1.1 192.168.1.2
vpp# vrrp vr del loop0 vr_id 1
vpp# vrrp vr add loop0 vr_id 1 priority 100 192.168.1.1 192.168.1.2
vrrp vr add: vrrp_vr_add_del returned -105
Type: fix
Signed-off-by: GaoChX <chiso.gao@gmail.com>
Change-Id: I3e0d086ac8fb52756339cff19b9a83911ec9748b
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I7afc6116ca9a609992f26d9e78084732bba1b2ea
|
|
This patch adds performance and functional tests for ip4
outbound traffic policy matching.
The test setup is configurable in startup.conf and through the test
parameters. Cache, fast path and fast path burst mode can be enabled
and disabled, and performance for different lookup setups can be
measured.
Type: feature
Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com>
Change-Id: I1d04d196e412f47f43b7e5cbd46607bf6a9cc40e
|
|
If svm_fifo_enqueue() returns -4 while the rx_fifo grows, a type
conversion occurs in stream_data->app_rx_data_len += rlen, and the
stream->recvstate.data_off calculation ends up wrong.
Type: fix
Signed-off-by: fanxb <fxb_mail@163.com>
Change-Id: Iae11f0c453f32d836f4148d70e3b121545a53a90
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I9c502a491ff56806a2e631f7a4c18903a2e93ab2
|
|
Type: improvement
When packets were received and processed successfully, increment the
byte/packet counters for the tunnel interface.
Change-Id: I42855607ac6916de641be42aac86c9942cc97140
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
|
|
Type: fix
Currently, neighbor adjacencies on a wg interface are converted into a
midchain only if one of the peers has a matching allowed prefix
configured. If a route is created that goes through a wg interface but
the next-hop address does not match any allowed prefix, VPP will try to
send an ARP/ND request via the wg interface to resolve the next-hop
address when matching traffic occurs. Sending an ARP request will cause
VPP to crash while copying the hardware address of the wg interface,
which is NULL. Sending an ND message will not cause VPP to crash, but
the error logged will be unclear (no source address).
With this fix, convert all neighbor adjacencies on a wg interface into a
midchain and update tests to cover the case. If there is no matching
allowed prefix configured, traffic going via such routes will be dropped
because of "Peer error". Nothing changes if a matching allowed prefix is
configured.
Also, fix getting peer by adjacency index.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Change-Id: I15bc1e1f83de719e97edf3f7210a5359a35bddbd
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: Ia66c12e1da126d0d8d101b645e6dc8454c3826d6
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: Ic68627bbca676cc78b0be05bc1fa0f386f5d27fa
|