Age | Commit message (Collapse) | Author | Files | Lines |
|
~5 clocks/packet improvement...
Change-Id: I1a78fa24dcd1b3ab7f45e10b9ded50f79517114a
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This is ~50% improvement in buffer alloc performance.
For a 256 buffer allocation, it was ~10 clocks/buffer, now is < 5 clocks.
Change-Id: I97590e240a79a42bcab5eb26587fc2d11e6eb163
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ib121b24935d5c706cfba6e4b6d321086a38cad91
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- implement a 1us purgatory for the session structures
by adding a special connection list, where all connections
about to be deleted go.
- add per-list-head timeouts updated upon the list enqueue/dequeue
for connection idle management
- add a "unused" session list with list ID#0, which should
never be used unless there is a logic error. Use this ID
to initialize the sessions.
- improve the maintainability of the session linked list
structures by using symbolic bogus index name instead of ~0
- change the ordering of session creations - first reverse, then
local. To minimize the potential for two workers competing for
the same session in the corner case of the two packets
on different workers creating the same logical session
- reduce the maximum session count to keep the memory usage the same
- add extra log/debug/trace to session cleaning logic
- be more aggressive with cleaning up sessions - wind up the
interrupts from the workers to themselves if there is more
work to do
Change-Id: I3aa1c91a925a08e83793467cb15bda178c21e426
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Prior to the change, dpdk plugin assumes xd->device_index is
used both as index for internal dpdk_main->devices array
and DPDK port index to call into DPDK APIs.
However, when running on top of Failsafe PMDs,
DPDK port index range may no longer be contiguous (as noted:
http://dpdk.org/ml/archives/dev/2018-March/092375.html
for related changes in DPDK). Because this, dpdk plugin can
no longer iterate through all available DPDK ports
with a for 0->rte_eth_dev_count() loop and the assumption of
device_index no longer holds.
This is part of initial effort to enable vpp running over
dpdk on failsafe PMD in Microsoft Azure(3/4).
Change-Id: I416fd80f2d40e12e139f8f3492814da98343eae7
Signed-off-by: Rui Cai <rucai@microsoft.com>
|
|
Change-Id: If1ef2d4bc6f90a4d4b6a345c63723117834c6504
Signed-off-by: Ping Yu <ping.yu@intel.com>
|
|
This fixes some compilation warnings with clang on AArch64.
Change-Id: Idb941944e3f199f483c80e143a9e5163a031c4aa
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
port_id be used for dpdk port_id
Change-Id: Ia7d8cdc5dec2ad658c11f9c0f3ef8005a470ac3c
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This breaks VFIO operation.
This reverts commit d3b3baa4f8e9e4d95264aff16fe85434ef8061bd.
Change-Id: I2482e0da2d1ebfc365d13668c4b992b040f561b4
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ibab5e27277f618ceb2d543b9d6a1a5f191e7d1db
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
1、Adding PMD type for support Cavium LiquidIO II CN23XX NIC;
2、Our company is using VPP + DPDK +Cavium LiquidIO II CN23XX NIC,
Unfortunately, the latest VPP code does not support
Cavium LiquidIO II CN23XX pci.
So I increased the PMD type to support LiquidIO NIC,
and can run normally, we most subsequent projects are
based on VPP + DPDK + Cavium LiquidIO II CN23XX NIC model,
so I hope VPP team can adopt this requirement, thanks a lot.
Change-Id: I604ae444d69b37c2e26962bfe4ccdfe983b75041
Signed-off-by: chuhong yao <ych@panath.cn>
|
|
- Currently mempool priv size is getting initialized after releasing buffers
to pool. This is causing mismatch in expected & real metadata size value
and buffers are getting released with wrong offset. (when metadata offset
is in use for a given platform)
- Since private data size is 0 initially, metadata size don't include space
for VLIB_BUFFER_HDR.
Change-Id: I780c4d518104631a3dcf192185bacf58b3598e65
Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
|
|
Change-Id: I088163f10ae5515d7a9115781cc13ef563fafed5
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Change-Id: I9343672c5765a5a4cb56c99fa5de176ddcac62c7
Signed-off-by: Andrey "Zed" Zaikin <zed.0xff@gmail.com>
|
|
next nodes
Use the new frame-at-once functions vlib_get_buffers() and vlib_buffer_enqueue_to_next()
to calculate the buffer pointers and to dispatch the packets after the processing.
This simplifies the dataplane node processing loop.
Change-Id: I454308f847aac76a199f8dd7490c1e176414bde7
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
- Fix the issue where eal iova mode is Virtual Address (RTE_IOVA_VA) but
setting DMA iova address to Physical address value always.
Change-Id: Ib1e9c1596d95885c7eff11723338121627203e61
Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
|
|
clib_bihash_search_40_8 for session lookups
Use inline version rather than calling the function, this gives slightly better performance.
The straighforward diff uncovered an interesting problem: the stateful ACL IPv4 unit tests would fail
for the "make test" but succeed in "make test-debug". Also, they would succeed even in "make test",
if before calling the clib_bihash_search_inline_2_40_8 we would change the code
to store the key in a temporary variable.
Debugging revealed that the generated optimized code is not what one would expect:
the zeroing of the u64s overlaying the memcpy into ipv4 value of ip46_address_t
made the optimizer not notice the latter, and think that those fields should be
always zero in the bihash, thus generating incorrect assembly for the bihash key
comparison for the ipv4 nodes.
Changing the zeroing to be non-overlapping by zeroing only the pad fields resulted
in the optimizer generating the correct code and the tests pass.
Change-Id: Ib0f55cef2b5fe70c931d17ca4dc32a5755d160cd
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
the ip4-dhcp-client-detect feature MUST run prior to nat44-out2in, or
inbound dhcp broadcast packets will be dropped. Certain dhcp servers
answer lease renewal dhcp-request packets with broadcast dhcp-acks, leading
to unrecoverable lease loss.
In detail, this constraint:
VNET_FEATURE_INIT (ip4_snat_out2in, static) = {
.arc_name = "ip4-unicast",
.node_name = "nat44-out2in",
.runs_after = VNET_FEATURES ("acl-plugin-in-ip4-fa"),
};
doesn't get the job done:
ip4-unicast:
[17] nat44-out2in
[23] ip4-dhcp-client-detect
[26] ip4-not-enabled
Add a proper constraint:
VNET_FEATURE_INIT (ip4_snat_out2in, static) = {
.arc_name = "ip4-unicast",
.node_name = "nat44-out2in",
.runs_after = VNET_FEATURES ("acl-plugin-in-ip4-fa",
"ip4-dhcp-client-detect"),
};
and the interface feature order is OK, at least in this regard:
ip4-unicast:
[17] ip4-dhcp-client-detect
[18] nat44-out2in
[26] ip4-not-enabled
We need to carefully audit (especially) the ip4-unicast feature arc,
which has [gasp] 37 features on it!
Change-Id: I5e749ead7ab2a25d80839a331de6261e112977ad
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
The option no-multi-seg doesn't take effect for RX since MTU
which is too large is passed to DPDK lib, Which causes PMDs
are running XXX_scattered_rx function. The patch fixes the issue.
Change-Id: I91a6fb23fd118e872c8a52a6c35c36a86cb2c02b
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
per-packet session key
Using a separate session key has proven to be tricky for the following reasons:
- it's a lot of storage to have what looks to be nearly identical to 5tuple,
just maybe with some fields swapped
- shuffling the fields from 5tuple adds to memory pressure
- the fact that the fields do not coincide with the packet memory
means for any staged processing we need to use up a lot of memory
Thus, just add two entries into the bihash table pointing to
the same session entry, so we could match the packets from either
direction.
With this we have the key layout of L3 info (which takes up
the majority of space for IPv6 case) the same as in the packet,
thus, opening up the possibility for other optimizations.
Not having to create and store a separate session key
should also give us a small performance win in itself.
Also, add the routine to show the session bihash in a better
way than a bunch of numbers.
Alas, the memory usage in the bihash obviously doubles.
Change-Id: I8fd2ed4714ad7fc447c4fa224d209bc0b736b371
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Should cost at most 1 clock per frame when not enabled.
Add "pcap rx trace..." debug CLI, refactored "pcap tx trace" debug CLI
to avoid duplicating code.
Change-Id: I19ac75d1cf94a6a24c98facbf0753381d37963ea
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
Change-Id: I0fe60a639c7589dc842d85db092c81c1a7441cb7
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
|
|
- hash is great. But it is a bit too slow for the DP. Use direct array indexing
to quickly retrieve the slave interface.
- the algorithm used by flow hash is great. But it is a bit too slow for the DP.
Use l2_hash_hash() extracted from lb_hash.h which ECMP is using. It makes use
of intrinsic crc32 instruction set.
- shortcut modulo arithmetic when the operand is 2**x (where x up to 4) to
avoid division instruction.
- special case for link count == 1 in bond_tx_fn()
- use clib_mem_unaligned to access data for the packet to avoid alignment error
- Fix some typos for packet tracing.
Change-Id: I8eae3ad497061c5473aa675ba894ee0211120d25
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I7a4133c59ff45b0744b48e246a049d9f015026fc
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Change-Id: Ic9f98c022e32715af395c9ed618589434eb0e526
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: Ic8c5b527395fc99f1e1a72e51f8d41c9b4f415df
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
|
|
This commit splits the functions from fa_node.c
into the pure dataplane node functions (which are multiarch-compiled),
session management node functions (which are compiled only once),
and session find/add/delete functions which are split out into the inlines.
As part of the refactoring:
- get rid of BV() macros in the affected chunk of code,
rather use the explicit bihash function names.
- add the magic trailer to the new files to
ensure make checkstyle watches them.
- move the bihash_template.c include for 40_8 bihash into acl.c
Change-Id: I4d781e9ec4307ea84e92af93c09470ea2bd0c375
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Replace hash with a vector to improve performance.
Plus other minor performance improvements.
Change-Id: I3f0ebd909782ce3727f6360ce5ff5ddd131f8574
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
|
|
when flows are enabled on the device
Change-Id: I971764988d5a9e7078468f627205b3fa60736263
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: I1042c0fe179b57a00ce99c8d62cb1bdbe24d9184
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Add support of NAT66
Change-Id: Ie6aa79078a3835f989829b9a597c448dfd2f9ea3
Signed-off-by: Hongjun Ni <hongjun.ni@intel.com>
|
|
Change-Id: Ib3fcc3ceb7f315389bcdecbb7d9632540a5dd6ba
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I4b6577b496c56f27f07dd0066fcfdfd0cebb6f1a
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: I484d79000c1bbd87ff83847cf567bf3414a719d3
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
During dpdk_lib_init, it calculates MRU and MTU and later calls
rte_eth_dev_set_mtu with calculated MTU value. However, dpdk_device_setup
calls rte_eth_dev_set_mtu with hi->max_packet_bytes, which is set to be
MRU value in dpdk_lib_init earlier.
Most of the time, MRU != MTU in dpdk_lib_init and it looks like
hi->max_packet_bytes is treated as MTU in other parts of vpp codebase.
Therefore, dpdk_lib_init should be consistent and use MTU instead of MRU
for hi->max_packet_bytes.
Change-Id: I23ff2a6cd45d6bc819b6f64d5f0fc0490b8a44de
Signed-off-by: Rui Cai <rucai@microsoft.com>
|
|
- The dpdk plugin always looks for libnuma library during compilation.
For non-numa aware platforms compilation breaks, if third party
libnuma lib is not available.
- Issue is more severe with Cross Compilation scenario where one has to
download and cross compile libnuma-dev package even when target platofrom
is NUMA disabled.
Like when cross compiling for ARM platforms, Linaro tool-chain doesn't have
libnuma by default.
Change-Id: Ib85b3188b787c23ba33b47e3f6123c74fd37190e
Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
|
|
Change-Id: Id25b447bddccb7b321123e4abc4134e7261a0807
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Instead of repeatedly cutting, pasting, and hacking to create a new
callback, use vnet_flow_rewrite_generic_callback(). Add three
arguments to the flow rewrite callback:
(in) pointer to an array of report elements,
(in) length of array,
(out) pointer to the stream index
Change existing code prototypes. Code owners encouraged to evaluate
whether they can use the generic callback or not, at leisure.
/* ipfix field definitions for a particular report */
typedef struct
{
u32 info_element;
u32 size;
} ipfix_report_element_t;
Best generated like so:
_(sourceIPv4Address, 4) \
_(destinationIPv4Address, 4) \
_(sourceTransportPort, 2) \
_(destinationTransportPort, 2) \
_(protocolIdentifier, 1) \
_(flowStartMicroseconds, 8) \
_(flowEndMicroseconds, 8)
static ipfix_report_element_t simple_report_elements[] = {
foreach_simple_report_ipfix_element
};
...
/* Set up the ipfix report */
memset (&a, 0, sizeof (a));
a.is_add = 1 /* to enable the report */ ;
a.domain_id = 1 /* pick a domain ID */ ;
a.src_port = UDP_DST_PORT_ipfix /* src port for reports */ ;
a.rewrite_callback = vnet_flow_rewrite_generic_callback;
a.report_elements = simple_report_elements;
a.n_report_elements = ARRAY_LEN (simple_report_elements);
a.stream_indexp = &jim->stream_index;
a.flow_data_callback = simple_flow_data_callback;
/* Create the report */
rv = vnet_flow_report_add_del (frm, &a, &template_id);
if (rv)
return rv;
...
Change-Id: If6131e6821d3a37a29269c0d58040cdf18ff05e4
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Adding name, enum constants and formatting code
for failsafe PMD.
This is part of initial effort to enable vpp running over
dpdk on failsafe PMD in Microsoft Azure(2/4).
Change-Id: I4eb0093db9f666e2635f7ddff451e3c9064bd0c4
Signed-off-by: Rui Cai <rucai@microsoft.com>
|
|
When port_type_from_speed_capa() is called before the port link update isn't completed,
xd->port_type becomes VNET_DPDK_PORT_TYPE_UNKNOWN. This happens with Mellanox NIC
without lsc interrupt. Calling rte_eth_link_get before getting dev_info will ensure
the link state is up-to-date.
Change-Id: I83a59654778eb4bf0c65a4a4e225a326227b9641
Signed-off-by: Steve Shin <jonshin@cisco.com>
|
|
It is much cheaper to use ctzll than to do shift,subtract and mask
in likely case when we are looking for 1st set bit in the uword.
Change-Id: I31954081571978878c7098bafad0c85a91755fa2
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I6306b81e0e1c3e1c591f929a76bb265c1c1d0859
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Change-Id: Ibea4a96bdec5e368301a03d8b11a0712fa0265e0
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
The pointer to IP header was derived from l3_hdr_offset,
which would be ok, if l3_hdr_offset was valid. But it does not
have to be, so it was a bad solution. Now the previous nodes
mark whether it is a IPv6 or IPv4 packet tyle, and in esp_decrypt
we count get ip header pointer by substracting the size
of the ip header from the pointer to esp header (which lies
in front of the ip header).
Change-Id: I6d425b90931053711e8ce9126811b77ae6002a16
Signed-off-by: Szymon Sliwa <szs@semihalf.com>
|
|
Change-Id: I921465ea64b59d42674cc8f19069ed04e3b25026
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: I3669068f694614f8555b33bf0b703c41e45363ef
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ifea9c772e8784642433b92091f5769eb9ec06890
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I387b22427b3f322969bcf32fcfc189123c8ed6ae
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: Iba1cc1179ee80478e29888790a6476571d1904dc
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Change-Id: I16557189aa4a763ec496cb4a45f6e12f2d46971f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|