Age | Commit message (Collapse) | Author | Files | Lines |
|
Gain is around 6 clocks per packet (22 to 16).
Change-Id: Ia6f4293ea9062368a9a6b235c650591dbc0707d0
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Old code ~25 clocks/packet, new ~10.
Change-Id: I202cd6cbafb1ab2296939634d674f7ffd28253fc
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
clib_bihash_search_40_8 for session lookups
Use inline version rather than calling the function, this gives slightly better performance.
The straighforward diff uncovered an interesting problem: the stateful ACL IPv4 unit tests would fail
for the "make test" but succeed in "make test-debug". Also, they would succeed even in "make test",
if before calling the clib_bihash_search_inline_2_40_8 we would change the code
to store the key in a temporary variable.
Debugging revealed that the generated optimized code is not what one would expect:
the zeroing of the u64s overlaying the memcpy into ipv4 value of ip46_address_t
made the optimizer not notice the latter, and think that those fields should be
always zero in the bihash, thus generating incorrect assembly for the bihash key
comparison for the ipv4 nodes.
Changing the zeroing to be non-overlapping by zeroing only the pad fields resulted
in the optimizer generating the correct code and the tests pass.
Change-Id: Ib0f55cef2b5fe70c931d17ca4dc32a5755d160cd
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
the ip4-dhcp-client-detect feature MUST run prior to nat44-out2in, or
inbound dhcp broadcast packets will be dropped. Certain dhcp servers
answer lease renewal dhcp-request packets with broadcast dhcp-acks, leading
to unrecoverable lease loss.
In detail, this constraint:
VNET_FEATURE_INIT (ip4_snat_out2in, static) = {
.arc_name = "ip4-unicast",
.node_name = "nat44-out2in",
.runs_after = VNET_FEATURES ("acl-plugin-in-ip4-fa"),
};
doesn't get the job done:
ip4-unicast:
[17] nat44-out2in
[23] ip4-dhcp-client-detect
[26] ip4-not-enabled
Add a proper constraint:
VNET_FEATURE_INIT (ip4_snat_out2in, static) = {
.arc_name = "ip4-unicast",
.node_name = "nat44-out2in",
.runs_after = VNET_FEATURES ("acl-plugin-in-ip4-fa",
"ip4-dhcp-client-detect"),
};
and the interface feature order is OK, at least in this regard:
ip4-unicast:
[17] ip4-dhcp-client-detect
[18] nat44-out2in
[26] ip4-not-enabled
We need to carefully audit (especially) the ip4-unicast feature arc,
which has [gasp] 37 features on it!
Change-Id: I5e749ead7ab2a25d80839a331de6261e112977ad
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Ic1fab1f3aba92bbdbfd281459562d1f9697ab465
Signed-off-by: John Lo <loj@cisco.com>
|
|
The option no-multi-seg doesn't take effect for RX since MTU
which is too large is passed to DPDK lib, Which causes PMDs
are running XXX_scattered_rx function. The patch fixes the issue.
Change-Id: I91a6fb23fd118e872c8a52a6c35c36a86cb2c02b
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
- fix newreno cwnd computation
- reset snd_una_max on entering recovery
- accept acks beyond snd_nxt but less than snd_congestion when in
recovery
- avoid entering fast recovery multiple times when using sacks
- avoid as much as possible sending small segments when doing fast
retransmit
- more event logging
Change-Id: I19dd151d7704e39d4eae06de3a26f5e124875366
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
per-packet session key
Using a separate session key has proven to be tricky for the following reasons:
- it's a lot of storage to have what looks to be nearly identical to 5tuple,
just maybe with some fields swapped
- shuffling the fields from 5tuple adds to memory pressure
- the fact that the fields do not coincide with the packet memory
means for any staged processing we need to use up a lot of memory
Thus, just add two entries into the bihash table pointing to
the same session entry, so we could match the packets from either
direction.
With this we have the key layout of L3 info (which takes up
the majority of space for IPv6 case) the same as in the packet,
thus, opening up the possibility for other optimizations.
Not having to create and store a separate session key
should also give us a small performance win in itself.
Also, add the routine to show the session bihash in a better
way than a bunch of numbers.
Alas, the memory usage in the bihash obviously doubles.
Change-Id: I8fd2ed4714ad7fc447c4fa224d209bc0b736b371
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Should cost at most 1 clock per frame when not enabled.
Add "pcap rx trace..." debug CLI, refactored "pcap tx trace" debug CLI
to avoid duplicating code.
Change-Id: I19ac75d1cf94a6a24c98facbf0753381d37963ea
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
Replace clib_warning with vlib_log_warn
Change-Id: I6d0b8d97048b75f4418609264af0c14e19fad79b
Signed-off-by: Juraj Sloboda <jsloboda@cisco.com>
|
|
Thanks to Ning Li <muziding001@163.com> for reporting.
Change-Id: I758bc6760ec5a9ec688172bc162a1873f96ab4f3
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I0fe60a639c7589dc842d85db092c81c1a7441cb7
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
|
|
- hash is great. But it is a bit too slow for the DP. Use direct array indexing
to quickly retrieve the slave interface.
- the algorithm used by flow hash is great. But it is a bit too slow for the DP.
Use l2_hash_hash() extracted from lb_hash.h which ECMP is using. It makes use
of intrinsic crc32 instruction set.
- shortcut modulo arithmetic when the operand is 2**x (where x up to 4) to
avoid division instruction.
- special case for link count == 1 in bond_tx_fn()
- use clib_mem_unaligned to access data for the packet to avoid alignment error
- Fix some typos for packet tracing.
Change-Id: I8eae3ad497061c5473aa675ba894ee0211120d25
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I8335ebf266becf2f42bb3f28a17dfed8d9b08f97
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
bihash_48_8 case:
Scalar code: 6 clocks
SSE4.2 code: 3 clocks
AVX2 code: 2.27 clocks
AVX512 code: 1.5 clocks
Change-Id: I40700175835a1e7321276e47eadbf9771d3c5a68
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Add support for either copying TOS/TC from inner packet to outer,
or set to fixed value.
Change-Id: I716a95f875349acec94317b266c8cf9f2f81a785
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Broken for years. Duh.
Change-Id: Ie5fb8e802f143aacd3301c45b136b24a8d4f6d74
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I8cbd1eac80ae4aeb173d02786e9ccf3b4877304d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Fix GRE/IPv6 setting of ip->payload_length (which has never worked).
Change-Id: Ie68f1cc7bbb70489d6ec97356132c783f2345e1e
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Change-Id: I7a4133c59ff45b0744b48e246a049d9f015026fc
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Note: The Python, Java and C/C++ bindings must be updated before ip/ip_types.api can be used.
ip_types.api:
typedef ip4_address {
u8 address[4];
};
typedef ip6_address {
u8 address[16];
};
enum address_family {
ADDRESS_IP4 = 0,
ADDRESS_IP6,
};
union address_union {
vl_api_ip4_address_t ip4;
vl_api_ip6_address_t ip6;
};
typedef address {
vl_api_address_family_t af;
vl_api_address_union_t un;
};
Change-Id: I22f67092f24db5bd650a03c6f446a84cd9fd1074
Signed-off-by: Ole Troan <ot@cisco.com>
|
|
Change-Id: Ic9f98c022e32715af395c9ed618589434eb0e526
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: I6615bb612bcc3f795b5f822ea55209bb30ef35b5
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ic8c5b527395fc99f1e1a72e51f8d41c9b4f415df
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
|
|
This commit splits the functions from fa_node.c
into the pure dataplane node functions (which are multiarch-compiled),
session management node functions (which are compiled only once),
and session find/add/delete functions which are split out into the inlines.
As part of the refactoring:
- get rid of BV() macros in the affected chunk of code,
rather use the explicit bihash function names.
- add the magic trailer to the new files to
ensure make checkstyle watches them.
- move the bihash_template.c include for 40_8 bihash into acl.c
Change-Id: I4d781e9ec4307ea84e92af93c09470ea2bd0c375
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Change-Id: I56782652d8ef10304900cc293cfc0502689d800e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Replace hash with a vector to improve performance.
Plus other minor performance improvements.
Change-Id: I3f0ebd909782ce3727f6360ce5ff5ddd131f8574
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
|
|
Change-Id: I37705fb572045f42be4c2dabbd8460c8f8872167
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
when flows are enabled on the device
Change-Id: I971764988d5a9e7078468f627205b3fa60736263
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Remove functions which have native C equivalent (i.e. _is_equal can be
replaced with ==, _add with +)
Add SSE4.2, AVX-512 implementations of splat, load_unaligned, store_unaligned,
is_all_zero, is_equal, is_all_equal
Change-Id: Ie80b0e482e7a76248ad79399c2576468532354cd
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I6d1218c17ee055275596b9a49767f15994aa1b2b
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
This fixes ARM64 build where we dont have defined u16x8_msb_mask(...)
Change-Id: I864f5134a0d951601810c800f587d173b3b7ef41
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I6d8e1351e088728f7363550a0fc117256cae2841
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I4f245fd225bcc563fafee2696cd039477d661c57
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
Change-Id: I1042c0fe179b57a00ce99c8d62cb1bdbe24d9184
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: If01400e3434b25b2da36ba28ceb8444b216d0e38
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
Change-Id: I44278dea2ee1daa147b0928bfe26e861907a209f
Signed-off-by: Jon Loeliger <jdl@netgate.com>
|
|
Change-Id: I8c64a0d2f757d96ffa7fd042c23b0d814217c215
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
Add a session process node that handles main thread tx and retransmit in
order to avoid having a polling input node.
Change-Id: I3357e987c023a84b533b32793e37ab4204420f64
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Idff55a19d27fed0d57e222f38d2e16c5367911cb
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
Add support of NAT66
Change-Id: Ie6aa79078a3835f989829b9a597c448dfd2f9ea3
Signed-off-by: Hongjun Ni <hongjun.ni@intel.com>
|
|
Change-Id: Ib3fcc3ceb7f315389bcdecbb7d9632540a5dd6ba
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I238106c2afc46904fb0eb17164f30dbd1378892e
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
vnet_feature_enable_disable_with_index() checks the
return status of vnet_config_{add,del}_feature().
If the config string heap index returned is the same
index that was in use prior to the add/delete, it is
concluded that a failure occurred and processing of
the feature stops.
Sometimes the config index that is returned
can legitimately be the same index that was in used
before the add/delete. The old list of features can
have its heap entry deallocated before a new entry for
the new list is allocated. The heap entry for the new
list can be the entry that was deallocated while
deleting the old one.
Make vnet_config_{add,del}_feature() return ~0 on
failure. Look for that return value as an indication
that an error occurred in
vnet_enable_disable_feature_by_index().
Change-Id: I88bb3ff88a76971c1b5e5ece74784ce8ba78373c
Signed-off-by: Matthew Smith <mgsmith@netgate.com>
|
|
Add default route to the VRF table in which the interface is bound.
Add missing pool_put.
Change-Id: Id76c7dbfbf9bcf18357f372f3eee9b931df1995e
Signed-off-by: Juraj Sloboda <jsloboda@cisco.com>
|
|
Change-Id: I4b6577b496c56f27f07dd0066fcfdfd0cebb6f1a
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: I484d79000c1bbd87ff83847cf567bf3414a719d3
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Change-Id: I9ede6bc861350c7d9e78fa4d96cd584c2816d06f
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Set vnet_buffer2(b0)->pg_replay_timestamp, for use when desired.
Fix a memory leak in pg_stream_free(...), which wasn't freeing the
replay packet templates.
Change-Id: I01822a9e91a52de4774d2b95cf0c2ee254a915e9
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
During dpdk_lib_init, it calculates MRU and MTU and later calls
rte_eth_dev_set_mtu with calculated MTU value. However, dpdk_device_setup
calls rte_eth_dev_set_mtu with hi->max_packet_bytes, which is set to be
MRU value in dpdk_lib_init earlier.
Most of the time, MRU != MTU in dpdk_lib_init and it looks like
hi->max_packet_bytes is treated as MTU in other parts of vpp codebase.
Therefore, dpdk_lib_init should be consistent and use MTU instead of MRU
for hi->max_packet_bytes.
Change-Id: I23ff2a6cd45d6bc819b6f64d5f0fc0490b8a44de
Signed-off-by: Rui Cai <rucai@microsoft.com>
|