aboutsummaryrefslogtreecommitdiffstats
path: root/src/vppinfra/atomics.h
diff options
context:
space:
mode:
authorGovindarajan Mohandoss <govindarajan.mohandoss@arm.com>2021-03-19 19:20:49 +0000
committerDamjan Marion <dmarion@me.com>2021-10-12 16:43:18 +0000
commit6d7dfcbfa4bc05f1308fc677f19ade44ea699da1 (patch)
treeeb17ffe94db34644ccfb870732a8c6e3d6ba58b7 /src/vppinfra/atomics.h
parentd9e9870dd941bfb826530815e3196ced0b544b5d (diff)
ipsec: Performance improvement of ipsec4_output_node using flow cache
Adding flow cache support to improve outbound IPv4/IPSec SPD lookup performance. Details about flow cache: Mechanism: 1. First packet of a flow will undergo linear search in SPD table. Once a policy match is found, a new entry will be added into the flow cache. From 2nd packet onwards, the policy lookup will happen in flow cache. 2. The flow cache is implemented using bihash without collision handling. This will avoid the logic to age out or recycle the old flows in flow cache. Whenever a collision occurs, old entry will be overwritten by the new entry. Worst case is when all the 256 packets in a batch result in collision and fall back to linear search. Average and best case will be O(1). 3. The size of flow cache is fixed and decided based on the number of flows to be supported. The default is set to 1 million flows. This can be made as a configurable option as a next step. 4. Whenever a SPD rule is added/deleted by the control plane, the flow cache entries will be completely deleted (reset) in the control plane. The assumption here is that SPD rule add/del is not a frequent operation from control plane. Flow cache reset is done, by putting the data plane in fall back mode, to bypass flow cache and do linear search till the SPD rule add/delete operation is complete. Once the rule is successfully added/deleted, the data plane will be allowed to make use of the flow cache. The flow cache will be reset only after flushing out the inflight packets from all the worker cores using vlib_worker_wait_one_loop(). Details about bihash usage: 1. A new bihash template (16_8) is added to support IPv4 5 tuple. BIHASH_KVP_PER_PAGE and BIHASH_KVP_AT_BUCKET_LEVEL are set to 1 in the new template. It means only one KVP is supported per bucket. 2. Collision handling is avoided by calling BV (clib_bihash_add_or_overwrite_stale) function. Through the stale callback function pointer, the KVP entry will be overwritten during collision. 3. Flow cache reset is done using BV (clib_bihash_foreach_key_value_pair) function. Through the callback function pointer, the KVP value is reset to ~0ULL. MRR performance numbers with 1 core, 1 ESP Tunnel, null-encrypt, 64B for different SPD policy matching indices: SPD Policy index : 1 10 100 1000 Throughput : MPPS/MPPS MPPS/MPPS MPPS/MPPS KPPS/MPPS (Baseline/Optimized) ARM Neoverse N1 : 5.2/4.84 4.55/4.84 2.11/4.84 329.5/4.84 ARM TX2 : 2.81/2.6 2.51/2.6 1.27/2.6 176.62/2.6 INTEL SKX : 4.93/4.48 4.29/4.46 2.05/4.48 336.79/4.47 Next Steps: Following can be made as a configurable option through startup conf at IPSec level: 1. Enable/Disable Flow cache. 2. Bihash configuration like number of buckets and memory size. 3. Dual/Quad loop unroll can be applied around bihash to further improve the performance. 4. The same flow cache logic can be applied for IPv6 as well as in IPSec inbound direction. A deeper and wider flow cache using bihash_40_8 can replace existing bihash_16_8, to make it common for both IPv4 and IPv6 in both outbound and inbound directions. Following changes are made based on the review comments: 1. ON/OFF flow cache through startup conf. Default: OFF 2. Flow cache stale entry detection using epoch counter. 3. Avoid host order endianness conversion during flow cache lookup. 4. Move IPSec startup conf to a common file. 5. Added SPD flow cache unit test case 6. Replaced bihash with vectors to implement flow cache. 7. ipsec_add_del_policy API is not mpsafe. Cleaned up inflight packets check in control plane. Type: improvement Signed-off-by: mgovind <govindarajan.Mohandoss@arm.com> Signed-off-by: Zachary Leaf <zachary.leaf@arm.com> Tested-by: Jieqiang Wang <jieqiang.wang@arm.com> Change-Id: I62b4d6625fbc6caf292427a5d2046aa5672b2006
Diffstat (limited to 'src/vppinfra/atomics.h')
-rw-r--r--src/vppinfra/atomics.h2
1 files changed, 2 insertions, 0 deletions
diff --git a/src/vppinfra/atomics.h b/src/vppinfra/atomics.h
index 5d3c5f8d601..92c45610391 100644
--- a/src/vppinfra/atomics.h
+++ b/src/vppinfra/atomics.h
@@ -52,6 +52,8 @@
#define clib_atomic_store_rel_n(a, b) __atomic_store_n ((a), (b), __ATOMIC_RELEASE)
#define clib_atomic_store_seq_cst(a, b) \
__atomic_store_n ((a), (b), __ATOMIC_SEQ_CST)
+#define clib_atomic_store_relax_n(a, b) \
+ __atomic_store_n ((a), (b), __ATOMIC_RELAXED)
#define clib_atomic_load_seq_cst(a) __atomic_load_n ((a), __ATOMIC_SEQ_CST)
#define clib_atomic_swap_acq_n(a, b) __atomic_exchange_n ((a), (b), __ATOMIC_ACQUIRE)