Age | Commit message (Collapse) | Author | Files | Lines |
|
Adding flow cache support to improve inbound IPv4/IPSec Security Policy
Database (SPD) lookup performance. By enabling the flow cache in startup
conf, this replaces a linear O(N) SPD search, with an O(1) hash table
search.
This patch is the ipsec4_input_node counterpart to
https://gerrit.fd.io/r/c/vpp/+/31694, and shares much of the same code,
theory and mechanism of action.
Details about the flow cache:
Mechanism:
1. First packet of a flow will undergo linear search in SPD
table. Once a policy match is found, a new entry will be added
into the flow cache. From 2nd packet onwards, the policy lookup
will happen in flow cache.
2. The flow cache is implemented using a hash table without collision
handling. This will avoid the logic to age out or recycle the old
flows in flow cache. Whenever a collision occurs, the old entry
will be overwritten by the new entry. Worst case is when all the
256 packets in a batch result in collision, falling back to linear
search. Average and best case will be O(1).
3. The size of flow cache is fixed and decided based on the number
of flows to be supported. The default is set to 1 million flows,
but is configurable by a startup.conf option.
4. Whenever a SPD rule is added/deleted by the control plane, all
current flow cache entries will be invalidated. As the SPD API is
not mp-safe, the data plane will wait for the control plane
operation to complete.
Cache invalidation is via an epoch counter that is incremented on
policy add/del and stored with each entry in the flow cache. If the
epoch counter in the flow cache does not match the current count,
the entry is considered stale, and we fall back to linear search.
The following configurable options are available through startup
conf under the ipsec{} entry:
1. ipv4-inbound-spd-flow-cache on/off - enable SPD flow cache
(default off)
2. ipv4-inbound-spd-hash-buckets %d - set number of hash buckets
(default 4,194,304: ~1 million flows with 25% load factor)
Performance with 1 core, 1 ESP Tunnel, null-decrypt then bypass,
94B (null encrypted packet) for different SPD policy matching indices:
SPD Policy index : 2 10 100 1000
Throughput : Mbps/Mbps Mbps/Mbps Mbps/Mbps Mbps/Mbps
(Baseline/Optimized)
ARM TX2 : 300/290 230/290 70/290 8.5/290
Type: improvement
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Signed-off-by: mgovind <govindarajan.Mohandoss@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I8be2ad4715accbb335c38cd933904119db75827b
|
|
Adding flow cache support to improve outbound IPv4/IPSec SPD lookup
performance. Details about flow cache:
Mechanism:
1. First packet of a flow will undergo linear search in SPD
table. Once a policy match is found, a new entry will be added
into the flow cache. From 2nd packet onwards, the policy lookup
will happen in flow cache.
2. The flow cache is implemented using bihash without collision
handling. This will avoid the logic to age out or recycle the old
flows in flow cache. Whenever a collision occurs, old entry will
be overwritten by the new entry. Worst case is when all the 256
packets in a batch result in collision and fall back to linear
search. Average and best case will be O(1).
3. The size of flow cache is fixed and decided based on the number
of flows to be supported. The default is set to 1 million flows.
This can be made as a configurable option as a next step.
4. Whenever a SPD rule is added/deleted by the control plane, the
flow cache entries will be completely deleted (reset) in the
control plane. The assumption here is that SPD rule add/del is not
a frequent operation from control plane. Flow cache reset is done,
by putting the data plane in fall back mode, to bypass flow cache
and do linear search till the SPD rule add/delete operation is
complete. Once the rule is successfully added/deleted, the data
plane will be allowed to make use of the flow cache. The flow
cache will be reset only after flushing out the inflight packets
from all the worker cores using
vlib_worker_wait_one_loop().
Details about bihash usage:
1. A new bihash template (16_8) is added to support IPv4 5 tuple.
BIHASH_KVP_PER_PAGE and BIHASH_KVP_AT_BUCKET_LEVEL are set
to 1 in the new template. It means only one KVP is supported
per bucket.
2. Collision handling is avoided by calling
BV (clib_bihash_add_or_overwrite_stale) function.
Through the stale callback function pointer, the KVP entry
will be overwritten during collision.
3. Flow cache reset is done using
BV (clib_bihash_foreach_key_value_pair) function.
Through the callback function pointer, the KVP value is reset
to ~0ULL.
MRR performance numbers with 1 core, 1 ESP Tunnel, null-encrypt,
64B for different SPD policy matching indices:
SPD Policy index : 1 10 100 1000
Throughput : MPPS/MPPS MPPS/MPPS MPPS/MPPS KPPS/MPPS
(Baseline/Optimized)
ARM Neoverse N1 : 5.2/4.84 4.55/4.84 2.11/4.84 329.5/4.84
ARM TX2 : 2.81/2.6 2.51/2.6 1.27/2.6 176.62/2.6
INTEL SKX : 4.93/4.48 4.29/4.46 2.05/4.48 336.79/4.47
Next Steps:
Following can be made as a configurable option through startup
conf at IPSec level:
1. Enable/Disable Flow cache.
2. Bihash configuration like number of buckets and memory size.
3. Dual/Quad loop unroll can be applied around bihash to further
improve the performance.
4. The same flow cache logic can be applied for IPv6 as well as in
IPSec inbound direction. A deeper and wider flow cache using
bihash_40_8 can replace existing bihash_16_8, to make it
common for both IPv4 and IPv6 in both outbound and
inbound directions.
Following changes are made based on the review comments:
1. ON/OFF flow cache through startup conf. Default: OFF
2. Flow cache stale entry detection using epoch counter.
3. Avoid host order endianness conversion during flow cache
lookup.
4. Move IPSec startup conf to a common file.
5. Added SPD flow cache unit test case
6. Replaced bihash with vectors to implement flow cache.
7. ipsec_add_del_policy API is not mpsafe. Cleaned up
inflight packets check in control plane.
Type: improvement
Signed-off-by: mgovind <govindarajan.Mohandoss@arm.com>
Signed-off-by: Zachary Leaf <zachary.leaf@arm.com>
Tested-by: Jieqiang Wang <jieqiang.wang@arm.com>
Change-Id: I62b4d6625fbc6caf292427a5d2046aa5672b2006
|
|
add bypass/discard functionality to ipsec4-input-feature node
Type: feature
Signed-off-by: ShivaShankarK <shivaashankar1204@gmail.com>
Change-Id: I152a5dfee0296109cccabe349a330dbbe395cc6c
|
|
Change-Id: I5e981f12ff44243623cfd18d5e0ae06a7dfd1eb8
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
- return the stats_index of each SPD in the create API call
- no ip_any in the API as this creates 2 SPD entries. client must add both v4 and v6 explicitly
- only one pool of SPD entries (rhter than one per-SPD) to support this
- no packets/bytes in the dump API. Polling the stats segment is much more efficient
(if the SA lifetime is based on packet/bytes)
- emit the policy index in the packet trace and CLI commands.
Change-Id: I7eaf52c9d0495fa24450facf55229941279b8569
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
No function change. Only breaking the monster ipsec.[hc]
into smaller constituent parts
Change-Id: I3fd4d2d041673db5865d46a4002f6bd383f378af
Signed-off-by: Neale Ranns <nranns@cisco.com>
|