Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch aims to improve decap performance by reducing expensive
hash_get callings as less as possible using AVX512 on XEON.
e.g. vxlan, vxlan_gpe, geneve, gtpu.
For the existing code, if vtep4 of the current packet match the last
vtep4_key_t well, expensive hash computation can be avoided and the
code returns directly.
This patch improves tunnel decap multiple flows case greatly by
leveraging 512bit vector register on XEON accommodating 8 vtep4_keys.
It enhances the possiblity of avoiding unnecessary hash computing
once hash key of the current packet hits any one of 8 in the 512bit
cache.
The oldest element in vtep4_cache_t is updated in round-robin order.
vlib_get_buffers is also leveraged in the meanwhile.
Type: improvement
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Signed-off-by: Junfeng Wang <drenfong.wang@intel.com>
Change-Id: I313103202bd76f2dd638cd942554721b37ddad60
|
|
Type: improvement
Change-Id: Ie0de374b89ec3a17befecf3f08e94951597609ec
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Change-Id: Idfec9cb9370a8cf4966d3fdfa440496f21e17005
|
|
Introduced on intel IceLake uarch.
Type: feature
Change-Id: I1514c76c34e53ce0577666caf32a50f95eb6548f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
vpclmulqdq is introduced on intel icelake architecture and
allows computing 4 carry-less multiplications in paralled by using
512-bit SIMD registers
Type: feature
Change-Id: Idb09d6f51ba6f116bba11649b2d99f649356d449
Signed-off-by: Damjan Marion <damjan.marion@gmail.com>
|
|
Type: refactor
Change-Id: I61e25942de318d03fb3d75689259709d687479bc
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This allows us to combine 2 XOR operations into signle instruction
which makes difference in crypto op:
- in x86, by using ternary logic instruction
- on ARM, by using EOR3 instruction (available with sha3 feature)
Type: refactor
Change-Id: Ibdf9001840399d2f838d491ca81b57cbd8430433
Signed-off-by: Damjan Marion <damjan.marion@gmail.com>
|
|
Change-Id: I26c704ec27b8f5431faef08156778f53ea454269
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I81bd967a580ae3b476dfd731e9933a9898568a91
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Iefe9d20799a6f5f271aa5b675ea2b19ac3efbe1e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
bihash_48_8 case:
Scalar code: 6 clocks
SSE4.2 code: 3 clocks
AVX2 code: 2.27 clocks
AVX512 code: 1.5 clocks
Change-Id: I40700175835a1e7321276e47eadbf9771d3c5a68
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I56782652d8ef10304900cc293cfc0502689d800e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Remove functions which have native C equivalent (i.e. _is_equal can be
replaced with ==, _add with +)
Add SSE4.2, AVX-512 implementations of splat, load_unaligned, store_unaligned,
is_all_zero, is_equal, is_all_equal
Change-Id: Ie80b0e482e7a76248ad79399c2576468532354cd
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I1042c0fe179b57a00ce99c8d62cb1bdbe24d9184
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: If174d189de40e6f9ffae99997bba93a2519d9fda
Signed-off-by: Damjan Marion <damarion@cisco.com>
|