aboutsummaryrefslogtreecommitdiffstats
path: root/src/vppinfra/vector_avx512.h
AgeCommit message (Collapse)AuthorFilesLines
2021-05-05vppinfra: fix x86 packs / packus wrappersDamjan Marion1-0/+13
They both take signed value as input. Type: fix Change-Id: If3d8ec4e0b1c02d7d65262bdd9db49ff7fbfef39 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-27vlib: improve enqueue_to_next buffer indices extractionDamjan Marion1-0/+4
Type: improvement Change-Id: Ib7b2fa7d821f6d2708f6dc378a0f36f68c843f57 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-25vppinfra: AVX512 mask load/stores and compress storeDamjan Marion1-21/+55
Type: improvement Change-Id: Id6be598aade072653e408cca465e62931d060233 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-21buffers: vlib_get_buffers() with 512-bit SIMDDamjan Marion1-0/+1
Type: improvement Change-Id: Id8ce3ffc1299a38171b82a7082454412c840a40c Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-21vppinfra: more avx512 inlines (compress, expand, from, is_equal_mask)Damjan Marion1-26/+79
Type: improvement Change-Id: I4cb86cafba92ae70cea160b9bf45f28a916ab6db Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-09-04ip: enhance vtep4_check of tunnel by vector wayZhiyong Yang1-0/+6
This patch aims to improve decap performance by reducing expensive hash_get callings as less as possible using AVX512 on XEON. e.g. vxlan, vxlan_gpe, geneve, gtpu. For the existing code, if vtep4 of the current packet match the last vtep4_key_t well, expensive hash computation can be avoided and the code returns directly. This patch improves tunnel decap multiple flows case greatly by leveraging 512bit vector register on XEON accommodating 8 vtep4_keys. It enhances the possiblity of avoiding unnecessary hash computing once hash key of the current packet hits any one of 8 in the 512bit cache. The oldest element in vtep4_cache_t is updated in round-robin order. vlib_get_buffers is also leveraged in the meanwhile. Type: improvement Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com> Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Signed-off-by: Junfeng Wang <drenfong.wang@intel.com> Change-Id: I313103202bd76f2dd638cd942554721b37ddad60
2020-07-15vppinfra: more vector inlinesDamjan Marion1-0/+19
Type: improvement Change-Id: Ie0de374b89ec3a17befecf3f08e94951597609ec Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-03-30vppinfra: add support for avx512 alignment version of load and storeZhiyong Yang1-0/+8
Type: improvement Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com> Change-Id: Idfec9cb9370a8cf4966d3fdfa440496f21e17005
2020-02-25crypto-native: GCM implementation with vector AESNI instructionsDamjan Marion1-0/+48
Introduced on intel IceLake uarch. Type: feature Change-Id: I1514c76c34e53ce0577666caf32a50f95eb6548f Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-02-17crypto-native: calculate ghash using vpclmulqdq instructionsDamjan Marion1-0/+15
vpclmulqdq is introduced on intel icelake architecture and allows computing 4 carry-less multiplications in paralled by using 512-bit SIMD registers Type: feature Change-Id: Idb09d6f51ba6f116bba11649b2d99f649356d449 Signed-off-by: Damjan Marion <damjan.marion@gmail.com>
2020-02-14crypto-native: refactor CBC codeDamjan Marion1-0/+6
Type: refactor Change-Id: I61e25942de318d03fb3d75689259709d687479bc Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-02-13vppinfra: add 128-bit and 512-bit a ^ b ^ c shortcutDamjan Marion1-0/+7
This allows us to combine 2 XOR operations into signle instruction which makes difference in crypto op: - in x86, by using ternary logic instruction - on ARM, by using EOR3 instruction (available with sha3 feature) Type: refactor Change-Id: Ibdf9001840399d2f838d491ca81b57cbd8430433 Signed-off-by: Damjan Marion <damjan.marion@gmail.com>
2019-04-17vppinfra: AVX512 interelaave, insert and permuteDamjan Marion1-0/+27
Change-Id: I26c704ec27b8f5431faef08156778f53ea454269 Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-04-16vppinfra: more AVX2 and AVX512 inlinesDamjan Marion1-0/+46
Change-Id: I81bd967a580ae3b476dfd731e9933a9898568a91 Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-04-12vppinfra: AVX-512 transpose (u32x16 and u64x8)Damjan Marion1-0/+126
Change-Id: Iefe9d20799a6f5f271aa5b675ea2b19ac3efbe1e Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-25Vectorized bihash_{48,40,24,16}_8 key compareDamjan Marion1-1/+6
bihash_48_8 case: Scalar code: 6 clocks SSE4.2 code: 3 clocks AVX2 code: 2.27 clocks AVX512 code: 1.5 clocks Change-Id: I40700175835a1e7321276e47eadbf9771d3c5a68 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-22vppinfra: add clib_count_equal_uXX and clib_memset_uXX functionsDamjan Marion1-1/+1
Change-Id: I56782652d8ef10304900cc293cfc0502689d800e Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-20vector functions cleanup and improvementsDamjan Marion1-2/+14
Remove functions which have native C equivalent (i.e. _is_equal can be replaced with ==, _add with +) Add SSE4.2, AVX-512 implementations of splat, load_unaligned, store_unaligned, is_all_zero, is_equal, is_all_equal Change-Id: Ie80b0e482e7a76248ad79399c2576468532354cd Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-18Add vlib_buffer_enqueue_to_next inline functionDamjan Marion1-1/+10
Change-Id: I1042c0fe179b57a00ce99c8d62cb1bdbe24d9184 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-04-25dpdk: complete rework of the dpdk-input nodeDamjan Marion1-0/+53
Change-Id: If174d189de40e6f9ffae99997bba93a2519d9fda Signed-off-by: Damjan Marion <damarion@cisco.com>