aboutsummaryrefslogtreecommitdiffstats
path: root/src/vppinfra/vector_avx2.h
AgeCommit message (Collapse)AuthorFilesLines
2021-12-02vppinfra: vector shuffle cleanupDamjan Marion1-6/+0
Type: refactor Change-Id: I8b3fc2ce30df313467274a174c5ac6adbf296153 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-09-23classify: use AVX-512 to calculate hash on x86Damjan Marion1-0/+3
Type:improvement Change-Id: I9f9f16eabf64203db11cd4338948d76ca5e0ef12 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-05-05vppinfra: fix x86 packs / packus wrappersDamjan Marion1-10/+12
They both take signed value as input. Type: fix Change-Id: If3d8ec4e0b1c02d7d65262bdd9db49ff7fbfef39 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-27vlib: improve enqueue_to_next buffer indices extractionDamjan Marion1-1/+20
Type: improvement Change-Id: Ib7b2fa7d821f6d2708f6dc378a0f36f68c843f57 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-25vppinfra: AVX512 mask load/stores and compress storeDamjan Marion1-8/+0
Type: improvement Change-Id: Id6be598aade072653e408cca465e62931d060233 Signed-off-by: Damjan Marion <damarion@cisco.com>
2021-04-15vppinfra: correct intrinsic called by u16x16_from_u8x16Lijian.Zhang1-2/+2
u16x16_from_u8x16() and i16x16_from_i8x16() call intrisics _mm256_cvtepu8_epi64 and _mm256_cvtepi8_epi64. But they are not seems doing the right data conversion from the name of the wrappers. The correct intrinsics been called should be _mm256_cvtepu8_epi16 and _mm256_cvtepi8_epi16. Type: fix Change-Id: Id71de6ae1a266a370f11c33a46684202be766c43 Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
2020-08-31vppinfra: convert A_extend_to_B to B_from_A format of vector inlinesDamjan Marion1-2/+2
Make it shorter and same format when converting to biggor or smaller types. Type: refactor Change-Id: I443d67e18ae65d779b4d9a0dce5406f7d9f0e4ac Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-07-15vppinfra: more vector inlinesDamjan Marion1-0/+25
Type: improvement Change-Id: Ie0de374b89ec3a17befecf3f08e94951597609ec Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-03-16rdma: add Mellanox mlx5 Direct Verbs receive supportDamjan Marion1-0/+10
Type: feature Change-Id: I3f287ab536a482c366ad7df47e1c04e640992ebc Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-04-16vppinfra: more AVX2 and AVX512 inlinesDamjan Marion1-0/+26
Change-Id: I81bd967a580ae3b476dfd731e9933a9898568a91 Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-04-08vppinfra: u32x8 transposeDamjan Marion1-0/+56
Change-Id: I7d39cb184f1f9ad24276183c29969327681a1f82 Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-03-26ipsec: esp-encrypt reworkDamjan Marion1-0/+13
Change-Id: Ibe7f806b9d600994e83c9f1be526fdb0a1ef1833 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-11-20vppinfra: add 128 and 256 bit vector scatter/gather inlinesDamjan Marion1-0/+59
Change-Id: If6c65f16c6fba8beb90e189c1443c3d7d67ee02c Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-10-17bond: tx optimizationsDamjan Marion1-0/+12
Break up bond tx function into multiple small workloads: 1. parse the packet header and hash it based on the configured algorithm 2. optionally, trace the packet 3. convert the hash value from (1) to the slave port 4. update the buffers with the slave sw_if_index 5. Add the buffers to the queues 6. Create and send the frames old numbers ----------- Time 5.3, average vectors/node 223.74, last 128 main loops 40.00 per node 222.61 vector rates in 3.3627e6, out 6.6574e6, drop 3.3964e4, punt 0.0000e0 Name State Calls Vectors Suspends Clocks Vectors/Call BondEthernet0-output active 68998 17662979 0 1.89e1 255.99 BondEthernet0-tx active 68998 17662979 0 2.60e1 255.99 TenGigabitEthernet3/0/1-output active 68998 8797416 0 1.03e1 127.50 TenGigabitEthernet3/0/1-tx active 68998 8797416 0 7.85e1 127.50 TenGigabitEthernet7/0/1-output active 68996 8865563 0 1.02e1 128.49 TenGigabitEthernet7/0/1-tx active 68996 8865563 0 7.65e1 128.49 new numbers ----------- BondEthernet0-output active 304064 77840384 0 2.29e1 256.00 BondEthernet0-tx active 304064 77840384 0 2.47e1 256.00 TenGigabitEthernet3/0/1-output active 304064 38765525 0 1.03e1 127.49 TenGigabitEthernet3/0/1-tx active 304064 38765525 0 7.66e1 127.49 TenGigabitEthernet7/0/1-output active 304064 39074859 0 1.01e1 128.51 Change-Id: I3ef9a52bfe235559dae09d055c03c5612c08a0f7 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-07-16vppinfra: AVX2 interleave functionsDamjan Marion1-3/+14
Change-Id: I8688f700fccd87484da3e202ca3a070cc14eb267 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-07-12Revert "vppinfra: AVX2 blend"Dave Barach1-6/+0
Causes clang validation failures. The patch did not actually pass validation; unfortunately it received a +1 from fd.io JJB - presumably due to a race condition This reverts commit 779c865cc6c7af5bb435d8b3465d80685370edb2. Change-Id: Ica3697f8f90e67d3eae4debc597f27d7d512004a Signed-off-by: Dave Barach <dbarach@cisco.com>
2018-07-12vppinfra: AVX2 blendDamjan Marion1-0/+6
Change-Id: Ie7a64318f10ebb535c98aff4e25cdfc48f60ff33 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-28ip: vectorized ip checksumDamjan Marion1-0/+28
Change-Id: Ida678e6f31daa8decb18189da712a350336326e2 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-27vppinfra: add vector horizontal add and byte swap (SSE4.2 & AVX2)Damjan Marion1-0/+16
Change-Id: I4e0fd487970796f0153a5b16333827d23b57deac Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-25Vectorized bihash_{48,40,24,16}_8 key compareDamjan Marion1-20/+25
bihash_48_8 case: Scalar code: 6 clocks SSE4.2 code: 3 clocks AVX2 code: 2.27 clocks AVX512 code: 1.5 clocks Change-Id: I40700175835a1e7321276e47eadbf9771d3c5a68 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-22vppinfra: add clib_count_equal_uXX and clib_memset_uXX functionsDamjan Marion1-2/+2
Change-Id: I56782652d8ef10304900cc293cfc0502689d800e Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-20vector functions cleanup and improvementsDamjan Marion1-4/+11
Remove functions which have native C equivalent (i.e. _is_equal can be replaced with ==, _add with +) Add SSE4.2, AVX-512 implementations of splat, load_unaligned, store_unaligned, is_all_zero, is_equal, is_all_equal Change-Id: Ie80b0e482e7a76248ad79399c2576468532354cd Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-18Add vlib_buffer_enqueue_to_next inline functionDamjan Marion1-0/+6
Change-Id: I1042c0fe179b57a00ce99c8d62cb1bdbe24d9184 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-17Add buffer pointer-to-index and index-to-pointer array functionsDamjan Marion1-0/+22
Change-Id: Ib3fcc3ceb7f315389bcdecbb7d9632540a5dd6ba Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-09dpdk: tx code reworkDamjan Marion1-0/+12
Change-Id: Ifea9c772e8784642433b92091f5769eb9ec06890 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-04-25dpdk: complete rework of the dpdk-input nodeDamjan Marion1-0/+80
Change-Id: If174d189de40e6f9ffae99997bba93a2519d9fda Signed-off-by: Damjan Marion <damarion@cisco.com>