summaryrefslogtreecommitdiffstats
path: root/src/vppinfra/vector_neon.h
AgeCommit message (Collapse)AuthorFilesLines
2018-10-09vppinfra: Fix extendto_high aarch64 NEON api.v19.01-rc0Sirshak Das1-1/+1
This fixes the l2BD and ip4 test case failures. Fixes VPP-1432, VPP-1428, VPP-1430 Change-Id: I48b5c961bab60cc3b39fcd6db47e098c81579480 Signed-off-by: Sirshak Das <sirshak.das@arm.com>
2018-09-12Add and enable u32x4_extend_to_u64x2_high for aarch64 NEON intrinsics.Sirshak Das1-0/+6
This is the high version of extendto. This function accomplishes the same task as both shuffling and extending done by SSE intrinsics. This enables the NEON version for buffer indexes to buffer pointer translation. Change-Id: I52d7bbf3d76ba69c9acb0e518ff4bc6abf3bbbd4 Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Yi He <yi.he@arm.com> Verified-by: Lijian Zhang <lijian.zhang@arm.com>
2018-09-11Replacing vtbl NEON intrinsic with rev NEON intrinsic for byte_swap.Sirshak Das1-5/+1
Using rev16 vector intrinsic to reverse byteorder in each word independently. Change-Id: I071c40780baffe0bda614ec5d9dd92858f574b0d Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Brian Brooks <brian.brooks@arm.com> Reviewed-by: Yi He <yi.he@arm.com> Verified-by: Lijian Zhang <lijian.zhang@arm.com>
2018-09-11Add u32x4_extend_to_u64x2 for aarch64 using NEON intrinsicsSirshak Das1-0/+6
This is used in vlib_get_buffers_with_offset. Change-Id: If4ff776bc97d21a22e870300b164eeb6a5ec3638 Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Brian Brooks <brian.brooks@arm.com> Reviewed-by: Yi He <yi.he@arm.com> Verified-by: Lijian Zhang <lijian.zhang@arm.com>
2018-09-11Add horizontal add (hadd) vector intrinsic via NEON.Sirshak Das1-0/+6
Having the NEON equivalent of u32x4_hadd for CLIB_HAVE_VEC128 Change-Id: I210f96f7ecb9b80b4753311a68e5e09ccda7e95b Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Brian Brooks <brian.brooks@arm.com> Reviewed-by: Yi He <yi.he@arm.com> Verified-by: Lijian Zhang <lijian.zhang@arm.com>
2018-08-01Add support for shuffle vector intrinsic via Neon in ARMSirshak Das1-0/+16
This adds byte_swap (variant of shuffle) and shuffle vector intrinsic for ARM based on Neon, concuring with same signature as SSE vector intrinsic. Change-Id: I386fd2b1dcc83654e4ad9f90a6065d7736e4ce5c Signed-off-by: Sirshak Das <sirshak.das@arm.com>
2018-06-26Fix load_unaligned undefined and other possible build failuresSirshak Das1-26/+40
Add aarch64 neon intrinsics to fix build failures similar to this: error: implicit declaration of function ‘u64x2_load_unaligned’ Change-Id: I6178504a48242742df3f7d75abdaf108796cf73f Signed-off-by: Sirshak Das <sirshak.das@arm.com>
2018-02-26Added u8x16,u32x4,u64x2 variants of _zero_byte_mask(x) for ARM/NEON ↵Adrian Oanca1-0/+20
platform. VPP-1129 Change-Id: I954acb56d901e42976e71534317f38d7c4359bcf Signed-off-by: Adrian Oanca <adrian.oanca@enea.com>
2018-02-24u8x16_compare_byte_mask - optimize to use 128bit registers as suggested by ↵Adrian Oanca1-24/+9
Nintin Change-Id: I88aabd34ef385d620695ac17ec3fe2f4a5177ada Signed-off-by: Adrian Oanca <adrian.oanca@enea.com>
2018-02-21add 'is_all_zero(x)' for NEONAdrian Oanca1-0/+24
Change-Id: I5045e0f3ac4698e820b69ad46b96763e404e6fe4 Signed-off-by: Adrian Oanca <adrian.oanca@enea.com>
2018-02-20vppinfra: autogerate vector typedefs and basic inline functionsDamjan Marion1-43/+0
Change-Id: Ie9f611fa6a962b0937245f5cc949571ba11c5604 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-02-08add CLIB_HAVE_VEC128 with NEON intrinsics (VPP-1127)Gabriel Ganne1-0/+60
Enable CLIB_HAVE_VEC128 if both aarch64 and __ARM_NEON ie. armv8 only, not armv7 Add more neon compare intrinsics wrappers. I only add simple intrinsics wrappers. More complex ones can be added later as they are needed, with performance tests on the corresponding feature to back them up. Remove wrongly added 128bits definitions defined on both armv7 and armv8 without concern for NEON instructions presence. Notable correspondinf code activations: * MHEAP_FLAG_SMALL_OBJECT_CACHE in mheap.c * ip4 fib mtrie leaves access * enable ixge plugin compilation for aarch64 (conf still disables it by default) Change-Id: I99953823627bdff6f222d232c78aa7b655aaf77a Signed-off-by: Gabriel Ganne <gabriel.ganne@enea.com>
2016-12-28Reorganize source tree to use single autotools instanceDamjan Marion1-0/+71
Change-Id: I7b51f88292e057c6443b12224486f2d0c9f8ae23 Signed-off-by: Damjan Marion <damarion@cisco.com>