vpp - Vector Packet Processing

Age	Commit message	Author	Files	Lines
2018-07-02	Add per-numa page allocation info to 'show memory'	Damjan Marion	2	-0/+64
	Change-Id: I64e4e3d68c0f3958323f30b12a26cfaafa8bad85 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-30	bitmap: add nocheck variants for bit ops	Florin Coras	2	-20/+54
	Change-Id: Ifd155e2980a9f8e6af9bb6b08619c15b2bf18ef1 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-29	bihash key compare improvements	Damjan Marion	3	-12/+10
	Looks like the CPU doesn't like overlapping loads. In some cases the new code shows a 3-4 clock improvement. Change-Id: Ia1b49976ad95140c573f892fdc0a32eebbfa06c8 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-28	Fix mheap_get_aligned() performance jackpot	Dave Barach	2	-3/+64
	If non-trivial alignment (e.g. 64) requested, and the object size (e.g. 16) is smaller than (alignment_request - MHEAP_ELT_OVERHEAD_BYTES), round up the size request. This avoids creating remainder chunks, which are false-cache-line-sharing bait to begin with. Change-Id: Ie1a21286d29557d125bb346254b1be2def868b1a Signed-off-by: Dave Barach <dave@barachs.net>
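The size round-up described in this commit can be sketched as follows. This is a minimal illustration, not the mheap code: `ELT_OVERHEAD_BYTES` is a hypothetical stand-in for MHEAP_ELT_OVERHEAD_BYTES, its value of 8 is made up, and rounding the request up to exactly (alignment - overhead) is an assumption about what "round up" means here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element header size; the real value of
   MHEAP_ELT_OVERHEAD_BYTES is not given in the commit message. */
#define ELT_OVERHEAD_BYTES 8u

/* Return the size to actually allocate for a request of `size` bytes
   aligned to `align` bytes. Small objects are grown so that object plus
   header fill one alignment unit and no remainder chunk is left over. */
static size_t round_up_small_request(size_t size, size_t align)
{
    /* align must be a non-zero power of two */
    assert(align != 0 && (align & (align - 1)) == 0);

    if (align > ELT_OVERHEAD_BYTES && size < align - ELT_OVERHEAD_BYTES)
        return align - ELT_OVERHEAD_BYTES;
    return size;
}
```

With these assumed numbers, a 16-byte object requested at 64-byte alignment becomes a 56-byte request, so no small leftover chunk can end up sharing the cache line with it.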
2018-06-28	ip: vectorized ip checksum	Damjan Marion	1	-0/+28
	Change-Id: Ida678e6f31daa8decb18189da712a350336326e2 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-27	Tune pool_get / pool_put	Dave Barach	2	-7/+56
	Stop spending cycles repeatedly tail-trimming the pool free element bitmap, possibly at the expense of slightly hurting pool_foreach performance. Change-Id: I8a7f3e7b26c71d7496ba9393b2a167dc7f538355 Signed-off-by: Dave Barach <dave@barachs.net>
2018-06-27	vppinfra: add vector horizontal add and byte swap (SSE4.2 & AVX2)	Damjan Marion	2	-0/+31
	Change-Id: I4e0fd487970796f0153a5b16333827d23b57deac Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-26	Fix load_unaligned undefined and other possible build failures	Sirshak Das	1	-26/+40
	Add aarch64 neon intrinsics to fix build failures similar to this: error: implicit declaration of function ‘u64x2_load_unaligned’ Change-Id: I6178504a48242742df3f7d75abdaf108796cf73f Signed-off-by: Sirshak Das <sirshak.das@arm.com>
2018-06-26	We don't have (yet) 128-bit unaligned load/store on ARM	Damjan Marion	1	-2/+2
	Change-Id: I16395bbf843e338cdd366d85bb4df3de95d9b265 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-26	add backtrace in unix_signal_handler	Kingwel Xie	1	-21/+14
	The crash stack backtrace is directed to syslog. 1. Use the glibc backtrace from execinfo.h; the old clib_backtrace is removed. 2. Install the signal handler for SIGABRT, but remove it once the backtrace is done. The goal is to capture the stack trace caused by SIGABRT: a VPP ASSERT always calls os_exit and then abort(), and we definitely want the trace in this situation. Avoiding an infinite SIGABRT loop is a little tricky. 3. Always load symbols by calling clib_elf_main_init () in main(); otherwise, PC addresses are displayed instead of symbols. Change-Id: I150e15b94a4620b2ea4f08c73dc3e6ad1856de1e Signed-off-by: Kingwel Xie <kingwel.xie@ericsson.com>
2018-06-26	SIMD optimized linear search in clib_bitmap_first_set	Damjan Marion	1	-2/+23
	Change-Id: Ib3a55598a83cc99485b40e38e7c406ecb126fd42 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-06-25	tw: add light weight timer update function	Florin Coras	4	-31/+172
	Because it avoids pool putting/getting the timer, this function is somewhat faster than stopping and restarting a timer. Change-Id: Id99ed9d356b0b1f7e12facfe8da193e1cd30b3ec Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-14	Add clib_bihash_search_inline_2_with_hash to bihash template	Andrew Yourtchenko	1	-5/+15
	Use similar approach as in the clib_bihash_search_inline_with_hash to be able to do the hash calculation and lookup separately. Change-Id: Ief79aa0f9f1e42b0af88be4807ca01fac30a80d7 Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2018-06-13	Disable bihash bucket-level caching	Dave Barach	3	-3/+3
	It'll be interesting to see what the perf trend job says about this change. Change-Id: I66307a19a865011ac9660108098874fa1481c895 Signed-off-by: Dave Barach <dave@barachs.net>
2018-06-08	Time range support for vppinfra	Dave Barach	3	-0/+781
	Change-Id: I2356b1e05fd868b46b4d26ade760900a5739ca4d Signed-off-by: Dave Barach <dave@barachs.net>
2018-06-05	VPP API: Memory trace	Ole Troan	4	-3/+27
	if you plan to put a hash into shared memory, the key sum and key equal functions MUST be set to constants such as KEY_FUNC_STRING, KEY_FUNC_MEM, etc. -lvppinfra is PIC, which means that the process which set up the hash won't have the same idea where the key sum and key compare functions live in other processes. Change-Id: Ib3b5963a0d2fb467b91e1f16274df66ac74009e9 Signed-off-by: Ole Troan <ot@cisco.com> Signed-off-by: Dave Barach <dave@barachs.net> Signed-off-by: Ole Troan <ot@cisco.com>
2018-06-04	Configure or deduce CLIB_LOG2_CACHE_LINE_BYTES (VPP-1064)	Dave Barach	1	-1/+4
	Added configure argument "--with-log2-cache-line-bytes=5|6|7|auto" AKA 32, 64, or 128 bytes, or use the inferred value from the build host. produces build-xxx/vpp/vppinfra/config.h, which .../src/vppinfra/cache.h Kernels which implement the following pseudo-file (aka x86_64) are easy: /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size Otherwise, extract the cpuid from /proc/cpuinfo and map it to the cache line size. Change-Id: I7ff861e042faf82c3901fa1db98864fbdea95b74 Signed-off-by: Dave Barach <dave@barachs.net> Signed-off-by: Nitin Saxena <nitin.saxena@cavium.com>
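The "auto" path for kernels that expose the pseudo-file can be illustrated with a small sketch. The two function names are hypothetical and the real detection lives in the configure machinery, not in C; the sketch only shows reading the line size and mapping 32, 64, or 128 bytes to the log2 values 5, 6, or 7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Map a cache line size in bytes to its log2; -1 if unsupported. */
static int log2_cache_line_bytes(int bytes)
{
    switch (bytes) {
    case 32:  return 5;
    case 64:  return 6;
    case 128: return 7;
    default:  return -1;
    }
}

/* Read the line size from a sysfs-style pseudo-file, for example
   /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size.
   Returns -1 if the file is missing or does not hold an integer. */
static int detect_cache_line_bytes(const char *path)
{
    FILE *f;
    int bytes = -1;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &bytes) != 1)
        bytes = -1;
    fclose(f);
    return bytes;
}
```

A caller would fall back to the /proc/cpuinfo mapping whenever `detect_cache_line_bytes` returns -1.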
2018-06-02	AVF input node rework	Damjan Marion	1	-0/+3
	Change-Id: Ib121b24935d5c706cfba6e4b6d321086a38cad91 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-30	Fix clang compilation on aarch64: value size does not match register size.	Sirshak Das	1	-1/+1
	Fixes clang error: value size does not match register size specified by the constraint and modifier Change-Id: I83e69445eacd6570607334e086a8582addb5bdfc Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Brian Brooks <brian.brooks@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2018-05-30	vppinfra: explicitly state for signed types that they are signed	Damjan Marion	2	-9/+9
	This fixes some compilation warnings with clang on AArch64. Change-Id: Idb941944e3f199f483c80e143a9e5163a031c4aa Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-29	Add VLIB_NODE_FN() macro to simplify multiversioning of node functions	Damjan Marion	1	-2/+25
	Change-Id: Ibab5e27277f618ceb2d543b9d6a1a5f191e7d1db Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-28	Change optimization level from tree-vectorize to O3	Damjan Marion	1	-1/+1
	Change-Id: Ia1b49d7fd5f32d9a5139df5df636b46264003a63 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-28	Fix flowhash size computation for very large hash tables	Pierre Pfister	1	-1/+1
	Change-Id: Ieae4ff6429fc5bdcf0e243db40ab7ec00c30730a Signed-off-by: Pierre Pfister <ppfister@cisco.com>
2018-05-25	bond: performance harvesting	Steven	2	-0/+71
	- hash is great. But it is a bit too slow for the DP. Use direct array indexing to quickly retrieve the slave interface. - the algorithm used by flow hash is great. But it is a bit too slow for the DP. Use l2_hash_hash() extracted from lb_hash.h which ECMP is using. It makes use of intrinsic crc32 instruction set. - shortcut modulo arithmetic when the operand is 2**x (where x up to 4) to avoid division instruction. - special case for link count == 1 in bond_tx_fn() - use clib_mem_unaligned to access data for the packet to avoid alignment error - Fix some typos for packet tracing. Change-Id: I8eae3ad497061c5473aa675ba894ee0211120d25 Signed-off-by: Steven <sluong@cisco.com>
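The modulo shortcut listed in this commit can be sketched as below. `pick_link` is a hypothetical helper, not the bond code, and it generalizes the idea to any power-of-two link count, whereas the commit limits the shortcut to 2**x with x up to 4.

```c
#include <assert.h>
#include <stdint.h>

/* Choose a link index in [0, n_links) from a flow hash.
   When n_links is a power of two, hash % n_links equals
   hash & (n_links - 1), which avoids a division instruction.
   n_links == 1 falls out naturally: the mask is 0. */
static inline uint32_t pick_link(uint32_t hash, uint32_t n_links)
{
    assert(n_links > 0);

    if ((n_links & (n_links - 1)) == 0)
        return hash & (n_links - 1);
    return hash % n_links;
}
```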
2018-05-25	Vectorized bihash_{48,40,24,16}_8 key compare	Damjan Marion	6	-24/+83
	bihash_48_8 case: Scalar code: 6 clocks SSE4.2 code: 3 clocks AVX2 code: 2.27 clocks AVX512 code: 1.5 clocks Change-Id: I40700175835a1e7321276e47eadbf9771d3c5a68 Signed-off-by: Damjan Marion <damarion@cisco.com>
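For reference, a scalar, branch-free 48-byte key compare of the kind these clock counts are measured against could look like the sketch below. `key48_t` and `key48_equal` are hypothetical names; the vectorized variants perform the same XOR/OR reduction over wider registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 48-byte key viewed as six 64-bit words. */
typedef struct {
    uint64_t w[6];
} key48_t;

/* Return 1 if the keys are equal, 0 otherwise. XOR yields zero for
   matching words; OR-ing the results gives one test at the end
   instead of six separate branches. */
static inline int key48_equal(const key48_t *a, const key48_t *b)
{
    uint64_t diff = 0;

    assert(a != NULL && b != NULL);
    for (int i = 0; i < 6; i++)
        diff |= a->w[i] ^ b->w[i];
    return diff == 0;
}
```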
2018-05-22	vppinfra: add clib_count_equal_uXX and clib_memset_uXX functions	Damjan Marion	4	-5/+339
	Change-Id: I56782652d8ef10304900cc293cfc0502689d800e Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-20	vector functions cleanup and improvements	Damjan Marion	7	-93/+97
	Remove functions which have native C equivalent (i.e. _is_equal can be replaced with ==, _add with +) Add SSE4.2, AVX-512 implementations of splat, load_unaligned, store_unaligned, is_all_zero, is_equal, is_all_equal Change-Id: Ie80b0e482e7a76248ad79399c2576468532354cd Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-19	Disable vector code in vlib_buffer_enqueue_to_next if no msb mask function	Damjan Marion	1	-0/+2
	This fixes the ARM64 build, where u16x8_msb_mask(...) is not defined. Change-Id: I864f5134a0d951601810c800f587d173b3b7ef41 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-18	Add vlib_buffer_enqueue_to_next inline function	Damjan Marion	3	-1/+22
	Change-Id: I1042c0fe179b57a00ce99c8d62cb1bdbe24d9184 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-17	Add buffer pointer-to-index and index-to-pointer array functions	Damjan Marion	1	-0/+22
	Change-Id: Ib3fcc3ceb7f315389bcdecbb7d9632540a5dd6ba Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-11	Periodic scan and probe of IP neighbors to maintain neighbor pools	John Lo	2	-0/+6
	Scan IPv4 and IPv6 neighbor pool entries once a minute to keep them up to date. The neighbor of an entry is probed if its time-stamp is older than 1 minute. If the neighbor responds, its time-stamp is updated. If there is no response from a neighbor, its entry is deleted when the time-stamp of the entry becomes more than 4 minutes old. Static neighbor entries are neither probed nor deleted. Implemented CLI and API to enable and disable periodic scan of IPv4, IPv6 or both types of IP neighbors. CLI is "ip scan-neighbor" and API is "ip_scan_neighbor_enable_disable". Other IP neighbor scan parameters can also be changed from their defaults via the CLI/API. Change-Id: Id1a0a934ace15d03db845aa698bcbb9cdabebfcd Signed-off-by: John Lo <loj@cisco.com>
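The per-entry decision described above can be sketched as a small function. The type and function names are hypothetical, and the thresholds are the 1-minute and 4-minute defaults stated in the message; the real implementation makes them configurable.

```c
#include <assert.h>

/* Defaults taken from the commit message, in seconds. */
#define NBR_PROBE_AGE_SEC  60.0
#define NBR_DELETE_AGE_SEC 240.0

typedef enum { NBR_KEEP, NBR_PROBE, NBR_DELETE } nbr_action_t;

/* Decide what a scan pass does with one neighbor entry.
   `now` and `last_seen` are timestamps in seconds. */
static nbr_action_t nbr_scan_action(double now, double last_seen, int is_static)
{
    double age;

    assert(now >= last_seen);

    if (is_static)
        return NBR_KEEP;      /* static entries are never probed or deleted */

    age = now - last_seen;
    if (age > NBR_DELETE_AGE_SEC)
        return NBR_DELETE;    /* no response for more than 4 minutes */
    if (age > NBR_PROBE_AGE_SEC)
        return NBR_PROBE;     /* stale: probe; a reply refreshes last_seen */
    return NBR_KEEP;
}
```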
2018-05-11	VPP-1275 Fix memory leaks in IPsec CLI	Klement Sekera	1	-1/+1
	Change-Id: I1f7c634328f25b33580a215af2daeb498cd3b181 Signed-off-by: Klement Sekera <ksekera@cisco.com>
2018-05-10	vppinfra: use count_trailing_zeros in sparse_vec_index	Damjan Marion	3	-68/+30
	It is much cheaper to use ctzll than to do shift,subtract and mask in likely case when we are looking for 1st set bit in the uword. Change-Id: I31954081571978878c7098bafad0c85a91755fa2 Signed-off-by: Damjan Marion <damarion@cisco.com>
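The difference between the two approaches can be shown with a short sketch. `__builtin_ctzll` is a GCC/Clang builtin; the two function names are hypothetical, and the loop version only stands in for the older shift-and-mask style of search.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit using count-trailing-zeros, which
   compiles to a single instruction on most targets. The builtin is
   undefined for x == 0, hence the assert. */
static inline unsigned first_set_bit_ctz(uint64_t x)
{
    assert(x != 0);
    return (unsigned) __builtin_ctzll(x);
}

/* Same result computed bit by bit, for comparison. */
static inline unsigned first_set_bit_loop(uint64_t x)
{
    unsigned i = 0;

    assert(x != 0);
    while ((x & 1) == 0) {
        x >>= 1;
        i++;
    }
    return i;
}
```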
2018-05-10	vppinfra: use popcnt instruction when available	Damjan Marion	1	-0/+8
	Change-Id: Id02d613b8613a2d448840fe2d6a5e3b168a3c563 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-09	dpdk: tx code rework	Damjan Marion	1	-0/+12
	Change-Id: Ifea9c772e8784642433b92091f5769eb9ec06890 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-05	autodetect alignment during _vec_resize	Damjan Marion	5	-8/+12
	Change-Id: I2896dbde78b5d58dc706756f4c76632c303557ae Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-05-04	Harmonize vec/pool_get_aligned object sizes and alignment requests	Dave Barach	2	-0/+4
	Object sizes must evenly divide alignment requests, or vice versa. Otherwise, only the first object will be aligned as requested. Three choices: add CLIB_CACHE_LINE_ALIGN_MARK(align_me) at the end of structures, manually pad to an even divisor or multiple of the alignment request, or use plain vectors/pools. static assert for enforcement. Change-Id: I41aa6ff1a58267301d32aaf4b9cd24678ac1c147 Signed-off-by: Dave Barach <dbarach@cisco.com>
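The divisibility rule and its compile-time enforcement can be sketched as follows. `SIZE_ALIGN_OK` and `padded_elt_t` are hypothetical; the actual check in vppinfra may be written differently. The struct shows the "manually pad" option from the message, assuming a 64-byte alignment request.

```c
#include <assert.h>
#include <stdint.h>

/* True when size evenly divides align, or align evenly divides size. */
#define SIZE_ALIGN_OK(size, align) \
    ((((size) % (align)) == 0) || (((align) % (size)) == 0))

/* 24 bytes of payload, manually padded to 64 bytes so that every
   element of a 64-byte-aligned vector starts on a 64-byte boundary. */
typedef struct {
    uint64_t counters[3];
    uint8_t  pad[40];
} padded_elt_t;

/* Compile-time enforcement (C11): the build fails if the padding is wrong. */
_Static_assert(SIZE_ALIGN_OK(sizeof(padded_elt_t), 64),
               "element size and alignment request do not divide evenly");
```

Without the padding the element would be 24 bytes, which neither divides 64 nor is a multiple of it, so the static assert would reject it.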
2018-04-30	Remove historical README file	Dave Barach	1	-43/+0
	Change-Id: I54a00686a7f3a61f583a5f701a0ab6c5480a455b Signed-off-by: Dave Barach <dave@barachs.net>
2018-04-25	dpdk: complete rework of the dpdk-input node	Damjan Marion	4	-5/+149
	Change-Id: If174d189de40e6f9ffae99997bba93a2519d9fda Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-04-18	vppinfra: make set_mempolicy failure non-critical unless NUMA_FORCE is set	Damjan Marion	1	-1/+2
	Change-Id: I6c1c855cf5fc2ee06f1c7ddd6576ca16cd556fdd Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-04-11	Clean up temp dir in failure cases	Dave Barach	1	-0/+3
	Change-Id: Icfb99a09726c01e96ff14967afbafa4116e02eff Signed-off-by: Dave Barach <dbarach@cisco.com>
2018-03-22	Add circular logging	Dave Barach	3	-21/+85
	Change-Id: Ide8bf41e24a427643a3a17b1c9089993790c12a6 Signed-off-by: Dave Barach <dave@barachs.net>
2018-03-12	Remove md5.[ch] from vppinfra	Dave Barach	3	-515/+0
	Removed the sole use of it from ip6_neighbor.c Change-Id: Ie53cb3b6a3a41ec0917ec2042e5006d0cfaefc01 Signed-off-by: Dave Barach <dave@barachs.net>
2018-03-09	Correct address calculation for VPP-1168	Lee Roberts	1	-1/+1
	Use (u64) cast to ensure proper address calculations. Change-Id: I6bad50010b140189f1b0af177e55da0045bd7a93 Signed-off-by: Lee Roberts <lee.roberts@hpe.com>
2018-03-06	glibc 2.27 fix	Marco Varlese	1	-0/+2
	With glibc 2.27 the memfd_create has been added to the devel libraries. That's causing the internally defined static function to clash with the system wide one. This patch addresses that issue on systems with latest glibc libraries. Change-Id: I788bf49b23d5b5f1cb1c0374e243d8a429178a71 Signed-off-by: Marco Varlese <marco.varlese@suse.com>
2018-03-04	vppinfra: fix clib_mem_vm_ext_alloc non-shared allocations	Damjan Marion	2	-3/+9
	Change-Id: I6d049c0875b91f67f008dc04ae7efe2f8ddc276e Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-02-26	Added u8x16,u32x4,u64x2 variants of _zero_byte_mask(x) for ARM/NEON platform. VPP-1129	Adrian Oanca	1	-0/+20
	Change-Id: I954acb56d901e42976e71534317f38d7c4359bcf Signed-off-by: Adrian Oanca <adrian.oanca@enea.com>
2018-02-24	u8x16_compare_byte_mask - optimize to use 128bit registers as suggested by Nintin	Adrian Oanca	1	-24/+9
	Change-Id: I88aabd34ef385d620695ac17ec3fe2f4a5177ada Signed-off-by: Adrian Oanca <adrian.oanca@enea.com>
2018-02-23	Add prefetch inlines, update bi-hash doc tags	Dave Barach	2	-12/+90
	Change-Id: I2e9d01ccba5288e89b886464436097d3cb7d2d18 Signed-off-by: Dave Barach <dave@barachs.net>
2018-02-22	bihash table size perf/scale improvements	Dave Barach	3	-41/+73
	Directly allocate and carve cache-line-aligned chunks of virtual memory. To a first approximation, bihash wasn't using clib_mem_free(...). We eliminate mheap object header/trailers, which improves space efficiency. We also eliminate the 4gb bihash table size limit. An 8_8 bihash w/ 100 million random entries uses 3.8 Gbytes. Change-Id: Icf925fdf99bce7d6ac407ac4edd30560b8f04808 Signed-off-by: Dave Barach <dave@barachs.net>