Age | Commit message (Collapse) | Author | Files | Lines |
|
The mheap allocator has been turned off for several releases. This
commit removes the cmake config parameter, parallel support for
dlmalloc and mheap, and the mheap allocator itself.
Type: refactor
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I104f88a1f06e47e90e5f7fb3e11cd1ca66467903
|
|
Type: fix
Ticket: VPP-1837
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I6b1ea13fc83460bf4ee75cb9249d83dddaa64ded
|
|
Type: improvement
Change-Id: Ib2cb708fdcb14fdea9298c10d67f8fe73887f18b
Signed-off-by: Damjan Marion <dmarion@me.com>
|
|
Type: fix
Change-Id: I77b03efcac04cc46550d03657464ab8de5d7da78
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Type: feature
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I999836a7893a89aac5243b111eac35fddd03e2a6
|
|
Type: refactor
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I13b239cd572ae6dfaec07019d3d9b7c0ed3edcfa
|
|
Type: fix
Change-Id: Ifd61c0683c85fe7340965c225ed23e46ec88e01a
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: feature
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I7e7d95a089dd849c1f01ecea84529d8dbf239f21
|
|
Sporadic reports of os_cpu_clock_frequency() returning 0.0 in highly
parallel container environments.
To avoid immediate division by zero:
Step 1: try estimate_clock_frequency(1e-3).
Step 2: give up. Pretend we have a 2gHz clock.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I19d0fe5259b757ab778599c7026ce485153b43fa
|
|
Fix minor memory leak
Type: fix
Ticket: VPP-1833
Fixes: 4af9ba1dab
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Id10fba70471ca78f73f14146054f6b12c5d4431f
|
|
Type: feature
Change-Id: I32256061b9509880eec843db2f918879cdafbe47
Signed-off-by: Damjan Marion <dmarion@me.com>
|
|
Apply exponential smoothing to the clock rate update calculation in
clib_time_verify_frequency(), with a half-life of 1 minute and a
sampling frequency of 16 seconds. Within 5 minutes or so, the
calculation converges
With each rate recalculation: reset total_cpu_time based on the kernel
timebase delta since vpp started, and the new clock rate
Improve the "show clock [verbose]" debug CLI command.
BFD echo + echo fail tests marked off until the BFD code can be
reworked a bit.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I24e88a78819b12867736c875067b386ef6115c5c
|
|
Type: fix
Change-Id: Ifb007207be97119e07c3a0eba4714eb519de043c
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: feature
Change-Id: I9d1f9f00ac011a93709850186dcf4cf5ea3bf88a
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Fixing compilation issuues for 32-bit also setting init flag for shm based bihash
Type: fix
Signed-off-by: Vijayabhaskar Katamreddy <vkatamre@cisco.com>
Change-Id: Ic2072c5ba7fc77d061ca9f1b844a71f6e22e58b2
|
|
Type: fix
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I93577acf559a8fa639aab7ec3f7cdbe7df9a248d
|
|
Type: refactor
the key is not modified by these functions
Change-Id: I578f054355fca69e8a086bb69013155a01ed759f
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Fix pg code to close it's open file descriptors before zero'ing the
pcap_main structure for re-use.
Ticket: VPP-1780
Type: fix
Signed-off-by: Christian E. Hopps <chopps@chopps.org>
Change-Id: I32945c6476ae83b8d210ee67ac78db3e8f786f46
|
|
Type: fix
Change-Id: I2b8273666db864d80012c39623ae866ac3527426
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: fix
Change-Id: I99e3951f8cfb7ab9d2f0a7dcee92199eab29043c
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: fix
Change-Id: Idb1fff8a172034044bb33d5b271a84d1fd672ef5
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: feature
Change-Id: I28f7a658be3f3beec9ea32635b60d1d3a10d9b06
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
If clib_time_verify_frequency() adjusts the clock frequency, transform
total_cpu_time to the new time coordinate space. Otherwise, we break
comparisons with previous clib_time_now() value.
Without this correction, time jumps in one direction or the other
depending on the sign of the frequency change. Reasonably harmless in
most cases, but under perfect storm conditions the wheels fall off.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I21802c2630e2c87ff817cd732b7d78bc022cd2d7
|
|
Introduce AddressSanitizer support: https://github.com/google/sanitizers/
This starts with heap instrumentation. vlib_buffer, bihash and stack
instrumentation should follow.
Type: feature
Change-Id: I7f20e235b2f79db72efd0e756f22c75f717a9884
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Valgrind never really worked well with VPP. Remove the partial support.
Type: refactor
Change-Id: Ic09773fd85f904fdd2240bc161e23a4c2b196cf6
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
set the address to MMAP_FAILED if mmap has failed,
so that we do not attempt to free it in the error
handling path.
Change-Id: I6e6b51a365fb68086dc20aa40a676a36af59a3ba
Type: fix
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Type: feature
Change-Id: I5bbf37969c9c51e40a013d1fc3ab966838eeb80d
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: refactor
Change-Id: Ic1c3e1f7987702cd88972acc34849dc1f585d5fe
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Coverity gets confused by ASSERT((l) <= vec_max_len(v)) when l is 0.
Type: fix
Change-Id: I247d7015b148233d8f195bcf41e9a047b7a21309
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
IPsec zero-es all allocated key memory including memory sur-allocated by
the allocator.
Move it to its own function in clib mem infra to make it easier to
instrument.
Type: refactor
Change-Id: Icd1c44d18b741e723864abce75ac93e2eff74b61
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
l2-flood and bier nodes reset vector length without updating it to its
effective size. Introduce a helper to do it (this allows ASAN to keep
track of the new vector size).
Type: refactor
Change-Id: I2d652550c440f0553a2b49c3ee3d37b49ebc16c3
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Fix a day-1 bug, possibly dating back as far as 2002. The zap64() game
involves fetching 8 byte chunks, and clearing octets not to be
included in the key.
That's fine *unless* the 8-byte fetch happens to cross a page boundary
into unmapped or no-access space.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I4607e9840032257c96ba7387f86c931c0921749d
|
|
Type: feature
Signed-off-by: MathiasRaoul <mathias.raoul@gmail.com>
Change-Id: I8d71078a9ed42326e19453ea10008c6bb6992c52
|
|
Define CLIB_PAUSE () to generate the "yield" instruction. No significant
performance changes were observed for clib_spinlock_t and clib_rwlock_t.
Type: feature
Change-Id: I59eb996e61c7a16007517e57e6996567302c1657
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
Type: feature
Change-Id: I8f5f4841965beb13ebc8c2a37ce0dc331c920109
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
In some cases it may happen that buffer is allocated by DPDK, and freed
by VPP native code. In such cases dpdk metadata is not reset, so we need
to do that during mempool dequeue. Template approach is taken to reduce
cost of that operation.
Type: fix
Fixes: 910d369
Change-Id: Ic239007cfc8fbceb965021c56963cda9d53f63be
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Add controls to list / not list a specific bihash in clib_all_bihashes,
to immediately initialize a bihash.
clib_bihash_init2 is now the primary API. It takes a typical args_t
structure. clib_bihash_init becomes a compatibility widget. It
fabricates an args_t and calls init2...
Type: refactor
Ticket: VPP-1758
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Ib3e1304884997cf7025af20bdc67a7dda290f15b
|
|
when use pcap cli to capture pcakets into two files rx01.pcap && rx02.pcap,
the first time:
1)pcap rx trace on max 100 intfc any file rx01.pcap
2)......the process of capture data to buffer......
3)pcap rx trace off
the second time:
4)pcap rx trace on max 100 intfc any file rx02.pcap
5)......the process of capture data to buffer......
6)pcap rx trace off
the pcap_write function bug in this two lines
pm->n_packets_captured = 0;
if (pm->n_packets_captured >= pm->n_packets_to_capture) referring to calling pcap_close()
will result in that the twice pcap cli both writes the packets
into rx01.pcap, but nothing into rx02.pcap. Beside, the rx02.pcap
file will not be created.
solution: separate the pcap_close() out of pcap_write()
Change-Id: Iedeb46f9cf0a4cb12449fd75a4014f95f3bb3fa8
Signed-off-by: Jack Xu <jack.c.xu@ericsson.com>
|
|
- Allow "Microarch model(family)" row to show PASS
revison as either string (like A0, B0) or number (like
1.0, 2.0).
- Fix part number for Marvell CN96XX
Change-Id: Ie01a3960c4e5e481be354dc8bb60f744e5c65737
Signed-off-by: Nitin Saxena <nsaxena@marvell.com>
|
|
Type: feature
This is needed when creating pthreads in client applications,
they need a way to set __os_thread_index per thread
that does not conflict with the binary API thread index.
If __os_thread_index is left to 0 in two client pthreads and
they call vl_msg_api_alloc and vec_resize at the same time
it can fail due to them sharing (and push/poping) the same
clib_per_cpu_mheaps slot.
Change-Id: I85d4248a39b641a4d3ad5a1c1bd6e0db5875fab6
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
|
|
Provide default packet_to_capture value. Display interface name
correctly for "pcap tx/rx trace status".
Type: fix
Signed-off-by: John Lo <loj@cisco.com>
Change-Id: I7064d0dbea236a9aff68bba7fbaf2c4a73b16c6f
Signed-off-by: John Lo <loj@cisco.com>
|
|
Type: fix
Change-Id: I67b72b5ad03b972198c27bc0d927867f41b0e20b
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Previous implementation of clib_rwlock_t used two spinlocks: one
writer lock, and one to guard the counter for the number of readers.
This implementation uses a single condition variable rw_cnt which
has the following properties:
if a writer has the rwlock, rw_cnt = -1
if the rwlock is free, rw_cnt = 0
otherwise, rw_cnt > 0 and rw_cnt = number of readers
rw_cnt will never be less than -1
Benchmarking:
The results below are the cycle counts from test_rwlock.c, configured so
that for 10000 iterations, 6 reader and 6 writer threads on separate cores
are spawned such that each writer thread increments a global counter
10000 times in each iteration. For Taishan, 4 reader and 4 writer
threads are spawned in each test.
x86 Xeon old rwlock: 12.473e8, 11.655e8, 13.201e8, 11.347e8, 13.182e8
x86 Xeon new rwlock: 5.881e8, 5.796e8, 6.536e8, 5.540e8, 5.890e8
Aarch64 ThX2* old rwlock: 9.263e7, 8.933e7, 9.074e7, 8.979e7, 9.378e7
Aarch64 ThX2* new rwlock: 7.221e7, 8.107e7, 7.515e7, 7.672e7, 7.386e7
A72 old rwlock: 3.268e6, 3.200e6, 3.086e6, 3.176e6, 3.170e6
A72 new rwlock: 1.261e6, 1.288e6, 1.251e6, 1.229e6, 1.234e6
*ThunderX2 used additional gcc options "-march=armv8.1-a+crc+crypto+lse"
Type: refactor
Change-Id: I7c347d3037b36205ab532cbcb52a374c846eb275
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
"timer.[ch]" used a signal handler to deliver timer callbacks. Without
indulging in a set of sigprocmask(...) system calls, it would be
unsafe to use the mechanism.
Rather than wait for another developer to accidentally open this
particular can of worms, best to remove the code. It's nothing more
than an attractive nuisance at this point.
Type: refactor
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Ia3e7b00a389c302b466605dff0c1bf3566b8dbbd
|
|
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Ie37ff66faba79e3b8f46c7a704137f9ef2acc773
|
|
Tested performance of a CAS implementation (using __atomic_compare_exchange)
against a TAS implementation (using __atomic_exchange) using test_spinlock.c
and found some performance improvement.
Generated assembly for CAS and TAS implementations show that TAS always
executes with a load-store dependency, but CAS moves a branch condition
between the load and store so that only a load occurs when the lock is free.
Benchmarking:
The results below are the cycle counts from test_spinlock.c, configured so
that for 10000 iterations, 12 threads on separate cores are spawned, each of
which increments a global counter 10000 times in each iteration. For
A72, 8 threads are spawned in each test.
x86 Xeon TAS: 7.333e8, 7.605e8, 7.535e8, 7.485e8, 7.321e8
x86 Xeon CAS: 5.842e8, 5.433e8, 5.389e8, 5.983e8, 5.552e8
Aarch64 ThX2* TAS: 9.852e7, 10.209e7, 9.190e7, 9.600e7, 9.224e7
Aarch64 ThX2* CAS: 7.640e7, 7.486e7, 7.425e7, 7.269e7, 7.534e7
A72 TAS: 7.289e6, 6.963e6, 7.208e6, 6.976e6, 7.200e6
A72 CAS: 1.695e6, 1.608e6, 1.600e6, 1.634e6, 1.746e6
*ThunderX2 used additional gcc options "-march=armv8.1-a+crc+crypto+lse"
Type: refactor
Change-Id: Ic5cd97991804f6b012707fad1a5d1a6edb96cd3d
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
Spawns a uniform number of writer and reader threads across a number of
cores where each writer thread increments a global variable a specified
number of times, and the reader threads continually poll the global's
value until the writers complete.
Type: test
Change-Id: I979c3734c6d03139d0802bff1846875d226f6fbb
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
Spinlock performance improved when implemented with compare_and_exchange
instead of test_and_set. All instances of test_and_set locks were refactored
to use clib_spinlock_t when possible. Some locks e.g. ssvm synchronize
between processes rather than threads, so they cannot directly use
clib_spinlock_t.
Type: refactor
Change-Id: Ia16b5d4cd49209b2b57b8df6c94615c28b11bb60
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
Spawns a uniform number of threads across a number of cores where each
thread increments a global variable a specified number of times.
Type: test
Change-Id: I12b3a37708a199c297d022348d99dbb0e8349a9f
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
All instances of test_and_set locks used the following sequence
to release the locks:
CLIB_MEMORY_BARRIER ();
p->lock = 0; // p is a generic struct with a TAS lock
Use clib_atomic_release to generate more efficient assembly code.
Type: refactor
Change-Id: Idca3a38b1cf43578108bdd1afe83b6ebc17a4c68
Signed-off-by: Jason Zhang <jason.zhang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
|