Age | Commit message (Collapse) | Author | Files | Lines |
|
* Avoid doing expensive bit extraction for most likely case where bucket
.log2_page_size == 0 and .linear_search == 0, saves 3-5 cycles for
lookup, data_prefetch and add operation
* use bextr instruction when available (x86 BMI instruction set)
Type: improvement
Change-Id: I163df36a29287482c5f133be8b21d62a2f7440de
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Id28299a188feefa1899d835fd499f018af95d81b
|
|
Type: fix
Ticket: VPP-1837
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I9ec87d2293d8f92c3e488a0f61083cf815ac496c
|
|
Type: fix
Change-Id: Ic7441a09bab2cabc7632ee502368584ac022f997
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
It is a bad idea to poison memory after munmap because the address space
can be reused (eg. for global data of dlopen()ed object) and ASan model
allows access by default.
Moreover, access to a stale address space will fault.
Type: fix
Change-Id: I356de422f255447d9d50a3a71fb0c2eaa790d731
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: improvement
Change-Id: Ia04bd26ecd689894753e036e52920316de611910
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Measured improvement is from 439 to 167 clocks for add operation
in 16_8 case...
Type: improvement
Change-Id: I975ff46ff30b983a3ec80a5cde25ccb68d7fa03b
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Template instances can allocate BIHASH_KVP_PER_PAGE data records
tangent to the bucket, to remove a dependent read / prefetch.
Template instances can ask for immediate memory allocation, to avoid
several branches in the lookup path.
Clean up l2 fib, gpb plugin codes: use clib_bihash_get_bucket(...)
Use hugepages for bihash allocation arenas
Type: improvement
Signed-off-by: Dave Barach <dave@barachs.net>
Signed-off-by: Damjan Marion <damarion@cisco.com>
Change-Id: I92fc11bc58e48d84e2d61f44580916dd1c56361c
|
|
Type: improvement
Signed-off-by: Yu Sun <yusun2@cisco.com>
Change-Id: I68aea7c5776c5b31081c98388df4133d2062218a
|
|
Type: improvement
Change-Id: I7e11bf72be5fad5967724c038eb649a261294ca0
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
There is no need to calculate bucket2 if there is hit on bucket1
Type: improvement
Change-Id: Id01c37963497668c0160068501294568a181d011
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Change-Id: I547263ae954506f11101666ff768524fbfdb579e
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: improvement
Change-Id: Ifb4eec00fd4f1d19e4b0af802d015a35e402e0af
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I921adae4ad797bf80cfcdb05d2a89ace9183a89a
|
|
Type: feature
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I72cacfb5696dca74335f31415c0df795467615a5
|
|
Type: improvement
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Change-Id: Idfec9cb9370a8cf4966d3fdfa440496f21e17005
|
|
Type: improvement
Change-Id: I073bb7bea2a55eabbb6c253b003966f0a821e4a3
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Fix libffi package name for Ubuntu 20.04
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Idc567717494b4c40c307f20a40d5e10cd26b0a46
|
|
Type: improvement
Change-Id: If1164d2eb81e9d4748436cb1bb8b164857d70565
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Direct Verb allows for direct access to NIC HW rx/tx rings. This patch
introduce TX direct verb support for Mellanox ConnectX-4/5 adapters.
'dv' mode must be explicitely selected at interface creation to benefit
from this.
Type: feature
Change-Id: If830ba9f33db73299acdbddc68b5c09eaf6add98
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: feature
Change-Id: I3f287ab536a482c366ad7df47e1c04e640992ebc
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Add a clib_time_t * argument to clib_timebase_init(...), to encourage
client code to share the vlib_main_t's clib_time_t object.
Display the current day / date in GMT via the "show time" debug CLI.
Fix the test framework so it processes the new "show time" output format.
Type: refactor
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I5e52d57eb164b7cdb6355362d520df6928491711
|
|
A partial revert of gerrit 25729. The last_run_time == 0.0 check is
necessary and remains in place.
Type: fix
Fixes: 3d9f134
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I3d2c9f90b2bc867f02c4749a5b19f997b84185b9
|
|
Type: improvement
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I5db3457a9fed11d6ecf6eaabcdf8f1d1177b2a9f
|
|
Deal with arbitrary kernel reference time changes: for example,
yanking the kernel reference clock back to a time before vpp started.
Best practice involves aligning the kernel reference clock with
reality prior to starting apps which use 10us granularity timers.
Compute change in the reference and cpu clocks. Recompute the vpp
start time reference and and total cpu clock count, using the current
clock tick rate.
Next, compute a new clock rate sample. If the sample seems sane,
factor it into the exponentially smoothed clock rate and recalculate
total cpu ticks based on the new clock rate.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Ib6132ffbbe0e6d140725676de5e35be112a31dfe
|
|
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I4b3ff6e9c8e1d76037b168aeab36dcb5b4482260
|
|
Type: fix
Change-Id: I23250fcbc70086584b5448baec9af9a1528992f5
Signed-off-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Type: refactor
Switch from a wrapped byte space to a "continuous" one wherein fifo
chunks are appended to the fifo as more data is enqueued and chunks are
removed as data is dequeued.
The fifo is still subject to a maximum size, i.e., maximum number of
bytes that can be enqueued, so the max number of chunks associated to
the fifo is also constrained.
When enqueueing data, which must fit within the available free space, if
not enough "supporting" chunk memory is available, the fifo asks the
fifo segment for enough chunk memory to ensure that the write can
succeed. To avoid allocating large amounts of small chunks due to small
writes, if possible, the size of the chunks requested is lower capped by
min_alloc.
When dequeuing data, all the chunks that have been completely drained,
i.e., head moved beyond the chunks’ end bytes, are unlinked from the
fifo and returned to the fifo segment. The one exception to this is the
last chunk which is never unlinked.
Change-Id: I98c1dbd9135fb79650365c7e40c29238b96cd4ee
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Avoid tracking with rbtrees all of the chunks associated to a fifo.
Instead, only track chunks when doing out-of-order operations (peek or
ooo enqueue).
Type: refactor
Change-Id: I9f8bd266211746637d98e6a12ffc4b2d6346950a
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Introduced on intel IceLake uarch.
Type: feature
Change-Id: I1514c76c34e53ce0577666caf32a50f95eb6548f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Remove duplicate space allocation for the pool header. Not significant
w/ CLIB_CACHE_LINE_BYTES >= 64 since the code rounds the size of the
pool header to an even multiple of the cache line size.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I923f2a60e7565cf2dfbc18d78264bf82ff30c926
|
|
Type: refactor
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Id1e7c0926036db4601c91438397ceed22381fc07
|
|
vextq_u8(...) reuqires constant value so instead of
inline function we need to use macro.
Type: fix
Signed-off-by: Damjan Marion <dmarion@me.com>
Change-Id: I9c1d878c9ec750f0ed5b5eac4dffde50e97e7357
|
|
Add an ALWAYS_ASSERT (...) macro, to (a) shut up coverity, and (b)
check the indicated condition in production images.
As in:
p = hash_get(...);
ALWAYS_ASSERT(p) /* was ASSERT(p) */
elt = pool_elt_at_index(pool, p[0]);
This may not be the best way to handle a specific case, but failure to
check return values at all followed by e.g. a pointer dereference
isn't ok.
Type: fix
Ticket: VPP-1837
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: Ia97c641cefcfb7ea7d77ea5a55ed4afea0345acb
|
|
vpclmulqdq is introduced on intel icelake architecture and
allows computing 4 carry-less multiplications in paralled by using
512-bit SIMD registers
Type: feature
Change-Id: Idb09d6f51ba6f116bba11649b2d99f649356d449
Signed-off-by: Damjan Marion <damjan.marion@gmail.com>
|
|
Type: refactor
Change-Id: I61e25942de318d03fb3d75689259709d687479bc
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This allows us to combine 2 XOR operations into signle instruction
which makes difference in crypto op:
- in x86, by using ternary logic instruction
- on ARM, by using EOR3 instruction (available with sha3 feature)
Type: refactor
Change-Id: Ibdf9001840399d2f838d491ca81b57cbd8430433
Signed-off-by: Damjan Marion <damjan.marion@gmail.com>
|
|
Type: feature
Change-Id: I4f96b0af13b875d491704b010328a1814e1dbda1
Signed-off-by: Damjan Marion <dmarion@me.com>
|
|
For debugging. Do not set this option in production.
Type: feature
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I5e59671c4932e064bc087b85bf9c62c6f3bf48cf
|
|
For people tired of typen CLIB_CACHE_LINE_BYTES....
Type: improvement
Change-Id: I7658a8525ff6e3edc81a29b05a6fda33e537806e
Signed-off-by: Damjan Marion <dmarion@me.com>
|
|
Type: improvement
Change-Id: I310e421513e9d3f96ad7debc72c9407e231962b8
Signed-off-by: Damjan Marion <dmarion@me.com>
|
|
The mheap allocator has been turned off for several releases. This
commit removes the cmake config parameter, parallel support for
dlmalloc and mheap, and the mheap allocator itself.
Type: refactor
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I104f88a1f06e47e90e5f7fb3e11cd1ca66467903
|
|
Type: fix
Ticket: VPP-1837
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I6b1ea13fc83460bf4ee75cb9249d83dddaa64ded
|
|
Type: improvement
Change-Id: Ib2cb708fdcb14fdea9298c10d67f8fe73887f18b
Signed-off-by: Damjan Marion <dmarion@me.com>
|
|
Type: fix
Change-Id: I77b03efcac04cc46550d03657464ab8de5d7da78
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Type: feature
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I999836a7893a89aac5243b111eac35fddd03e2a6
|
|
Type: refactor
Signed-off-by: Florin Coras <fcoras@cisco.com>
Change-Id: I13b239cd572ae6dfaec07019d3d9b7c0ed3edcfa
|
|
Type: fix
Change-Id: Ifd61c0683c85fe7340965c225ed23e46ec88e01a
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: feature
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I7e7d95a089dd849c1f01ecea84529d8dbf239f21
|
|
Sporadic reports of os_cpu_clock_frequency() returning 0.0 in highly
parallel container environments.
To avoid immediate division by zero:
Step 1: try estimate_clock_frequency(1e-3).
Step 2: give up. Pretend we have a 2gHz clock.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I19d0fe5259b757ab778599c7026ce485153b43fa
|