Age | Commit message (Collapse) | Author | Files | Lines |
|
In a case where one pounds on a single kvp in a KVP_AT_BUCKET_LEVEL
table, the code would sporadically return a transitional value (junk)
from a half-deleted kvp. At most, 64-bits worth of the kvp will be
written atomically, so using memset(...) to smear 0xFF's across a kvp
to free it left a lot to be desired.
Performance impact: very mild positive, thanks to FC for doing a
multi-thread host stack perf/scale test.
Added an ASSERT to catch attempts to add a (key,value) pair which
contains the magic "free kvp" value.
Type: fix
Signed-off-by: Dave Barach <dave@barachs.net>
Change-Id: I6a1aa8a2c30bc70bec4b696ce7b17c2839927065
|
|
bihash keys are less than 64-bytes, do not overflow.
Type: fix
Change-Id: Ic55407eb9ccca38058f7e62b363ec05c8445fbcb
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Type: improvement
Change-Id: Ifb0fa114414aa2fdc244f964612ca3ac3e29b5e1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Template instances can allocate BIHASH_KVP_PER_PAGE data records
tangent to the bucket, to remove a dependent read / prefetch.
Template instances can ask for immediate memory allocation, to avoid
several branches in the lookup path.
Clean up l2 fib, gpb plugin codes: use clib_bihash_get_bucket(...)
Use hugepages for bihash allocation arenas
Type: improvement
Signed-off-by: Dave Barach <dave@barachs.net>
Signed-off-by: Damjan Marion <damarion@cisco.com>
Change-Id: I92fc11bc58e48d84e2d61f44580916dd1c56361c
|
|
Example / unit-test in .../src/plugins/unittest/bihash_test.c
Change-Id: I23fd0ba742d65291667a755965aee1a3d3477ca2
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Add a bucket-level lock bit. Use a spinlock only when actually
allocating, freeing, or splitting a bucket. Should improve
multi-thread add/del performance.
Change-Id: I3e40e2a8371685457f340d6584dea14e3207f2b0
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Looks like CPU doesn't like overlaping loads.
This new codes in some cases shows 3-4 clock improvements.
Change-Id: Ia1b49976ad95140c573f892fdc0a32eebbfa06c8
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
It'll be interesting to see what the perf trend job
says about this change.
Change-Id: I66307a19a865011ac9660108098874fa1481c895
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
bihash_48_8 case:
Scalar code: 6 clocks
SSE4.2 code: 3 clocks
AVX2 code: 2.27 clocks
AVX512 code: 1.5 clocks
Change-Id: I40700175835a1e7321276e47eadbf9771d3c5a68
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I30a3df53bc5fe5ab991a657918eb502bd2913440
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|