path: root/src/vnet/classify
Age | Commit message | Author | Files | Lines
2021-01-20 | classify: Layout classify entry to group data-plane accessed fields on one cache line | Neale Ranns | 3 | -102/+124
Type: refactor Signed-off-by: Neale Ranns <neale.ranns@cisco.com> Change-Id: I54128ba62f8dcc87c1845b33ed3637112d42a891
2021-01-19 | classify: crash on classify filter pcap del command | Steven Luong | 1 | -1/+2
If classify pcap filter was never configured, typing the delete command causes a crash. The reason is cm->classify_table_index_by_sw_if_index not yet allocated. The fix is to add a check before we access the vector. Type: fix Fixes: gerrit 28475 Signed-off-by: Steven Luong <sluong@cisco.com> Change-Id: Ia33bd91fa82d8ffc4490d4069155980a6e233268
2020-12-15 | classify: add pcap/trace classifier mgmt API calls | Jon Loeliger | 4 | -188/+709
Add lookup/get/set API calls to manage both PCAP and Trace filtering Classifier tables.
The "lookup" call may be used to identify a Classifier table within a chain of tables that matches a particular mask vector. For efficiency, this call should be used to determine to which table a match vector should be added.
The "get" calls return the first table within a chain (either a PCAP or the Trace) set of tables.
The "set" call may be used to add a new table to one such chain. If the "sort_masks" flag is set, the tables within the chain are ordered such that the most-specific mask is first, and the least-specific mask is last. A call that "sets" a chain to ~0 will delete and free all the tables within a chain.
The PCAP filters are per-interface, with "local0" (that is, sw_if_index == 0) holding the system-wide PCAP filter.
The Classifier used a reference-counted "set" for each PCAP or trace filter that it stored. The ref counts were not used, and the vector of tables was only used temporarily to establish a sorted order for tables based on masks. None of that complexity was actually warranted, and where it was used, the same could be achieved more simply.
Type: refactor Signed-off-by: Jon Loeliger <jdl@netgate.com> Change-Id: Icc56116cca91b91c631ca0628e814fb53f3677d2
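The "sort_masks" ordering described in this commit message can be pictured with a small sketch. It is not the VPP implementation: the struct, the function names, and the definition of "most specific" as "most mask bits set" are all assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a classifier table; not the VPP structure. */
typedef struct
{
  uint32_t table_index;		/* index of the table in some table pool */
  const uint8_t *mask;		/* mask vector */
  size_t mask_len;		/* mask length in bytes */
} sketch_table_t;

/* Count the bits set in a mask. Assumption: more set bits = more specific. */
static unsigned
sketch_mask_bits (const uint8_t *mask, size_t len)
{
  unsigned n = 0;
  for (size_t i = 0; i < len; i++)
    for (uint8_t b = mask[i]; b; b &= (uint8_t) (b - 1))
      n++;
  return n;
}

/* qsort comparator: descending by number of set mask bits. */
static int
sketch_cmp_specificity (const void *a, const void *b)
{
  const sketch_table_t *ta = a, *tb = b;
  unsigned na = sketch_mask_bits (ta->mask, ta->mask_len);
  unsigned nb = sketch_mask_bits (tb->mask, tb->mask_len);
  return (na < nb) - (na > nb);
}

/* Order a chain so that the most-specific mask comes first. */
static void
sketch_sort_chain (sketch_table_t *tables, size_t n)
{
  qsort (tables, n, sizeof (tables[0]), sketch_cmp_specificity);
}
```

After sorting, linking the chain means pointing each table at the table_index of its successor in the array, not at the array position itself.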
2020-12-14 | misc: move to new pool_foreach macros | Damjan Marion | 2 | -12/+12
Type: refactor Change-Id: Ie67dc579e88132ddb1ee4a34cb69f96920101772 Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-11-17 | tests: move classifier tests to src/vnet/classify/test | Dave Wallace | 2 | -0/+1059
- Refactor make test code to be co-located with the vpp feature source code. Type: test Signed-off-by: Dave Wallace <dwallacelf@gmail.com> Change-Id: Ibae85a18df0d5a53e2a59c678a2a27499f54ce6d
2020-11-10 | classify: fix classify filter trace del cli processing | Jon Loeliger | 1 | -35/+49
When a 'del' is used to delete a classify table, only the mask is needed to locate the table. Any match vector is unneeded. The tests failed to notice this, but if the test is run by hand in vppctl, it issues a parse error. Fix the test so that it doesn't supply irrelevant data. Fix the CLI processing to read always complete newline terminated line of input instead. This allows unneeded CLI parameters to be ignored. It also necessitated fixing a trace test which had then erroneously split a single CLI command over multiple lines. While in the area, fix a latent bug on table matching where a test for compatible mask vector sizes were not matching impedance properly (byte vs ux32x4). Type: fix Signed-off-by: Jon Loeliger <jdl@netgate.com> Change-Id: I1177ab1dd417f3d11f30eecbaa2b0fb1015c3ab5
2020-10-28 | misc: Break the big IP header files to improve compile time | Neale Ranns | 1 | -0/+2
Type: refactor Signed-off-by: Neale Ranns <neale.ranns@cisco.com> Change-Id: Id1801519638a9b97175847d7ed58824fb83433d6
2020-10-01 | classify: Fix a couple bugs in 'pcap filter' command. | Jon Loeliger | 1 | -12/+12
- Assert a valid set prior to first use.
- Sort tables by mask prior to selecting first table
- Use actual table indices and not loop index when linking tables
Type: fix Change-Id: I9c61c8b7fe97c38faed8f2fc1792d7232799f580 Signed-off-by: Jon Loeliger <jdl@netgate.com>
2020-09-28 | vppinfra: don't call dlmalloc API directly from the code | Damjan Marion | 2 | -7/+7
- it is confusing from end consumer perspective that some thing is somewhere called heap and somewhere mspace
- this is base for additional work where heap pointer is not the same thing like mspace
Type: improvement Change-Id: I644d5a0de17690d65d164d8cec3c5654571629ef Signed-off-by: Damjan Marion <damarion@cisco.com>
2020-09-28 | classify: use clib_crc32c on supporting uarch | Ray Kinsella | 1 | -0/+5
Use clib_crc32c in place of clib_xxhash on supporting uarch. Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: Icdfb4ffa92c2c9e7aebc3ec99f20e91392a103ab
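As a rough illustration of "use CRC32C where the microarchitecture supports it", the sketch below selects a hardware instruction at compile time and otherwise falls back to a bitwise software loop. It is not the VPP code: sketch_crc32c is a hypothetical name, and the seed and final inversion follow the conventional CRC-32C definition, which may differ from what clib_crc32c does.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE4_2__)
#include <nmmintrin.h>		/* _mm_crc32_u8 */
#endif

/* CRC-32C (Castagnoli) over a byte buffer.
   Hypothetical helper for illustration, not clib_crc32c. */
static uint32_t
sketch_crc32c (const uint8_t *p, size_t len)
{
  uint32_t crc = 0xffffffffu;	/* conventional initial value */
  for (size_t i = 0; i < len; i++)
    {
#if defined(__SSE4_2__)
      /* Hardware path: one instruction per input byte. */
      crc = _mm_crc32_u8 (crc, p[i]);
#else
      /* Software path: reflected polynomial 0x82f63b78, bit by bit. */
      crc ^= p[i];
      for (int k = 0; k < 8; k++)
	crc = (crc >> 1) ^ (0x82f63b78u & (0u - (crc & 1u)));
#endif
    }
  return ~crc;			/* conventional final inversion */
}
```

Both paths compute the same function, so a hash table keyed on this value behaves identically whichever path the build selects; only the per-byte cost differs.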
2020-06-30 | classify: fix debug CLI | Dave Barach | 1 | -1/+5
unformat_ip6_mask wasn't accounting for customized field names when deciding if it managed to parse at least one field. Type: fix Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I26cab4c6828b510e277079628af5115ac43af3ff
2020-05-15 | misc: removed executable bits from source files | Ray Kinsella | 1 | -0/+0
Identified and removed executable bit from source files in the tree.
    find . -perm 755 -name *.[ch] -exec chmod a-x {} \;
Type: improvement Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Change-Id: I00710d59fcc46ce5be5233109af4c8077daff74b
2020-02-12 | classify: fix "show classify filter" debug CLI | Dave Barach | 1 | -3/+1
Null pointer bug, memory leak. D'oh! Type: fix Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: Ic2865757ed9cbb7f48d23c7c30b64299eb5f6674
2020-02-11 | vppinfra: remove the historical mheap memory allocator | Dave Barach | 1 | -9/+0
The mheap allocator has been turned off for several releases. This commit removes the cmake config parameter, parallel support for dlmalloc and mheap, and the mheap allocator itself. Type: refactor Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I104f88a1f06e47e90e5f7fb3e11cd1ca66467903
2020-02-10 | misc: add FEATURE.yaml files | Dave Barach | 1 | -0/+10
For src/vnet/classify, src/vnet/cop, src/vnet/pg, and src/vlib/unix Type: docs Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: Ib6ab734608693a1e9562a44808246950616e8d36
2020-01-27 | classify: pcap / packet trace debug CLI bugs | Dave Barach | 1 | -0/+4
"classify filter trace ... " and "classify filter pcap ..." are mutually exclusive. vnet_pcap_dispatch_trace_configure needs to check for set->table_indices == NULL. Type: fix Ticket: VPP-1827 Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I43733364087ffb0a43de92e450955033431d559d
2020-01-20 | classify: fix pcap filter set init | Florin Coras | 1 | -4/+2
Type: fix Change-Id: I6a48a6c14bfb84b3460e8211021bc9df6e915dba Signed-off-by: Florin Coras <fcoras@cisco.com>
2019-12-25 | classify: "classify filter ..." debug CLI cleanup | Dave Barach | 1 | -7/+8
The pcap trace filter initial table index lives in cm->filter_set_by_sw_if_index [0], which corresponds to the "local0" interface. Debug cli makes sure that folks don't accidentally specify the "local0" interface. At least it does now... Fix the "vlib format.c code coverage" test in test/test_vlib.py. Type: fix Change-Id: I35320bc2c8f0c6f1f8c12e3529d1938548185151 Signed-off-by: Dave Barach <dave@barachs.net>
2019-12-17 | classify: forbid invalid match config | Benoît Ganne | 1 | -0/+3
Forbid too long match to be configured. Type: fix Change-Id: Icfced0f86821d5febd6a3c81e1315bd9737498c0 Signed-off-by: Benoît Ganne <bganne@cisco.com>
2019-12-10 | api: multiple connections per process | Dave Barach | 1 | -1/+1
Type: feature Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I2272521d6e69edcd385ef684af6dd4eea5eaa953
2019-12-05 | classify: vpp packet tracer support | Dave Barach | 1 | -32/+119
Configure n-tuple classifier filters which apply to the vpp packet tracer. Update the documentation to reflect the new feature. Add a test vector. Type: feature Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: Iefa911716c670fc12e4825b937b62044433fec36
2019-12-05 | classify: Fix 2 coverity errors | Jon Loeliger | 1 | -0/+8
Validate two tainted scalars, filter_sw_if_index, that came from an API message. Type: fix Change-Id: I3ac8a09f91f380185e36babeaa6330691f7cb24b Signed-off-by: Jon Loeliger <jdl@netgate.com>
2019-12-03 | classify: API cleanup | Jakub Grajciar | 2 | -54/+94
Use consistent API types. Type: fix Change-Id: Ib5b1efa76f0a9cecc0bc146f8f8a47c2442fc1db Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com> Signed-off-by: Ole Troan <ot@cisco.com> Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
2019-11-29 | classify: debug cli %v not %s | Dave Barach | 1 | -2/+2
Type: fix Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I294f0b773375f6dce020b771db0726ceb5d812cc
2019-09-26 | misc: add vnet classify filter set support | Dave Barach | 2 | -29/+190
Type: feature Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I79b216d2499df143f53977e5b70382f6f887e0bc
2019-09-26 | classify: use vector code even when data is not aligned | Damjan Marion | 1 | -142/+126
Type: feature Change-Id: I8f5f4841965beb13ebc8c2a37ce0dc331c920109 Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-09-20 | misc: classifier-based packet trace filter | Dave Barach | 2 | -3/+393
See .../src/vnet/classify/trace_classify.h for the business end of the scheme. It would be best to hash pkts, prefetch buckets, and do the primary table lookups two at a time. The inline as given works, but perf tuning will be required. "At least it works..."
Add "classify filter" debug cli, for example:
    classify filter mask l3 ip4 src dst \
        match l3 ip4 dst 192.168.2.10 src 192.168.1.10
Add "pcap rx | tx trace ... filter" to use the current classify filter chain
Patch includes sphinx documentation and doxygen tags.
Next step: device-driver integration
Type: feature Signed-off-by: Dave Barach <dave@barachs.net> Change-Id: I05b1358a769f61e6d32470e0c87058f640486b26
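The mask/match pair in the "classify filter" example can be read as: AND the packet bytes with the mask, then compare the result with the match key. The sketch below shows only that comparison; the function name and the byte-at-a-time loop are assumptions for illustration, and the real classifier hashes the masked data and works on vector registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if (data & mask) == match over len bytes, else 0.
   Hypothetical helper; assumes match was already ANDed with mask
   when the session was added. */
static int
sketch_mask_match (const uint8_t *data, const uint8_t *mask,
		   const uint8_t *match, size_t len)
{
  uint8_t diff = 0;
  for (size_t i = 0; i < len; i++)
    diff |= (uint8_t) ((data[i] & mask[i]) ^ match[i]);
  return diff == 0;
}
```

For the CLI example above, the mask would cover the IPv4 source and destination address fields with all-ones bytes, and the match key would hold 192.168.1.10 and 192.168.2.10 at those offsets.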
2019-09-20 | classify: remove includes from classifier header file | Damjan Marion | 2 | -10/+2
Type: refactor Change-Id: I6f0af1c3078edce1c1b29a8b99c4a232d7084d33 Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-07-31 | vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t | jaszha03 | 2 | -10/+6
Spinlock performance improved when implemented with compare_and_exchange instead of test_and_set. All instances of test_and_set locks were refactored to use clib_spinlock_t when possible. Some locks e.g. ssvm synchronize between processes rather than threads, so they cannot directly use clib_spinlock_t. Type: refactor Change-Id: Ia16b5d4cd49209b2b57b8df6c94615c28b11bb60 Signed-off-by: Jason Zhang <jason.zhang2@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
2019-07-30 | vppinfra: refactor use of CLIB_MEMORY_BARRIER () | jaszha03 | 1 | -2/+1
All instances of test_and_set locks used the following sequence to release the locks:
    CLIB_MEMORY_BARRIER ();
    p->lock = 0; // p is a generic struct with a TAS lock
Use clib_atomic_release to generate more efficient assembly code.
Type: refactor Change-Id: Idca3a38b1cf43578108bdd1afe83b6ebc17a4c68 Signed-off-by: Jason Zhang <jason.zhang2@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
2019-07-30 | vppinfra: conformed spinlocks to use CLIB_PAUSE | jaszha03 | 1 | -1/+2
Modified test-and-set spin locks to call CLIB_PAUSE () when spinning for code consistency. Decreases the memory bandwidth consumed. Type: fix Change-Id: I1cca4f87f44f23f257c7a35466cd2e7767072f51 Signed-off-by: Jason Zhang <jason.zhang2@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
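The three spinlock commits from July 2019 (compare-and-exchange acquire, release store instead of a full barrier, pause while spinning) can be summarized in one sketch using C11 atomics. This is not clib_spinlock_t: every name below is hypothetical, and the pause hint is an assumption about what a CLIB_PAUSE-style macro would expand to on each architecture.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration; not clib_spinlock_t. */
typedef struct
{
  atomic_uint lock;		/* 0 = free, 1 = held */
} sketch_spinlock_t;

/* CPU hint for spin-wait loops; a no-op where no hint is known. */
static inline void
sketch_pause (void)
{
#if defined(__x86_64__) || defined(__i386__)
  __builtin_ia32_pause ();
#elif defined(__aarch64__)
  __asm__ volatile ("yield");
#endif
}

/* Acquire with compare-and-exchange, pausing between attempts. */
static inline void
sketch_lock (sketch_spinlock_t *p)
{
  unsigned expected = 0;
  while (!atomic_compare_exchange_weak_explicit (&p->lock, &expected, 1,
						 memory_order_acquire,
						 memory_order_relaxed))
    {
      sketch_pause ();
      expected = 0;		/* a failed exchange overwrites expected */
    }
}

/* Release with a store-release rather than a full barrier plus a plain
   store. */
static inline void
sketch_unlock (sketch_spinlock_t *p)
{
  atomic_store_explicit (&p->lock, 0, memory_order_release);
}
```

The store-release orders only the writes made inside the critical section before the unlock, which is what lets the compiler emit cheaper code than a full memory barrier on weakly ordered machines.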
2019-07-25 | misc: remove unnecessary cast in classify | Zhiyong Yang | 2 | -24/+14
Type: style Change-Id: I7628f7fba8250afe41f115595cca4129e43350d3 Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
2019-05-22 | new_log2_pages may increase 2 when try_resplit | dongjuan | 1 | -1/+0
Signed-off-by: dongjuan <dong.juan1@zte.com.cn> Change-Id: I3ebbe7d2d11453700503df7f3be549781d8b73a7
2019-05-16 | init / exit function ordering | Dave Barach | 1 | -6/+6
The vlib init function subsystem now supports a mix of procedural and formally-specified ordering constraints. We should eliminate procedural knowledge wherever possible.
The following schemes are *roughly* equivalent:
    static clib_error_t *init_runs_first (vlib_main_t *vm)
    {
      clib_error_t *error;
      ... do some stuff...
      if ((error = vlib_call_init_function (init_runs_next)))
        return error;
      ...
    }
    VLIB_INIT_FUNCTION (init_runs_first);
and
    static clib_error_t *init_runs_first (vlib_main_t *vm)
    {
      ... do some stuff...
    }
    VLIB_INIT_FUNCTION (init_runs_first) =
    {
      .runs_before = VLIB_INITS("init_runs_next"),
    };
The first form will [most likely] call "init_runs_next" on the spot. The second form means that "init_runs_first" runs before "init_runs_next," possibly much earlier in the sequence.
Please DO NOT construct sets of init functions where A before B actually means A *right before* B. It's not necessary - simply combine A and B - and it leads to hugely annoying debugging exercises when trying to switch from ad-hoc procedural ordering constraints to formal ordering constraints.
Change-Id: I5e4353503bf43b4acb11a45fb33c79a5ade8426c Signed-off-by: Dave Barach <dave@barachs.net>
2019-04-08 | fixing typos | Jim Thompson | 1 | -1/+1
Change-Id: I215e1e0208a073db80ec6f87695d734cf40fabe3 Signed-off-by: Jim Thompson <jim@netgate.com>
2019-03-18 | vnet: disable the expansion of the heap allocated for classifier tables | Andrew Yourtchenko | 1 | -0/+2
Classifier data structures assume the contiguous chunk of memory within the heap. Default heap flags for dlmalloc allow for heap growth. When that happens, the memory becomes discontiguous. This results in symptoms that are more cryptic than necessary. Disabling the expand makes the session allocation behavior of the classifier the same for dlmalloc as for the legacy allocator. Change-Id: I2f725b5f78a31a8eaa5f5a20dfdd7e1129662f6a Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2019-03-07 | classify: migrate old MULTIARCH macros to VLIB_NODE_FN | Filip Tehlar | 2 | -26/+16
Change-Id: I01730ec9eb8033074c8710daf0848c3573293aeb Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
2019-01-02 | Fixes for building for 32bit targets: | David Johnson | 1 | -0/+1
* u32/u64/uword mismatches
* pointer-to-int fixes
* printf formatting issues
* issues with incorrect "ULL" and related suffixes
* structure alignment and padding issues
Change-Id: I70b989007758755fe8211c074f651150680f60b4 Signed-off-by: David Johnson <davijoh3@cisco.com>
2018-12-13 | Fix VPP-1530 Classify session creation error | jackiechen1985 | 1 | -1/+1
Change-Id: I6f877be6b3a1ef7100607560d430400bb824b6ba Signed-off-by: jackiechen1985 <xiaobo.chen@tieto.com>
2018-11-14 | Remove c-11 memcpy checks from perf-critical code | Dave Barach | 1 | -20/+20
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1 Signed-off-by: Dave Barach <dave@barachs.net>
2018-10-23 | c11 safe string handling support | Dave Barach | 2 | -14/+14
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab Signed-off-by: Dave Barach <dave@barachs.net>
2018-10-19 | vppinfra: add atomic macros for __sync builtins | Sirshak Das | 1 | -1/+1
This is first part of addition of atomic macros with only macros for __sync builtins.
- Based on earlier patch by Damjan (https://gerrit.fd.io/r/#/c/10729/)
Additionally
- clib_atomic_release macro added and used in the absence of any memory barrier.
- clib_atomic_bool_cmp_and_swap added
Change-Id: Ie4e48c1e184a652018d1d0d87c4be80ddd180a3b Original-patch-by: Damjan Marion <damarion@cisco.com> Signed-off-by: Sirshak Das <sirshak.das@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com>
2018-09-08 | L2 BVI/FIB: Update L2 FIB table when BVI's MAC changes | Neale Ranns | 3 | -3/+3
also some moving of l2 headers to reduce dependencies Change-Id: I7a700a411a91451ef13fd65f9c90de2432b793bb Signed-off-by: Neale Ranns <nranns@cisco.com>
2018-08-13 | classify_add_del_session API: Use more descriptive docstring (VPP-1385) | Juraj Sloboda | 1 | -1/+4
Change-Id: I30788c0dd1ee012e786bb3127bf2743ab0bfdc70 Signed-off-by: Juraj Sloboda <jsloboda@cisco.com>
2018-08-09 | Fix "Old Style VLA" build warnings | Juraj Sloboda | 2 | -7/+36
Change-Id: I8d42f6ed58ec34298d41edcb3d783e7e9ded3eec Signed-off-by: Juraj Sloboda <jsloboda@cisco.com>
2018-07-18 | Add config option to use dlmalloc instead of mheap | Dave Barach | 1 | -0/+8
Configure w/ --enable-dlmalloc, see .../build-data/platforms/vpp.mk src/vppinfra/dlmalloc.[ch] are slightly modified versions of the well-known Doug Lea malloc. Main advantage: dlmalloc mspaces have no inherent size limit. Change-Id: I19b3f43f3c65bcfb82c1a265a97922d01912446e Signed-off-by: Dave Barach <dave@barachs.net>
2018-02-26 | Added u8x16,u32x4,u64x2 variants of _zero_byte_mask(x) for ARM/NEON platform. VPP-1129 | Adrian Oanca | 1 | -8/+4
Change-Id: I954acb56d901e42976e71534317f38d7c4359bcf Signed-off-by: Adrian Oanca <adrian.oanca@enea.com>
2018-02-20 | vppinfra: CLIB_HAVE_VEC128 mandates SSE4.2 | Damjan Marion | 1 | -1/+1
Change-Id: I6511110d0472203498a4f8741781eeeeb4f90844 Signed-off-by: Damjan Marion <damarion@cisco.com>
2018-02-08 | classifier-based ACL: testcases for L2 ACLs + fix the enabling of outbound L2 ACL | Andrew Yourtchenko | 1 | -3/+6
There was no test coverage for the L2 ACL (other than indirect by means of ACL plugin tests), so the enabling of the outbound ACL got fumbled throughout the revisions of the refactoring. Fix both issues - the error and the lack of test coverage for L2 ACL. Change-Id: Ib7f42780ef84b4a4f70bd88d7319aeeda866cf06 Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2018-02-07 | classifier-based ACL: refactor + add output ACL | Andrew Yourtchenko | 6 | -77/+230
For implementation of MACIP ACLs enhancement (VPP-1088), an outbound classifier-based ACL would be needed. There was an existing incomplete code for outbound ACLs, it looked almost exact copy of input ACLs, minus the various enhancements, trying to sync that code seemed error-prone and cumbersome to maintain in the longer run. This change refactors the input+output ACLs processing into a unified routine (thus any changes will have effect on both), and also adds the API to set the output interface ACL, with the same format and semantics as the existing input one (except working on output ACL of course). WARNING: IP outbound ACL in L3 mode clobbers the ip.* fields in the vnet_buffer_opaque_t, since the code is using l2_classify.* The net_buffer (p0)->ip.save_rewrite_length is rescued into l2_classify.pad.l2_len, and used to rewind the header in case of drop, so that ipX_drop prints something sensible. Change-Id: I62f814f1e3650e504474a3a5359edb8a0a8836ed Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>