diff options
Diffstat (limited to 'docs/developer/corefeatures/fib/graphwalks.rst')
-rw-r--r-- | docs/developer/corefeatures/fib/graphwalks.rst | 80 |
1 files changed, 80 insertions, 0 deletions
diff --git a/docs/developer/corefeatures/fib/graphwalks.rst b/docs/developer/corefeatures/fib/graphwalks.rst new file mode 100644 index 00000000000..e740660a2ed --- /dev/null +++ b/docs/developer/corefeatures/fib/graphwalks.rst @@ -0,0 +1,80 @@ +.. _graphwalks: + +Graph Walks +^^^^^^^^^^^^ + +All FIB object types are allocated from a VPP memory pool [#f13]_. The objects are thus +susceptible to memory re-allocation, therefore the use of a bare "C" pointer to refer +to a child or parent is not possible. Instead there is the concept of a *fib_node_ptr_t* +which is a tuple of type,index. The type indicates what type of object it is +(and hence which pool to use) and the index is the index in that pool. This allows +for the safe retrieval of any object type. + +When a child resolves via a parent it does so knowing the type of that parent. The +child to parent relationship is thus fully known to the child, and hence a forward +walk of the graph (from child to parent) is trivial. However, a parent does not choose +its children, it does not even choose the type. All object types that form part of the +FIB control plane graph all inherit from a single base class; *fib_node_t*. A *fib_node_t* +identifies the object's index and its associated virtual function table provides the +parent a mechanism to visit that object during the walk. The reason for a back-walk +is to inform all children that the state of the parent has changed in some way, and +that the child may itself need to update. + +To support the many to one, child to parent, relationship a parent must maintain a +list of its children. The requirements of this list are; + +- O(1) insertion and delete time. Several child-parent relationships are made/broken during route addition/deletion. +- Ordering. High priority children are at the front, low priority at the back (see section Fast Convergence) +- Insertion at arbitrary locations. + +To realise these requirements the child-list is a doubly linked-list, where each element +contains a *fib_node_ptr_t*. The VPP pool memory model applies to the list elements, so +they are also identified by an index. When a child is added to a list it is returned the +index of the element. Using this index the element can be removed in constant time. +The list supports 'push-front' and 'push-back' semantics for ordering. To walk the children +of a parent is then to iterate this list. + +A back-walk of the graph is a depth first search where all children in all levels of the +hierarchy are visited. Such walks can therefore encounter all object instances in the +FIB control plane graph, numbering in the millions. A FIB control-plane graph is cyclic +in the presence of a recursion loop, so the walk implementation has mechanisms to detect +this and exit early. + +A back-walk can be either synchronous or asynchronous. A synchronous walk will visit the +entire section of the graph before control is returned to the caller, an asynchronous +walk will queue the walk to a background process, to run at a later time, and immediately +return to the caller. To implement asynchronous walks a *fib_walk_t* object it added to +the front of the parent's child list. As children are visited the *fib_walk_t* object +advances through the list. Since it is inserted in the list, when the walk suspends +and resumes, it can continue at the correct location. It is also safe with respect to +the deletion of children from the list. New children are added to the head of the list, +and so will not encounter the walk, but since they are new, they already have the up to +date state of the parent. + +A VLIB process 'fib-walk' runs to perform the asynchronous walks. VLIB has no priority +scheduling between respective processes, so the fib-walk process does work in small +increments so it does not block the main route download process. Since the main download +process effectively has priority numerous asynchronous back-walks can be started on the +same parent instance before the fib-walk process can run. FIB is a 'final state' application. +If a parent changes n times, it is not necessary for the children to also update n +times, instead it is only necessary that this child updates to the latest, or final, +state. Consequently when multiple walks on a parent (and hence potential updates to a +child) are queued, these walks can be merged into a single walk. This +is the main reason the walks are designed this way, to eliminate (as +much as possible) redundant work and thus converge the system as fast +as possible. + +Choosing between a synchronous and an asynchronous walk is therefore a trade-off between +time it takes to propagate a change in the parent to all of its children, versus the +time it takes to act on a single route update. For example, if a route update were to +affect millions of child recursive routes, then the rate at which such updates could be +processed would be dependent on the number of child recursive route which would not be +good. At the time of writing FIB2.0 uses synchronous walk in all locations except when +walking the children of a path-list, and it has more than 32 [#f15]_ children. This avoids the +case mentioned above. + +.. rubric:: Footnotes: + +.. [#f13] Fast memory allocation is crucial to fast route update times. +.. [#f14] VPP may be written in C and not C++ but inheritance is still possible. +.. [#f15] The value is arbitrary and yet to be tuned. |