author | Neale Ranns <nranns@cisco.com> | 2020-11-09 10:09:42 +0000 |
---|---|---|
committer | Florin Coras <florin.coras@gmail.com> | 2021-01-14 19:55:55 +0000 |
commit | dfd3954c0427422e2739b858d1e18503a5c59970 (patch) | |
tree | 13225967f028f7d386b8da863656b5e576c0b463 /docs/gettingstarted/developers/fib20/routes.rst | |
parent | 1b5ca985dc51bea730ce5ee799641c75f73a0f26 (diff) |
docs: Update FIB documentation
Type: docs
Signed-off-by: Neale Ranns <nranns@cisco.com>
Change-Id: I3dfde4520a48c945ca9707accabbe1735c1a8799
Diffstat (limited to 'docs/gettingstarted/developers/fib20/routes.rst')
-rw-r--r-- | docs/gettingstarted/developers/fib20/routes.rst | 198 |
1 files changed, 154 insertions, 44 deletions
diff --git a/docs/gettingstarted/developers/fib20/routes.rst b/docs/gettingstarted/developers/fib20/routes.rst index 1ee09ced448..313a86c3af4 100644 --- a/docs/gettingstarted/developers/fib20/routes.rst +++ b/docs/gettingstarted/developers/fib20/routes.rst @@ -3,37 +3,109 @@ Routes ^^^^^^ -The control plane will install a route in a table for a prefix via a list of paths. -The prime function of the FIB is to *resolve* that route. To resolve a route is to -construct an object graph that fully describes all elements of the route. In Figure 3 -the route is resolved as the graph is complete from *fib_entry_t* to *ip_adjacency_t*. +Basics +------ -In some routing models a VRF will consist of a set of tables for IPv4 and IPv6, and -unicast and multicast. In VPP there is no such grouping. Each table is distinct from -each other. A table is identified by its numerical ID. The ID range is separate for -each address family. +The anatomy of a route is crucial to understand: -A table is comprised of two route data-bases; forwarding and non-forwarding. The +.. code-block:: console + + 1.1.1.0/24 via 10.0.0.1 eth0 + +A route is composed of two parts; **what** to match against and **how** to forward +the matched packets. In the above example we want to match packets +whose destination IP address is in the 1.1.1.0/24 subnet and then we +want to forward those packet to 10.0.0.1 on interface eth0. We +therefore want to match the **prefix** 1.1.1.0/24 and forward on the +**path** to 10.0.0.1, eth0. + +Matching on a prefix is the particular task of the IP FIB, matching on +other packet attributes is done by other subsystems, e.g. matching on +MPLS labels in the MPLS-FIB, or matching on a tuple in ACL based +forwarding (ABF), 'matching' on all packets that arrive on an L3 +interface (l3XC). Although these subsystems match on different +properties, they share the infrastructure on **how** to forward +matched packets, that is they share the **paths**. 
The FIB paths (or +really the path-list) thus provide services to clients, this service +is to **contribute** forwarding, this, in terms that will be made +clear in later sections, is to provide the DPO to use. + +The prime function of the FIB is to *resolve* the paths for a +route. To resolve a route is to construct an object graph that fully +describes how to forward matching packets. This means that the graph +must terminate with an object (the leaf node) that describes how +to send a packet on an interface [#f1]_, i.e what encap to add to the +packet and what interface to send it to; this is the purpose of the IP +adjacency object. In Figure 3 the route is resolved as the graph is +complete from *fib_entry_t* to *ip_adjacency_t*. + + +Thread Model +^^^^^^^^^^^^ + +The FIB is not thread safe. All actions on the FIB are expected to +occur exclusively in the main thread. However, the data-structures +that FIB updates to add routes are thread safe, +w.r.t. addition/deletion and read, therefore routes can be added +without holding the worker thread barrier lock. + + +Tables +------ + +An IP FIB is a set of prefixes against which to match; it is +sub-address family (SAFI) specific (i.e. there is one for ipv4 and ipv6, unicast +and multicast). An IP Table is address family (AFI) specific (i.e. the +'table' includes the unicast and multicast FIB). + +Each FIB is identified by the SAFI and instance number (the [pool] +index), each table is identified by the AFI and ID. The table's ID is +assigned by the user when the table is constructed. Table ID 0 is +reserved for the global/default table. + +In most routing models a VRF is composed of an IPv4 and IPv6 table, +however, VPP has no construct to model this association, it deals only +with tables and FIBs. + +A unicast FIB is comprised of two route data-bases; forwarding and non-forwarding. The forwarding data-base contains routes against which a packet will perform a longest prefix match (LPM) in the data-plane. 
The non-forwarding DB contains all the routes -with which VPP has been programmed some of these routes may be unresolved for reasons -that prevent their insertion into the forwarding DB -(see section: Adjacency source FIB entries). +with which VPP has been programmed. Some of these routes may be +unresolved, preventing their insertion into the forwarding DB. +(see section: Adjacency source FIB entries). + +Model +----- The route data is decomposed into three parts; entry, path-list and paths; -* The *fib_entry_t*, which contains the routes prefix, is representation of that prefix's entry in the FIB table. -* The *fib_path_t* is a description of where to send the packets destined to the route's prefix. There are several types of path. +* The *fib_entry_t*, which contains the route's prefix, is the representation of that prefix's entry in the FIB table. +* The *fib_path_t* is a description of where to send the packets destined to the route's prefix. There are several types of path, including: * Attached next-hop: the path is described with an interface and a next-hop. The next-hop is in the same sub-net as the router's own address on that interface, hence the peer is considered to be *attached* - * Attached: the path is described only by an interface. All address covered by the prefix are on the same L2 segment to which that router's interface is attached. This means it is possible to ARP for any address covered by the prefix which is usually not the case (hence the proxy ARP debacle in IOS). An attached path is only appropriate for a point-to-point (P2P) interface where ARP is not required, i.e. a GRE tunnel. + * Attached: the path is described only by an interface. An + attached path means that all addresses covered by the route's + prefix are on the same L2 segment to which that router's + interface is attached. This means it is possible to ARP for any + address covered by the route's prefix. 
If this is not the case + then another device in that L2 segment needs to run proxy + ARP. An attached path is really only appropriate for a point-to-point + (P2P) interface where ARP is not required, i.e. a GRE tunnel. On + a p2p interface, attached and attached-nexthop paths will + resolve via a special 'auto-adjacency'. This is an adjacency + whose next-hop is the all zeros address and describes the only + peer on the link. * Recursive: The path is described only via the next-hop and table-id. - * De-aggregate: The path is described only via the special all zeros address and a table-id. This implies a subsequent lookup in the table should be performed. + * De-aggregate: The path is described only via the special all + zeros address and a table-id. This implies a subsequent lookup + in the table should be performed. -* The *fib_path_list_t* represents the list of paths from which to choose one when forwarding. The path-list is a shared object, i.e. it is the parent to multiple fib_entry_t children. In order to share any object type it is necessary for a child to search for an existing object matching its requirements. For this there must be a data-base. The key to the path-list data-base is a combined description of all of the paths it contains [#f2]_. Searching the path-list database is required with each route addition, so it is populated only with path-lists for which sharing will bring convergence benefits (see Section: :ref:`fastconvergence`). + * There are other path types, please consult the code. + +* The *fib_path_list_t* represents the list of paths from which to choose when forwarding. A path-list is a shared object, i.e. it is the parent to multiple fib_entry_t children. In order to share any object type it is necessary for a child to search for an existing object matching its requirements. For this there must be a database. The key to the path-list database is a combined description of all of the paths it contains [#f2]_. 
Searching the path-list database is required with each route addition, so it is populated only with path-lists for which sharing will bring convergence benefits (see Section: :ref:`fastconvergence`). .. figure:: /_images/fib20fig2.png @@ -41,7 +113,7 @@ Figure 2: Route data model class diagram Figure 2 shows an example of a route with two attached-next-hop paths. Each of these paths will *resolve* by finding the adjacency that matches the paths attributes, which -are the same as the key for the adjacency data-base [#f3]_. The *forwarding information (FI)* +are the same as the key for the adjacency database [#f3]_. The *forwarding information (FI)* is the set of adjacencies that are available for load-balancing the traffic in the data-plane. A path *contributes* an adjacency to the route's forwarding information, the path-list contributes the full forwarding information for IP packets. @@ -60,7 +132,7 @@ convergence (see section :ref:`fastconvergence`). FIB sources """"""""""" There are various entities in the system that can add routes to the FIB tables. -Each of these entities is termed a *source* When the same prefix is added by different +Each of these entities is termed a *source*. When the same prefix is added by different sources the FIB must arbitrate between them to determine which source will contribute the forwarding information. Since each source determines the forwarding information using different best path and loop prevention algorithms, it is not correct for the @@ -70,17 +142,17 @@ priority assignment [#f4]_. The FIB must maintain the information each source ha so it can be restored should that source become the best source. VPP has two *control-plane* sources; the API and the CLI the API has the higher priority. Each *source* data is represented by a *fib_entry_src_t* object of which a -*fib_entry_t* maintains a sorted vector.n A prefix is *connected* when it is -applied to a routers interface. +*fib_entry_t* maintains a sorted vector. 
The following configuration: .. code-block:: console - $ set interface address 192.168.1.1/24 GigabitEthernet0/8/0 + $ set interface ip address GigabitEthernet0/8/0 192.168.1.1/24 results in the addition of two FIB entries; 192.168.1.0/24 which is connected and -attached, and 192.168.1.1/32 which is connected and local (a.k.a receive or for-us). +attached, and 192.168.1.1/32 which is connected and local (a.k.a. +receive or for-us). A prefix is *connected* when it is applied to a router's interface. Both prefixes are *interface* sourced. The interface source has a high priority, so the accidental or nefarious addition of identical prefixes does not prevent the router from correctly forwarding. Packets matching a connected prefix will @@ -95,9 +167,10 @@ route, which resolves via an attached path; $ ip route add table X 10.10.10.0/24 via gre0 -as mentioned before, these are only appropriate for point-to-point links. An -attached-host prefix is covered by either an attached prefix (note that connected -prefixes are also attached). If table X is not the table to which gre0 is bound, +as mentioned before, these are only appropriate for point-to-point +links. + +If table X is not the table to which gre0 is bound, then this is the case of an attached export (see the section :ref:`attachedexport`). Adjacency source FIB entries @@ -110,7 +183,7 @@ route is of the form: $ ip route add table X 10.0.0.1/32 via 10.0.0.1 GigabitEthernet0/8/0 -It is a host prefix with a path whose next-hop address is the same. This route +This is a host prefix with a path whose next-hop address is the same host. This route highlights the distinction between the route's prefix - a description of the traffic to match - and the path - a description of where to send the matched traffic. Table X is the same table to which the interface is bound. FIB entries that are @@ -133,22 +206,11 @@ where a route maintains a dependency relationship with the route that is its les specific cover. 
When this cover changes (i.e. there is a new covering route) or the forwarding information of the cover is updated, then the covered route is notified. Adj-fibs that fail this cover check are not installed in the fib_table_t's forwarding -table, there are only present in the non-forwarding table. +table, they are only present in the non-forwarding table. Overlapping sub-nets are not supported, so no adj-fib has multiple paths. The control plane is expected to remove a prefix configured for an interface before the interface -changes RF. - -So while the following configuration is accepted: - -.. code-block:: console - - $ set interface address 192.168.1.1/32 GigabitEthernet0/8/0 - $ ip arp 192.168.1.2 GigabitEthernet0/8/0 dead.dead.dead - $ set interface ip table GigabitEthernet0/8/0 2 - -it does not result in the desired behaviour, where the adj-fib and connected adjacencies are -moved to table 2. +changes VRF. Recursive Routes """""""""""""""" @@ -219,17 +281,65 @@ when the loop breaks, the affected children and be updated. Output labels """"""""""""" -A route may have associated out MPLS labels [#f11]_. These are labels that are expected +A route may have associated output MPLS labels [#f11]_. These are labels that are expected to be imposed on a packet as it is forwarded. It is important to note that an MPLS -label is per-route and per-path, therefore, even though routes share paths the do not +label is per-route and per-path, therefore, even though routes share paths they do not necessarily have the same label for that path [#f12]_. A label is therefore uniquely associated to a *fib_entry_t* and associated with one of the *fib_path_t* to which it forwards. -MPLS labels are modelled via the generic concept of a *path-extension* A *fib_entry_t* -therefore has a vector of zero to many *fib_path_ext_t objects* to represent the labels +MPLS labels are modelled via the generic concept of a *path-extension*. 
A *fib_entry_t* +therefore has a vector of zero to many *fib_path_ext_t* objects to represent the labels with which it is configured. + +Delegates +^^^^^^^^^ + +A common software development pattern, a delegate is a means to +extend the functionality of one object through composition of +another; these other objects are called delegates. Both +**fib_entry_t** and **ip_adjacency_t** support extension via delegates. + +The FIB uses delegates to add functionality when those functions are +required by only a few object instances rather than all of them, to +save on memory. For example, building/contributing a load-balance +object used to forward non-EOS MPLS traffic is only required for a +fib_entry_t that corresponds to a BGP peer and that peer is +advertising labeled routes - there are only a few of +these. See **fib_entry_delegate.h** for a full list of delegate types. + + +Tracking +^^^^^^^^ + +A prime service FIB provides for other sub-systems is the ability to +'track' the forwarding for a given next-hop. For example, a tunnel +will want to know how to forward to its destination address. It can +therefore request of the FIB to track this host-prefix and inform it +when the forwarding for that prefix changes. + +FIB tracking sources a host-prefix entry in the FIB using the 'recursive +resolution (RR)' source, in exactly the same way that a recursive path +does. If the entry did not previously exist, then the RR source will +inherit (and track) forwarding from its covering prefix, therefore all +packets that match this entry are forwarded in the same way as if the +entry did not exist. The tunnel that is tracking this FIB entry will +become a child dependent. The benefit of creating the entry is that +it now exists in the FIB node graph, so all actions that happen on its +parents are propagated to the host-prefix entry and consequently to +the tunnel. 
+ +FIB provides a wrapper to the sourcing of the host-prefix using a +delegate attached to the entry, and the entry is RR sourced only once. +The benefit of this approach is that each time a new client tracks +the entry it doesn't RR source it. When an entry is sourced all its +children are updated. Thus, without this indirection, adding n tracking clients would be +O(n^2). With the tracker as indirection, the entry is sourced only once. + + .. rubric:: Footnotes: +.. [#f1] Or terminate in an object that transitions the packet out of + the FIB domain, e.g. a drop. .. [#f2] Optimisations .. [#f3] Note it is valid for either interface to be bound to a different table than table 1 .. [#f4] The engaged reader can see the full priority list in vnet/vnet/fib/fib_entry.h