VNET (VPP Network Stack)
========================

The files associated with the VPP network stack layer are located in
the *./src/vnet* folder. The Network Stack Layer is basically an
instantiation of the code in the other layers. This layer has a vnet
library that provides vectorized layer-2 and 3 networking graph nodes,
a packet generator, and a packet tracer.

In terms of building a packet processing application, vnet provides a
platform-independent subgraph to which one connects a couple of
device-driver nodes.

Typical RX connections include "ethernet-input" \[full software
classification, feeds ipv4-input, ipv6-input, arp-input etc.\] and
"ipv4-input-no-checksum" \[if hardware can classify, perform ipv4
header checksum\].

Effective graph dispatch function coding
----------------------------------------

Over the past 15 years, multiple coding styles have emerged: a
single/dual/quad loop coding model (with variations) and a
fully-pipelined coding model.

Single/dual loops
-----------------

The single/dual/quad loop model variations conveniently solve problems
where the number of items to process is not known in advance: typical
hardware RX-ring processing. This coding style is also very effective
when a given node will not need to cover a complex set of dependent
reads.

Here is a quad/single loop which can leverage up-to-avx512 SIMD vector
units to convert buffer indices to buffer pointers:

```c
static uword
simulated_ethernet_interface_tx (vlib_main_t * vm,
                                 vlib_node_runtime_t * node,
                                 vlib_frame_t * frame)
{
  u32 n_left_from, *from;
  u32 next_index = 0;
  u32 n_bytes;
  u32 thread_index = vm->thread_index;
  vnet_main_t *vnm = vnet_get_main ();
  vnet_interface_main_t *im = &vnm->interface_main;
  vlib_buffer_t *bufs[VLIB_FRAME_SIZE], **b;
  u16 nexts[VLIB_FRAME_SIZE], *next;

  n_left_from = frame->n_vectors;
  from = vlib_frame_vector_args (frame);

  /*
   * Convert up to VLIB_FRAME_SIZE indices in "from" to
   * buffer pointers in bufs[]
   */
  vlib_get_buffers (vm, from, bufs, n_left_from);

  b = bufs;
  next = nexts;

  /*
   * While we have at least 4 vector elements (pkts) to process..
   */
  while (n_left_from >= 4)
    {
      /* Prefetch next quad-loop iteration. */
      if (PREDICT_TRUE (n_left_from >= 8))
        {
          vlib_prefetch_buffer_header (b[4], STORE);
          vlib_prefetch_buffer_header (b[5], STORE);
          vlib_prefetch_buffer_header (b[6], STORE);
          vlib_prefetch_buffer_header (b[7], STORE);
        }

      /*
       * $$$ Process 4x packets right here...
       * set next[0..3] to send the packets where they need to go
       */
      do_something_to (b[0]);
      do_something_to (b[1]);
      do_something_to (b[2]);
      do_something_to (b[3]);

      /* Process the next 0..4 packets */
      b += 4;
      next += 4;
      n_left_from -= 4;
    }

  /*
   * Clean up 0...3 remaining packets at the end of the incoming frame
   */
  while (n_left_from > 0)
    {
      /*
       * $$$ Process one packet right here...
       * set next[0..3] to send the packets where they need to go
       */
      do_something_to (b[0]);

      /* Process the next packet */
      b += 1;
      next += 1;
      n_left_from -= 1;
    }

  /*
   * Send the packets along their respective next-node graph arcs
   * Considerable locality of reference is expected, most if not all
   * packets in the inbound vector will traverse the same next-node
   * arc
   */
  vlib_buffer_enqueue_to_next (vm, node, from, nexts, frame->n_vectors);

  return frame->n_vectors;
}
```

Given a packet processing task to implement, it pays to scout around
looking for similar tasks, and think about using the same coding
pattern. It is not uncommon to recode a given graph node dispatch
function several times during performance optimization.
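For context, here is a hedged sketch of how an internal graph node that
uses the dispatch pattern above is typically attached to the graph with
VLIB_REGISTER_NODE. It is not taken from the VPP source; "my-node",
my_node_fn, and the MY_NEXT_* symbols are made up for illustration.

```c
/*
 * Illustrative registration of a hypothetical internal node.
 * my_node_fn would be a dispatch function written in the
 * quad/single-loop style shown above.
 */
typedef enum
{
  MY_NEXT_IP4_LOOKUP,
  MY_NEXT_DROP,
  MY_N_NEXT,
} my_next_t;

VLIB_REGISTER_NODE (my_node) = {
  .function = my_node_fn,
  .name = "my-node",
  .vector_size = sizeof (u32),	/* each frame element is a buffer index */
  .type = VLIB_NODE_TYPE_INTERNAL,
  .n_next_nodes = MY_N_NEXT,
  .next_nodes = {
    [MY_NEXT_IP4_LOOKUP] = "ip4-lookup",
    [MY_NEXT_DROP] = "error-drop",
  },
};
```

The next_nodes strings name the graph arcs the dispatch function
selects by writing the corresponding enum values into nexts[].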
Creating Packets from Scratch
-----------------------------

At times, it's necessary to create packets from scratch and send
them. Tasks like sending keepalives or actively opening connections
come to mind. It's not difficult, but accurate buffer metadata setup is
required.

### Allocating Buffers

Use vlib_buffer_alloc, which allocates a set of buffer indices. For
low-performance applications, it's OK to allocate one buffer at a
time. Note that vlib_buffer_alloc(...) does NOT initialize buffer
metadata. See below.

In high-performance cases, allocate a vector of buffer indices, and
hand them out from the end of the vector; decrement _vec_len(..) as
buffer indices are allocated. See tcp_alloc_tx_buffers(...) and
tcp_get_free_buffer_index(...) for an example; a minimal sketch of the
pattern appears at the end of this section.

### Buffer Initialization Example

The following example shows the **main points**, but is not to be
blindly cut-'n-pasted.

```c
  u32 bi0;
  vlib_buffer_t *b0;
  ip4_header_t *ip;
  udp_header_t *udp;

  /* Allocate a buffer */
  if (vlib_buffer_alloc (vm, &bi0, 1) != 1)
    return -1;

  b0 = vlib_get_buffer (vm, bi0);

  /* Initialize the buffer */
  VLIB_BUFFER_TRACE_TRAJECTORY_INIT (b0);

  /* At this point b0->current_data = 0, b0->current_length = 0 */

  /*
   * Copy data into the buffer. This example ASSUMES that data will fit
   * in a single buffer, and is e.g. an ip4 packet.
   */
  if (have_packet_rewrite)
    {
      clib_memcpy (b0->data, data, vec_len (data));
      b0->current_length = vec_len (data);
    }
  else
    {
      /* OR, build a udp-ip packet (for example) */
      ip = vlib_buffer_get_current (b0);
      udp = (udp_header_t *) (ip + 1);
      data_dst = (u8 *) (udp + 1);

      ip->ip_version_and_header_length = 0x45;
      ip->ttl = 254;
      ip->protocol = IP_PROTOCOL_UDP;
      ip->length = clib_host_to_net_u16 (sizeof (*ip) + sizeof (*udp) +
                                         vec_len (udp_data));
      ip->src_address.as_u32 = src_address->as_u32;
      ip->dst_address.as_u32 = dst_address->as_u32;
      udp->src_port = clib_host_to_net_u16 (src_port);
      udp->dst_port = clib_host_to_net_u16 (dst_port);
      udp->length = clib_host_to_net_u16 (sizeof (*udp) +
                                          vec_len (udp_data));
      clib_memcpy (data_dst, udp_data, vec_len (udp_data));

      if (compute_udp_checksum)
        {
          /* RFC 7011 section 10.3.2. */
          udp->checksum = ip4_tcp_udp_compute_checksum (vm, b0, ip);
          if (udp->checksum == 0)
            udp->checksum = 0xffff;
        }

      b0->current_length = sizeof (*ip) + sizeof (*udp) +
        vec_len (udp_data);
    }

  b0->flags |= VLIB_BUFFER_TOTAL_LENGTH_VALID;

  /* sw_if_index 0 is the "local" interface, which always exists */
  vnet_buffer (b0)->sw_if_index[VLIB_RX] = 0;

  /* Use the default FIB index for tx lookup. Set non-zero to use another fib */
  vnet_buffer (b0)->sw_if_index[VLIB_TX] = 0;
```

If your use-case calls for large packet transmission, use
vlib_buffer_chain_append_data_with_alloc(...) to create the requisite
buffer chain.

### Enqueueing packets for lookup and transmission

The simplest way to send a set of packets is to use
vlib_get_frame_to_node(...) to allocate fresh frame(s) to
ip4_lookup_node or ip6_lookup_node, add the constructed buffer
indices, and dispatch the frame using vlib_put_frame_to_node(...).

```c
  vlib_frame_t *f;
  u32 *to_next;
  int i;

  f = vlib_get_frame_to_node (vm, ip4_lookup_node.index);
  f->n_vectors = vec_len (buffer_indices_to_send);
  to_next = vlib_frame_vector_args (f);

  for (i = 0; i < vec_len (buffer_indices_to_send); i++)
    to_next[i] = buffer_indices_to_send[i];

  vlib_put_frame_to_node (vm, ip4_lookup_node.index, f);
```

It is inefficient to allocate and schedule single-packet frames. That's
acceptable if you need to send one packet per second, but it should
**not** occur in a for-loop!
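Here is a minimal sketch of the "allocate a vector of buffer indices
and hand them out from the end" pattern mentioned under "Allocating
Buffers" above. The cache and function names are hypothetical; see
tcp_alloc_tx_buffers(...) and tcp_get_free_buffer_index(...) for the
production version, which also keeps one cache per thread.

```c
/* Hypothetical single-threaded buffer-index cache */
static u32 *my_buffer_cache;

static int
my_get_buffer_index (vlib_main_t * vm, u32 * bi)
{
  u32 n_free = vec_len (my_buffer_cache);

  if (PREDICT_FALSE (n_free == 0))
    {
      /* Refill the cache with (up to) a batch of 32 buffer indices */
      vec_validate (my_buffer_cache, 31);
      n_free = vlib_buffer_alloc (vm, my_buffer_cache, 32);
      if (n_free == 0)
        return -1;
      _vec_len (my_buffer_cache) = n_free;
    }

  /* Hand out the last index, and shrink the vector */
  *bi = my_buffer_cache[n_free - 1];
  _vec_len (my_buffer_cache) = n_free - 1;
  return 0;
}
```

As with the example above, the returned buffer's metadata still needs
to be initialized before use.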
Packet tracer
-------------

Vlib includes a frame element \[packet\] trace facility, with a simple
debug CLI interface. The CLI is straightforward: "trace add
input-node-name count" to start capturing packet traces.

To trace 100 packets on a typical x86\_64 system running the dpdk
plugin: "trace add dpdk-input 100". When using the packet generator:
"trace add pg-input 100"

To display the packet trace: "show trace"

Each graph node has the opportunity to capture its own trace data. It
is almost always a good idea to do so. The trace capture APIs are
simple.

The packet capture APIs snapshot binary data, to minimize processing
at capture time. Each participating graph node supplies a trace format
function to pretty-print the captured data when the trace is displayed.
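As a hedged illustration of the trace capture APIs (the my_trace_t
record, format_my_trace, and the surrounding names are hypothetical,
not a particular VPP node): a node snapshots a small binary record per
traced packet inside its dispatch loop, and supplies a format function,
referenced from its VLIB_REGISTER_NODE declaration via .format_trace,
to pretty-print that record when "show trace" runs.

```c
typedef struct
{
  u32 next_index;
  u32 sw_if_index;
} my_trace_t;			/* hypothetical per-packet trace record */

/* In the dispatch loop: capture binary data only, no formatting here */
if (PREDICT_FALSE ((node->flags & VLIB_NODE_FLAG_TRACE)
                   && (b[0]->flags & VLIB_BUFFER_IS_TRACED)))
  {
    my_trace_t *t = vlib_add_trace (vm, node, b[0], sizeof (*t));
    t->next_index = next[0];
    t->sw_if_index = vnet_buffer (b[0])->sw_if_index[VLIB_RX];
  }

/* Called later, when "show trace" formats the captured records */
static u8 *
format_my_trace (u8 * s, va_list * args)
{
  CLIB_UNUSED (vlib_main_t * vm) = va_arg (*args, vlib_main_t *);
  CLIB_UNUSED (vlib_node_t * node) = va_arg (*args, vlib_node_t *);
  my_trace_t *t = va_arg (*args, my_trace_t *);

  s = format (s, "my-node: sw_if_index %d, next index %d",
              t->sw_if_index, t->next_index);
  return s;
}
```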
.. _hardwarecommands:

.. toctree::

Show Hardware-Interfaces
========================
Display more detailed information about all or a list of given
interfaces. The verboseness of the output can be controlled by the
following optional parameters:

-  brief: Only show name, index and state (default for bonded
   interfaces).
-  verbose: Also display additional attributes (default for all other
   interfaces).
-  detail: Also display all remaining attributes and extended
   statistics.

**To limit the output of the command to bonded interfaces and their
slave interfaces, use the '*bond*' optional parameter.**

Summary/Usage
-------------

.. code-block:: shell

    show hardware-interfaces [brief|verbose|detail] [bond] [<interface> [<interface> [..]]] [<sw_idx> [<sw_idx> [..]]].

Examples
--------
Example of how to display default data for all interfaces:

.. code-block:: console

    vpp# show hardware-interfaces
                  Name                Idx   Link  Hardware
    GigabitEthernet7/0/0               1     up   GigabitEthernet7/0/0
      Ethernet address ec:f4:bb:c0:bc:fc
      Intel e1000
        carrier up full duplex speed 1000 mtu 9216
        rx queues 1, rx desc 1024, tx queues 3, tx desc 1024
        cpu socket 0
    GigabitEthernet7/0/1               2     up   GigabitEthernet7/0/1
      Ethernet address ec:f4:bb:c0:bc:fd
      Intel e1000
        carrier up full duplex speed 1000 mtu 9216
        rx queues 1, rx desc 1024, tx queues 3, tx desc 1024
        cpu socket 0
    VirtualEthernet0/0/0               3     up   VirtualEthernet0/0/0
      Ethernet address 02:fe:a5:a9:8b:8e
    VirtualEthernet0/0/1               4     up   VirtualEthernet0/0/1
      Ethernet address 02:fe:c0:4e:3b:b0
    VirtualEthernet0/0/2               5     up   VirtualEthernet0/0/2
      Ethernet address 02:fe:1f:73:92:81
    VirtualEthernet0/0/3               6     up   VirtualEthernet0/0/3
      Ethernet address 02:fe:f2:25:c4:68
    local0                             0    down  local0
      local

Example of how to display '*verbose*' data for an interface by name and software index (where 2 is the software index):

.. code-block:: console

    vpp# show hardware-interfaces GigabitEthernet7/0/0 2 verbose
                   Name                Idx   Link  Hardware
    GigabitEthernet7/0/0               1     up   GigabitEthernet7/0/0
      Ethernet address ec:f4:bb:c0:bc:fc
      Intel e1000
        carrier up full duplex speed 1000 mtu 9216
        rx queues 1, rx desc 1024, tx queues 3, tx desc 1024
        cpu socket 0
    GigabitEthernet7/0/1               2    down  GigabitEthernet7/0/1
      Ethernet address ec:f4:bb:c0:bc:fd
      Intel e1000
        carrier up full duplex speed 1000 mtu 9216
        rx queues 1, rx desc 1024, tx queues 3, tx desc 1024
        cpu socket 0

Clear Hardware-Interfaces
=========================

Clear the extended statistics for all or a list of given interfaces
(statistics associated with the '*show hardware-interfaces*' command).


Summary/Usage
-------------

.. code-block:: shell

    clear hardware-interfaces [<interface> [<interface> [..]]] [<sw_idx> [<sw_idx> [..]]].
                

Examples
--------

Example of how to clear the extended statistics for all interfaces:


.. code-block:: console

        vpp# clear hardware-interfaces

Example of how to clear the extended statistics for an interface by name and software index (where 2 is the software index): 

.. code-block:: console

        vpp# clear hardware-interfaces GigabitEthernet7/0/0 2

Packet trace capture filtering
------------------------------

The "classify filter" debug CLI command constructs an arbitrary set of
packet classifier tables for use with "pcap rx | tx | drop trace," and
with the vpp packet tracer on a per-interface or system-wide
basis. Packets which match a rule in the classifier table chain will be
traced. The tables are automatically ordered so that matches in the
most specific table are tried first.

It's reasonably likely that folks will configure a single table with
one or two matches. As a result, we configure 8 hash buckets and 128K
of match rule space by default. One can override the defaults by
specifying "buckets \<nn\>" and "memory-size \<xxx\>" as desired.

To build up complex filter chains, repeatedly issue the classify filter
debug CLI command. Each command must specify the desired mask and match
values. If a classifier table with a suitable mask already exists, the
CLI command adds a match rule to the existing table. If not, the CLI
command adds a new table with the indicated mask, and adds the match
rule to it.

### Configure a simple pcap classify filter

```
classify filter pcap mask l3 ip4 src match l3 ip4 src 192.168.1.11
pcap trace rx max 100 filter
```

### Configure a simple per-interface capture filter

```
classify filter GigabitEthernet3/0/0 mask l3 ip4 src match l3 ip4 src 192.168.1.11
pcap trace rx max 100 intfc GigabitEthernet3/0/0
```

Note that per-interface capture filters are _always_ applied.

### Clear per-interface capture filters

```
classify filter GigabitEthernet3/0/0 del
```

### Configure another fairly simple pcap classify filter

```
classify filter pcap mask l3 ip4 src dst match l3 ip4 src 192.168.1.10 dst 192.168.2.10
pcap trace tx max 100 filter
```

### Configure a vpp packet tracer filter

```
classify filter trace mask l3 ip4 src dst match l3 ip4 src 192.168.1.10 dst 192.168.2.10
trace add dpdk-input 100 filter
```

### Clear all current classifier filters

```
classify filter [pcap | <interface-name> | trace] del
```

### To inspect the classifier tables

```
show classify table [verbose]
```

The verbose form displays all of the match rules, with hit-counters.

### Terse description of the "mask" syntax

```
l2  src dst proto tag1 tag2 ignore-tag1 ignore-tag2 cos1 cos2 dot1q dot1ad

l3  ip4 <ip4-mask>  ip6 <ip6-mask>

    <ip4-mask> version hdr_length src[/width] dst[/width]
               tos length fragment_id ttl protocol checksum

    <ip6-mask> version traffic-class flow-label src dst proto
               payload_length hop_limit protocol

l4  tcp <tcp-mask>  udp <udp-mask>  src_port dst_port

    <tcp-mask> src dst  # ports
    <udp-mask> src_port dst_port
```

To construct **matches**, add the values to match after the indicated
keywords in the mask syntax. For example: "... mask l3 ip4 src" ->
"... match l3 ip4 src 192.168.1.11"

## VPP Packet Generator

We use the VPP packet generator to inject packets into the forwarding
graph. The packet generator can replay pcap traces, and generate
packets out of whole cloth at respectably high performance.

The VPP pg enables quite a variety of use-cases, ranging from
functional testing of new data-plane nodes to regression testing to
performance tuning.

## PG setup scripts

PG setup scripts describe traffic in detail, and leverage vpp debug CLI
mechanisms. It's reasonably unusual to construct a pg setup script
which doesn't include a certain amount of interface and FIB
configuration.
For example:

```
loop create
set int ip address loop0 192.168.1.1/24
set int state loop0 up

packet-generator new {
    name pg0
    limit 100
    rate 1e6
    size 300-300
    interface loop0
    node ethernet-input
    data { IP4: 1.2.3 -> 4.5.6
           UDP: 192.168.1.10 - 192.168.1.254 -> 192.168.2.10
           UDP: 1234 -> 2345
           incrementing 286
    }
}
```

A packet generator stream definition includes two major sections:

- Stream Parameter Setup
- Packet Data

### Stream Parameter Setup

Given the example above, let's look at how to set up stream parameters:

- **name pg0** - Name of the stream, in this case "pg0"

- **limit 100** - Number of packets to send when the stream is
  enabled. "limit 0" means send packets continuously.

- **maxframe \<nnn\>** - Maximum frame size. Handy for injecting
  multiple frames no larger than \<nnn\>. Useful for checking dual /
  quad loop codes

- **rate 1e6** - Packet injection rate, in this case 1 MPPS. When not
  specified, the packet generator injects packets as fast as possible

- **size 300-300** - Packet size range, in this case send 300-byte
  packets

- **interface loop0** - Packets appear as if they were received on the
  specified interface. This datum is used in multiple ways: to select
  graph arc feature configuration, and to select IP FIBs. Configure
  features e.g. on loop0 to exercise those features.

- **tx-interface \<interface-name\>** - Packets will be transmitted on
  the indicated interface. Typically required only when injecting
  packets into post-IP-rewrite graph nodes.

- **pcap \<filename\>** - Replay packets from the indicated pcap
  capture file. "make test" makes extensive use of this feature:
  generate packets using scapy, save them in a .pcap file, then inject
  them into the vpp graph via a vpp pg "pcap \<filename\>" stream
  definition

- **worker \<nn\>** - Generate packets for the stream using the
  indicated vpp worker thread. The vpp pg generates and injects O(10
  MPPS / core). Use multiple stream definitions and worker threads to
  generate and inject enough traffic to easily fill a 40 gbit pipe with
  small packets.

### Data definition

Packet generator data definitions make use of a layered implementation
strategy. Networking layers are specified in order, and the notation
can seem a bit counter-intuitive. In the example above, the data
definition stanza constructs a set of L2-L4 headers, and uses an
incrementing fill pattern to round out the requested 300-byte packets.

- **IP4: 1.2.3 -> 4.5.6** - Construct an L2 (MAC) header with the ip4
  ethertype (0x800), src MAC address of 00:01:00:02:00:03 and dst MAC
  address of 00:04:00:05:00:06. MAC addresses may be specified in
  either _xxxx.xxxx.xxxx_ format or _xx:xx:xx:xx:xx:xx_ format.

- **UDP: 192.168.1.10 - 192.168.1.254 -> 192.168.2.10** - Construct an
  incrementing set of L3 (IPv4) headers for successive packets with
  source addresses ranging from .10 to .254. All packets in the stream
  have a constant dest address of 192.168.2.10. Set the protocol field
  to 17, UDP.

- **UDP: 1234 -> 2345** - Set the UDP source and destination ports to
  1234 and 2345, respectively

- **incrementing 286** - Insert up to 286 incrementing data bytes.

Obvious variations involve "s/IP4/IP6/" in the above, along with
changing from IPv4 to IPv6 address notation. The vpp pg can set any /
all IPv4 header fields, including tos, packet length, mf / df /
fragment id and offset, ttl, protocol, checksum, and src/dst
addresses. Take a look at ../src/vnet/ip/ip[46]_pg.c for details.
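For example, an untested sketch of the IPv6 variant of the stanza above
might look like the following; the stream name, addresses, and fill
size are illustrative only:

```
packet-generator new {
    name pg1
    limit 100
    rate 1e6
    size 300-300
    interface loop0
    node ethernet-input
    data { IP6: 1.2.3 -> 4.5.6
           UDP: 2001:db8:1::2 -> 2001:db8:2::2
           UDP: 1234 -> 2345
           incrementing 100
    }
}
```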
If all else fails, specify the entire packet data in hex:

- **hex 0xabcd...** - copy hex data verbatim into the packet

When replaying pcap files ("**pcap \<filename\>**"), do not specify a
data stanza.

### Diagnosing "packet-generator new" parse failures

If you want to inject packets into a brand-new graph node, remember to
tell the packet generator debug CLI how to parse the packet data
stanza. If the node expects L2 Ethernet MAC headers, specify
".unformat_buffer = unformat_ethernet_header":

```
/* *INDENT-OFF* */
VLIB_REGISTER_NODE (ethernet_input_node) =
{
  .unformat_buffer = unformat_ethernet_header,
};
```

Beyond that, it may be necessary to set breakpoints in
.../src/vnet/pg/cli.c. Debug image suggested.

When debugging new nodes, it may be far simpler to directly inject
ethernet frames - and add a corresponding vlib_buffer_advance in the
new node - than to modify the packet generator; see the sketch at the
end of this section.

## Debug CLI

The descriptions above describe the "packet-generator new" debug CLI in
detail. Additional debug CLI commands include:

```
vpp# packet-generator enable-stream [<stream name>]
```

which enables the named stream, or all streams.

```
vpp# packet-generator disable-stream [<stream name>]
```

disables the named stream, or all streams.

```
vpp# packet-generator delete <stream name>
```

Deletes the named stream.

```
vpp# packet-generator configure <stream name> [limit <nnn>] [rate <pps>] [size <nn>-<nn>]
```

Changes stream parameters without having to recreate the entire stream
definition. Note that re-issuing a "packet-generator new" command will
correctly recreate the named stream.
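As a hedged sketch of the "directly inject ethernet frames" alternative
mentioned above (names hypothetical): if the pg stream injects complete
ethernet frames straight into the new node, the node can strip the L2
header itself before continuing.

```c
/* Hypothetical fragment from the new node's dispatch function. The pg
 * stream injects full ethernet frames, so current_data points at L2. */
ethernet_header_t *eh = vlib_buffer_get_current (b0);

/* ... inspect eh->type, eh->dst_address, etc. as required ... */

/* Skip the L2 header so downstream processing starts at L3 */
vlib_buffer_advance (b0, sizeof (ethernet_header_t));
```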